After learning about the weird and varied things dogs have been trained to identify by smell at airports, Luis wondered whether he could do something similar with low-cost gas sensors.
“The purpose of the project is to show that low-cost sensors can be reliable in detecting odours and that they can possibly be used in clinical settings,” Luis tells us. “Testing was done using samples of beer and brewed coffee. A K-Nearest Neighbours (KNN) algorithm was used in MATLAB to create a classification model that was used to predict the aromas of beer and coffee, and was validated using a 10-fold cross validation (k-fold)... A 98 percent classification accuracy was achieved in the testing process.”
Smell test
With only four types of gas sensor to work with, extensive testing and training of the model were required.
“A training data set was created by taking measurements of air, beer, and coffee independently,” Luis explains. “Each sample was taken, on average, for 15 minutes at one second intervals, producing over 900 sample readings per test and the data was exported into CSV files. For classification purposes, an additional column was manually added to label the sample (i.e., coffee, beer, air). The three datasets were imported and combined in MATLAB. This data was used to create a k-nearest neighbour model, k was selected to be 5, this was determined by trial and error. A 10-fold cross-validation was used to validate the model, and a Principal Component Analysis (PCA) was used as an exploratory technique to verify the model and the results, similar to the work shown in past research.
“A test dataset was gathered by taking 17 new samples of two-minute readings at one second intervals to assess the classification model. Each sample was independent of each other (only air, beer, or coffee was measured at a time), and they were manually labelled accordingly, resulting in over 2500 measurements. This data was imported, combined, and randomly rearranged in MATLAB. Using the classification model created from the training dataset, the testing data was classified and the results from the classification model represent 97.7% accuracy.”
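To make the workflow Luis describes more concrete, here is a minimal sketch of a k-nearest neighbour classifier with k = 5, checked first by 10-fold cross-validation and then against a separate test set. It is written in plain Python rather than the MATLAB that Luis used, and it is not his code. The readings are synthetic: each one is a made-up tuple of four values standing in for the four sensor types, and the `CENTRES` values, the function names, and the choice of Euclidean distance are all assumptions for illustration only.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k=5):
    """Return the majority label among the k training rows nearest to query."""
    # Euclidean distance is an assumption; the article does not name the metric.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, features, k) == label
               for features, label in test)
    return hits / len(test)


def cross_validate(rows, folds=10, k=5, seed=0):
    """Mean accuracy over `folds` shuffled, non-overlapping hold-out folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    scores = []
    for i in range(folds):
        held_out = rows[i::folds]  # every folds-th row, starting at row i
        kept = [r for j, r in enumerate(rows) if j % folds != i]
        scores.append(accuracy(kept, held_out, k))
    return sum(scores) / folds


# Hypothetical cluster centres for each label: NOT real sensor data.
CENTRES = {
    "air": (0.2, 0.2, 0.2, 0.2),
    "beer": (0.8, 0.3, 0.6, 0.4),
    "coffee": (0.4, 0.9, 0.3, 0.7),
}


def fake_readings(n_per_class, seed):
    """Generate labelled synthetic readings scattered around each centre."""
    rng = random.Random(seed)
    return [(tuple(rng.gauss(c, 0.05) for c in centre), label)
            for label, centre in CENTRES.items()
            for _ in range(n_per_class)]


training = fake_readings(60, seed=1)  # stands in for the labelled CSV data
testing = fake_readings(20, seed=2)   # stands in for the new test samples

cv_accuracy = cross_validate(training)
test_accuracy = accuracy(training, testing)
print(f"10-fold cross-validation accuracy: {cv_accuracy:.3f}")
print(f"Held-out test accuracy: {test_accuracy:.3f}")
```

Because the synthetic clusters are deliberately well separated, the scores here say nothing about real sensors. With genuine readings, each row would come from a labelled CSV file instead of `fake_readings`, and trying a few values of `k` would mirror the trial and error Luis mentions.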
Accuracy of nearly 98% across the three sample types is impressive, and it’s all done on a Raspberry Pi 3.
“Raspberry Pi was introduced to me in the fall of 2020 during one of my university courses,” Luis said. “I quickly realized how easy, efficient, and capable Raspberry Pi boards are.”
It’s a cool, working concept, so we hope to see more like it in the future.