Cong Fang#, Hua-Yao Li#, Long Li, Hu-Yin Su, Jiang Tang, Xiang Bai* and Huan Liu*

C. Fang
School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China

H.-Y. Li, L. Li, H.-Y. Su, J. Tang, H. Liu
School of Optical and Electronic Information, Optics Valley Laboratory, Huazhong University of Science and Technology, Wuhan 430074, China

X. Bai
School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China

E-mail: [email protected]; [email protected]

#These authors contributed equally: C. Fang, H.-Y. Li

Keywords: electronic nose, all-feature extraction, deep learning, odor recognition, sensor array

Abstract: An electronic nose (e-nose) mimics the mammalian olfactory system in identifying odors and expands the boundaries of human olfaction by tracing toxins and explosives. However, existing feature-based odor recognition algorithms rely on domain-specific expertise, which may limit performance because information is lost during feature extraction. Inspired by human olfaction, we propose a smart electronic nose enabled by an all-feature olfactory algorithm (AFOA), whereby all features in a gas sensing cycle of semiconductor gas sensors, including the response, equilibrium, and recovery processes, are utilized. Specifically, our method combines one-dimensional convolutional and recurrent neural networks with channel and temporal attention modules to fully utilize complementary global and dynamic information. We further demonstrate that a novel data augmentation method can transform the raw data into a suitable representation for feature extraction. Results show that the e-nose, comprising only six semiconductor gas sensors, outperforms state-of-the-art methods on the Chinese liquor data. Ablation studies reveal the contribution of each sensor in odor recognition.
Therefore, a deep learning-enabled codesign of sensor arrays and recognition algorithms can reduce the heavy demand for large numbers of highly specialized gas sensors and iteratively provide interpretable insights into odor recognition dynamics.

1. Introduction

Humans perceive the world through sight, hearing, touch, olfaction and taste. Olfaction is important for the survival of living species, allowing them to identify suitable food, detect dangerous chemical substances, etc. The human olfactory system is based on chemical reactions, which are more complicated than the physical stimuli underlying the visual and auditory systems. Odors are complex and always contain various types of gas molecules. In mammalian olfaction, each olfactory receptor cell (ORC) possesses only one type of odorant receptor, and each receptor can detect a limited number of odorant substances. Our ORCs are therefore highly specialized for a few odors. That is, each odor molecule activates very few odorant receptors, leading to a combinatorial code and forming an “odor pattern”[1]. Based on a large number of receptors and complex neural networks, we can discriminate more than one trillion olfactory stimuli[2].

Inspired by the nature of the mammalian olfactory system, a “model nose” was first introduced for gas identification in 1982[3]. It contains a semiconductor gas sensor array that mimics the function of mammalian ORCs, together with a pattern recognition algorithm that simulates the operations of the nervous system. A non-specific semiconductor gas sensor detects a gas from the change in electrical resistance caused by the reaction between gas molecules and preadsorbed oxygen, and therefore shows cross sensitivity to a wide variety of odors[4]. In 1987, Kaneyasu et al. from Hitachi in Japan named it an “electronic nose”[5] (e-nose), and e-noses were introduced into many fields in the 1990s[6-8].
The e-nose has shown great potential for expanding the human senses, especially in recognizing gases that are odorless or present at low concentrations, and may find wide application in environmental monitoring, food quality assessment, medical diagnosis, etc.[9-12]. Unfortunately, to date, an e-nose mimics the mammalian olfactory system only at a gross level, because semiconductor gas sensors are far behind ORCs in specificity, diversity and scale[13]. Therefore, a powerful pattern recognition algorithm is needed to handle complex gas-solid interactions under limited hardware conditions.

Generally, traditional feature-based methods are multistage, comprising feature extraction, dimensionality reduction, and classification[14-16]. In feature extraction, bioinspired or manually designed features[17-19] are extracted from the response curves based on a basic understanding of the gas sensing mechanism; these features mainly describe equilibrium states and include resistance values, response/recovery times, and the maximum derivative of the response. In dimensionality reduction, variants of principal component analysis (PCA) are often used[20-21]. Finally, existing feature-based methods use unsupervised learning[22-23] and backpropagation artificial neural networks (BP-ANNs) for classification[21, 24-26]. Because these methods mainly capture equilibrium states while neglecting response and recovery features, they may suffer from local optima and information loss before the features are fed to the classifier. Therefore, features extracted from the whole gas sensing curve, including the response, equilibrium, and recovery processes, can play an important role in odor recognition. Humans have only one-third as many types of olfactory receptors as mice but have superior processing power due to stronger brain connections[27], and some studies have even found that odors can affect cognition[28-29].
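The hand-crafted features used by such feature-based pipelines can be illustrated with a minimal sketch. It assumes a single sensor's resistance trace sampled at a fixed interval, with the gas switched on between samples `gas_on - 1` and `gas_on` and switched off between samples `gas_off - 1` and `gas_off`. The function name `extract_features`, the response definition R_air/R_gas, and the 90% threshold for the response and recovery times are illustrative assumptions, not taken from the cited works.

```python
from typing import Dict, List, Optional


def extract_features(resistance: List[float], dt: float,
                     gas_on: int, gas_off: int,
                     fraction: float = 0.9) -> Dict[str, Optional[float]]:
    """Hand-crafted features from one sensor's resistance trace.

    Assumptions (illustrative only): the baseline is the last sample
    before gas exposure, the equilibrium value is the last sample before
    the gas is removed, and times use a `fraction` (90%) threshold.
    """
    r_air = resistance[gas_on - 1]    # baseline resistance in air
    r_gas = resistance[gas_off - 1]   # equilibrium resistance in gas
    swing = r_gas - r_air             # signed total change
    if swing == 0:
        raise ValueError("sensor shows no response to the gas")

    def first_crossing(start: int, stop: int,
                       origin: float, span: float) -> Optional[int]:
        # Index of the first sample that covers `fraction` of `span`.
        for i in range(start, stop):
            if (resistance[i] - origin) / span >= fraction:
                return i
        return None

    i_res = first_crossing(gas_on, gas_off, r_air, swing)
    i_rec = first_crossing(gas_off, len(resistance), r_gas, -swing)

    # Steepest change per unit time during the response phase.
    max_slope = max(abs(resistance[i + 1] - resistance[i]) / dt
                    for i in range(gas_on - 1, gas_off - 1))

    return {
        "response": r_air / r_gas,
        "response_time": (i_res - gas_on + 1) * dt,
        "recovery_time": None if i_rec is None
        else (i_rec - gas_off + 1) * dt,
        "max_slope": max_slope,
    }


# Synthetic trace: 5 baseline samples, 5 under gas, 5 during recovery.
trace = [100, 100, 100, 100, 100, 60, 40, 30, 25, 25, 60, 85, 95, 99, 100]
features = extract_features(trace, dt=1.0, gas_on=5, gas_off=10)
```

Each sensing cycle is reduced here to four scalars, so the detailed shape of the response and recovery transients is discarded. This is the kind of information loss that motivates using the whole curve.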
Enabled by the power of deep learning, this study focuses on the accurate recognition of various odorant mixtures using only a small number of sensor units combined with an all-feature extraction algorithm under complex conditions (uncontrolled temperature and humidity). We hypothesize that using all the information in a gas sensing cycle of the sensor array can yield more distinguishing and robust features, thus reducing the heavy demand for the quantity and diversity of sensors. Hence, we need a more effective algorithm for application-specific sensing scenarios.

Recently, deep learning-based methods have made remarkable progress in computer vision, natural language processing, medical imaging, etc.[30]. Unlike feature-based methods, which rely heavily on intuition or domain-specific experience, deep learning-based methods attempt to learn high-level semantic features from large amounts of data and jointly optimize feature extractors and classifiers, significantly decreasing the burden on users. Introducing deep learning-based methods to e-nose technology can improve performance by learning nonintuitive features. In addition, the learned features can help us understand the principles of gas sensing and odor discrimination. Recently, some researchers have treated multichannel response curves as an image and used two-dimensional convolutional neural networks (CNNs) to extract local features and fully connected (FC) layers for classification in an end-to-end manner[31-32]. Although these methods improve on feature-based methods, they ignore the long-term dependencies in the time-series signals of the sensor array and incur nonnegligible computational and memory overhead. Wang et al.[33] proposed a quantitative detection method for mixed gases based on long short-term memory (LSTM).
This method relies heavily on domain-specific expertise, as the preprocessed response data to be analyzed are manually designed, which may lead to information loss before they are fed into the LSTM. In contrast, deep learning applied to raw data can better mine the cross selectivity among only a few sensors.

To tackle the above issues, we fabricate an e-nose that consists of six different metal-oxide semiconductor (MOS) gas sensors, based on SnO2 quantum dots (QDs), SnO2 nanowires, SnO2 nanoparticles (NPs) synthesized by flame spray pyrolysis (FSP), In2O3 QDs, NiO NPs and WO3 QDs. MOS sensors are well suited to e-noses owing to their high response rate, low cost, easy fabrication, and long-term stability. In particular, QDs are critical low-dimensional semiconductor materials[34-35] whose dimensions along all three axes are no larger than twice the exciton Bohr radius. To reduce the heavy demand for the number and diversity of sensors, we use tailored data augmentation to handle all features in a gas sensing cycle, transforming the raw curves into different shapes so that distinguishing and robust features are easier to mine. Specifically, our method combines one-dimensional CNNs and recurrent neural networks (RNNs) with channel and temporal attention modules to fully utilize complementary global and dynamic information in an end-to-end manner. We also demonstrate the generalization power of this data augmentation process, which can significantly improve the performance of feature-based methods. It is worth noting that the people who performed the measurements were not well trained, i.e., experimental errors were introduced into the data, which resembles practical application scenarios. With only six non-specific semiconductor gas sensors, the AFOA-enabled e-nose can discriminate Chinese liquors with high accuracy. It can also be concluded that the QD-based sensors are superior to the other sensors.

2. Methods
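As a rough illustration of the channel and temporal attention idea named in the introduction, the sketch below applies a per-channel gate followed by softmax-weighted pooling over time to a multichannel signal. It is not the AFOA implementation: it omits the convolutional and recurrent layers, and the function names, the per-channel gate parameters `w` and `b`, and the scoring vector `v` are hypothetical and set by hand rather than learned.

```python
import math
from typing import List

Signal = List[List[float]]  # shape: [sensor channels][time steps]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(x: Signal, w: List[float], b: List[float]) -> Signal:
    """Rescale each sensor channel by a gate in (0, 1).

    Simplifying assumption: each gate depends only on the time average of
    its own channel, through one hypothetical weight and bias per channel.
    """
    gates = [sigmoid(w_c * (sum(ch) / len(ch)) + b_c)
             for ch, w_c, b_c in zip(x, w, b)]
    return [[g * value for value in ch] for ch, g in zip(x, gates)]


def temporal_attention(x: Signal, v: List[float]) -> List[float]:
    """Pool over time with softmax weights; returns one value per channel.

    The score of a time step is a weighted sum of the channel values at
    that step, using the hypothetical scoring vector `v`.
    """
    n_steps = len(x[0])
    scores = [sum(v_c * ch[t] for ch, v_c in zip(x, v))
              for t in range(n_steps)]
    alpha = softmax(scores)
    return [sum(a * value for a, value in zip(alpha, ch)) for ch in x]


# Toy input: two channels, three time steps.
signal = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
gated = channel_attention(signal, w=[0.0, 0.0], b=[0.0, 0.0])
pooled = temporal_attention(gated, v=[0.0, 0.0])
```

With all parameters set to zero, every gate equals 0.5 and the temporal weights are uniform, so the sketch reduces to halving each channel and averaging over time. Nonzero parameters let some sensors and some phases of the sensing cycle contribute more than others.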