Cochleogram-based adventitious sounds classification using convolutional neural networks

  1. Mang, L.D.
  2. Canadas-Quesada, F.J.
  3. Carabias-Orti, J.J.
  4. Combarro, E.F.
  5. Ranilla, J.
  1. Department of Telecommunication Engineering, University of Jaen, Campus Cientifico-Tecnologico de Linares, Avda. de la Universidad, s/n, Linares (Jaen), 23700, Spain
  2. Department of Computer Science, University of Oviedo, Campus de Gijón s/n, Gijon (Asturias), 33203, Spain
Journal: Biomedical Signal Processing and Control

ISSN: 1746-8094

Year of publication: 2023

Volume: 82

Pages: 104555

Type: Article

DOI: 10.1016/j.bspc.2022.104555 (open access)

Abstract

Background: The World Health Organization (WHO) ranks the early detection of respiratory diseases as a top priority. This detection can be performed by recognizing acoustic bio-markers (adventitious sounds) in auscultation recordings, since auscultation remains the main technique applied in health centers to assess the status of the respiratory system: it is non-invasive, low-cost, easy to apply, fast, and safe.

Method: Despite the novel deep learning approaches applied in this biomedical field, there is a notable lack of research that rigorously compares time–frequency representations to determine the most suitable transformation for feeding data into Convolutional Neural Network (CNN) architectures. In this paper, we propose the cochleogram, based on modeling the frequency selectivity of the human cochlea, as an improved time–frequency representation to optimize the learning process of a CNN model in the classification of respiratory adventitious sounds. Our proposal is evaluated on the largest and most challenging public database of respiratory sounds.

Results: The cochleogram obtains the best binary classification results among the compared methods, with an average accuracy of 85.1% for wheezes and 73.8% for crackles, and achieves competitive performance in a multiclass classification scenario compared with other well-known state-of-the-art deep learning models.

Conclusion: The cochleogram provides a suitable time–frequency representation: its non-uniform spectral resolution models respiratory adventitious content more accurately, and it is more robust to noise and acoustic changes. This implies a significant improvement in the learning process of CNN models applied to the classification of respiratory adventitious sounds.
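
As a rough illustration of the representation the paper builds on, the sketch below computes a cochleogram by filtering a signal through an ERB-spaced gammatone filterbank (Glasberg and Moore ERB scale, 4th-order gammatone filters in the style of Patterson et al.) and taking log frame energies per channel. The sampling rate, filter count, frame length, and hop are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal cochleogram sketch: ERB-spaced gammatone filterbank + framed log energies.
# All parameter values below are illustrative assumptions.
import numpy as np
from scipy.signal import fftconvolve

def erb_space(low_hz, high_hz, n_filters):
    """Center frequencies equally spaced on the ERB-rate scale (Glasberg & Moore)."""
    ear_q, min_bw = 9.26449, 24.7
    lo = ear_q * np.log(1 + low_hz / (ear_q * min_bw))
    hi = ear_q * np.log(1 + high_hz / (ear_q * min_bw))
    erbs = np.linspace(lo, hi, n_filters)          # linear on the ERB-rate axis
    return ear_q * min_bw * (np.exp(erbs / ear_q) - 1)

def gammatone_ir(fc, fs, duration=0.064, order=4):
    """Unit-energy impulse response of a 4th-order gammatone filter centered at fc."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * 24.7 * (4.37 * fc / 1000 + 1)      # equivalent rectangular bandwidth at fc
    ir = t**(order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir**2))             # equalize channel gains

def cochleogram(x, fs, n_filters=64, frame=0.020, hop=0.010):
    """Log energy of each gammatone channel output, framed over time."""
    fcs = erb_space(50.0, 0.9 * fs / 2, n_filters)
    flen, fhop = int(frame * fs), int(hop * fs)
    n_frames = 1 + (len(x) - flen) // fhop
    C = np.empty((n_filters, n_frames))
    for i, fc in enumerate(fcs):
        y = fftconvolve(x, gammatone_ir(fc, fs), mode="same")
        for j in range(n_frames):
            seg = y[j * fhop : j * fhop + flen]
            C[i, j] = np.log10(np.mean(seg**2) + 1e-10)
    return C  # shape: (n_filters, n_frames), channels ordered low to high frequency

# Example: 5 s of noise standing in for a lung-sound excerpt, sampled at 4 kHz
fs = 4000
x = np.random.randn(5 * fs)
print(cochleogram(x, fs).shape)  # (64, 499)
```

The ERB spacing is what gives the cochleogram its non-uniform spectral resolution: channels are dense at low frequencies and sparse at high frequencies, mirroring the selectivity of the human cochlea mentioned in the abstract.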
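Such a cochleogram excerpt can then be fed to a CNN classifier. Below is a minimal PyTorch sketch assuming hypothetical 64×64 single-channel input patches; the layer sizes and depth are illustrative choices and do not reproduce the architecture evaluated in the paper.

```python
# Small CNN over cochleogram patches for binary classification
# (e.g., wheeze present vs. absent) — an illustrative sketch only.
import torch
import torch.nn as nn

class CochleoCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.BatchNorm2d(16),
            nn.ReLU(), nn.MaxPool2d(2),                  # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32),
            nn.ReLU(), nn.MaxPool2d(2),                  # 32x32 -> 16x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(64, n_classes),   # logits for 2 classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example forward pass on a batch of 8 single-channel cochleogram patches
model = CochleoCNN()
logits = model(torch.randn(8, 1, 64, 64))
print(logits.shape)  # torch.Size([8, 2])
```

With two output classes this matches the binary setting reported in the abstract (one detector per adventitious sound type); a multiclass variant would simply raise n_classes.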
