Trustworthy deep learning framework for the detection of abnormalities in X-ray shoulder images

  1. Alzubaidi, Laith
  2. Salhi, Asma
  3. A. Fadhel, Mohammed
  4. Bai, Jinshuai
  5. Hollman, Freek
  6. Italia, Kristine
  7. Pareyon, Roberto
  8. Albahri, A. S.
  9. Ouyang, Chun
  10. Santamaría, Jose
  11. Cutbush, Kenneth
  12. Gupta, Ashish
  13. Abbosh, Amin
  14. Gu, Yuantong
Journal:
PLOS ONE

ISSN: 1932-6203

Year of publication: 2024

Volume: 19

Issue: 3

Pages: e0299545

Type: Article

DOI: 10.1371/JOURNAL.PONE.0299545 (open access)

Abstract

Musculoskeletal conditions affect an estimated 1.7 billion people worldwide, causing intense pain and disability. These conditions lead to 30 million emergency room visits yearly, and the number is rising. Diagnosing musculoskeletal issues can be challenging, however, especially in emergencies where quick decisions are necessary. Deep learning (DL) has shown promise in various medical applications, but previous methods for detecting shoulder abnormalities on X-ray images performed poorly and lacked transparency because of limited training data and inadequate feature representation. This often resulted in overfitting, poor generalisation, and potential bias in decision-making. To address these issues, a new trustworthy DL framework is proposed to detect shoulder abnormalities (such as fractures, deformities, and arthritis) in X-ray images. The framework consists of two parts: same-domain transfer learning (TL), to mitigate the mismatch with ImageNet, and feature fusion, to reduce error rates and improve trust in the final result. Same-domain TL involves training pre-trained models on a large number of labelled X-ray images from various body parts and then fine-tuning them on the target dataset of shoulder X-ray images. Feature fusion combines the features extracted by seven DL models to train several machine learning (ML) classifiers. The proposed framework achieved an accuracy of 99.2%, an F1-score of 99.2%, and a Cohen’s kappa of 98.5%. The results were further validated using three visualisation tools: gradient-weighted class activation mapping (Grad-CAM), activation visualisation, and local interpretable model-agnostic explanations (LIME). The proposed framework outperformed previous DL methods as well as three orthopaedic surgeons invited to classify the test set, who obtained an average accuracy of 79.1%. These results indicate that the framework is effective and robust, improving generalisation and increasing trust in the final results.
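
As a rough illustration of two ideas named in the abstract, the sketch below shows feature fusion as plain concatenation of per-model feature vectors, and computes the three reported metrics (accuracy, F1-score, Cohen's kappa) for a binary normal/abnormal task. It is not the authors' implementation: the function names, the choice of concatenation as the fusion rule, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def fuse_features(feature_sets):
    """Concatenate per-model feature vectors, sample by sample.

    feature_sets[m][i] is the feature vector that (hypothetical)
    model m produced for sample i. Concatenation is an assumed
    fusion rule, chosen here for simplicity.
    """
    n_samples = len(feature_sets[0])
    if any(len(fs) != n_samples for fs in feature_sets):
        raise ValueError("every model must provide features for every sample")
    return [
        [value for fs in feature_sets for value in fs[i]]
        for i in range(n_samples)
    ]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in labels) / n**2
    if p_expected == 1:
        return 1.0  # degenerate case: only one label present on both sides
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data (invented): two hypothetical extractors, two samples.
model_a = [[0.1, 0.2], [0.3, 0.4]]
model_b = [[1.0], [2.0]]
fused = fuse_features([model_a, model_b])
print(fused)

# Toy labels (invented): 1 = abnormal, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), cohen_kappa(y_true, y_pred))
```

In a real pipeline the fused vectors would be passed to one or more conventional classifiers. Cohen's kappa is reported alongside accuracy because it discounts the agreement expected by chance, which matters when the classes are imbalanced.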
