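Research progress and prospect analysis of deep learning technology in the application of pharyngeal and laryngeal endoscopy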

doi:10.6040/j.issn.1673-3770.0.2023.470

Abstract

Abstract: The emergence of deep learning has greatly advanced medical care, particularly medical examination, and several areas of otorhinolaryngology head and neck surgery have benefited from it. Over the past five years, deep learning-based analysis of pharyngeal and laryngeal endoscopy data has produced a number of highly effective attempts. This article reviews deep learning-based applications of pharyngeal and laryngeal endoscopy and related research from the past five years, analyzes the progress of research in this field, and divides its development into three stages: the emergence of neural networks, the integration of neural networks with medicine, and the development of neural networks toward applicability. It discusses current research bottlenecks from three perspectives (clinical factors, sample information, and other factors), sets out possible future solutions and development prospects, identifies the main obstacles to applying deep learning in pharyngeal and laryngeal endoscopy, and outlines likely future trends such as multicenter studies, multitask learning, and high-quality data collection.

Key words: Pharyngeal and laryngeal endoscopy, Artificial intelligence, Deep learning, Otorhinolaryngology head and neck surgery, Computer-assisted diagnosis

CLC number: R762

CHENG Zhuo, LIANG Hui, XING Lumin. Research progress and prospect analysis of deep learning technology in the application of pharyngeal and laryngeal endoscopy[J]. Journal of Otolaryngology and Ophthalmology of Shandong University, 2026, 40(1): 112-119.

