Journal of Otolaryngology and Ophthalmology of Shandong University ›› 2026, Vol. 40 ›› Issue (1): 112-119. doi: 10.6040/j.issn.1673-3770.0.2023.470

• Review •

Research progress and prospect analysis of deep learning technology in the application of pharyngeal and laryngeal endoscopy

CHENG Zhuo1,2, LIANG Hui2, XING Lumin3

  1. Department of Otorhinolaryngology & Head and Neck Surgery, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan 250021, Shandong, China;
    2. Department of Otorhinolaryngology & Head and Neck Surgery, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, Shandong, China;
    3. Department of Information, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, Shandong, China
  • Published: 2026-02-13
  • Corresponding author: LIANG Hui. E-mail: onlinelh@163.com

Abstract: The emergence of deep learning has greatly advanced medical care, particularly medical examination, and several areas of otorhinolaryngology head and neck surgery have benefited. Over the past five years, deep learning-based analysis of pharyngeal and laryngeal endoscopy data has produced notably effective attempts. This review takes the applications of deep learning in pharyngeal and laryngeal endoscopy, and related research from the past five years, as its subject. It analyzes how research in the field has progressed and divides its development into three stages: the emergence of neural networks, the integration of neural networks with medicine, and the development of neural networks toward applicability. It then discusses current research bottlenecks from three perspectives (clinical factors, sample information, and other factors), describes possible solutions and prospects, identifies the main obstacles to applying deep learning in pharyngeal and laryngeal endoscopy, and outlines possible future trends such as multicenter research, multitask learning, and high-quality data collection.

Key words: Pharyngeal and laryngeal endoscopy, Artificial intelligence, Deep learning, Otorhinolaryngology head and neck surgery, Computer-assisted diagnosis

CLC Number:

  • R762