Journal of Otolaryngology and Ophthalmology of Shandong University, 2026, Vol. 40, Issue (1): 112-119. doi: 10.6040/j.issn.1673-3770.0.2023.470

• Review •

Research progress and prospect analysis of deep learning technology in the application of pharyngeal and laryngeal endoscopy

CHENG Zhuo1,2, LIANG Hui2, XING Lumin3   

  1. Department of Otorhinolaryngology & Head and Neck Surgery, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan 250021, Shandong, China
  2. Department of Otorhinolaryngology & Head and Neck Surgery, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, Shandong, China
  3. Department of Information, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, Shandong, China
  • Published: 2026-02-13

Abstract: Deep learning has substantially improved the quality of medical care, particularly in medical examination, and several areas of otolaryngology and head and neck surgery have benefited from it. Over the past five years, deep learning has been applied effectively to the endoscopic analysis of the pharynx and larynx. This article reviews deep learning-based studies and related work on pharyngeal and laryngeal endoscopy from the past five years, analyzes research progress in the field, and divides its development into three stages: the emergence of neural networks, the integration of neural networks with medicine, and the development of neural network applicability. It then discusses current research bottlenecks from three aspects, including clinical factors and sample information, identifies the main obstacles to applying deep learning in pharyngeal and laryngeal endoscopic research, and outlines possible solutions and future development trends such as multicenter research, multitask learning, and the collection of high-level data information.

Key words: Pharyngeal and laryngeal endoscopy, Artificial intelligence, Deep learning, Otorhinolaryngology head and neck surgery, Computer-assisted diagnosis

CLC Number: R762