Thanh Thieu, PhD
Program: Machine Learning
Research Program: Health Outcomes & Behavior Program
Overview
Dr. Thieu’s research centers on using natural language processing (NLP) and multi-modal machine learning to advance the forefront of cancer informatics. His work has produced cutting-edge methods for cancer registry, ICD code identification, functional status identification, disease progression, disease recurrence, and early-prediction modeling.
Associations
- Machine Learning
- Health Outcomes & Behavior Program
Education & Training
Graduate:
- University of Missouri, PhD - Computer Science
Fellowship:
- National Institutes of Health Clinical Center - Health Informatics
Publications
- Zitu MM, Le TD, Duong T, Haddadan S, Garcia M, Amorrortu R, Zhao Y, Rollison DE, Thieu T. Large language models in cancer: potentials, risks, and safeguards. BJR Artif Intell. 2025 Jan.2(1):ubae019. Pubmedid: 39777117. Pmcid: PMC11703354.
- Lu Y, Duong T, Miao Z, Thieu T, Lamichhane J, Ahmed A, Delen D. A novel hyperparameter search approach for accuracy and simplicity in disease prediction risk scoring. J Am Med Inform Assoc. 2024 Aug.31(8):1763-1773. Pubmedid: 38899502. Pmcid: PMC11258418.
- Amorrortu R, Garcia M, Zhao Y, El Naqa I, Balagurunathan Y, Chen DT, Thieu T, Schabath MB, Rollison DE. Overview of approaches to estimate real-world disease progression in lung cancer. JNCI Cancer Spectr. 2023 Oct.7(6). Pubmedid: 37738580. Pmcid: PMC10637832.
- Le TD, Nguyen PD, Korkin D, Thieu T. PHILM2Web: A high-throughput database of macromolecular host-pathogen interactions on the Web. Database (Oxford). 2022 Jun.2022. Pubmedid: 35776535. Pmcid: PMC9248916.
- Thieu T, Maldonado JC, Ho PS, Ding M, Marr A, Brandt D, Newman-Griffis D, Zirikly A, Chan L, Rasch E. A comprehensive study of mobility functioning information in clinical notes: Entity hierarchy, corpus annotation, and sequence labeling. Int J Med Inform. 2021 Mar.147:104351. Pubmedid: 33401169. Pmcid: PMC8104034.
- Newman-Griffis D, Porcino J, Zirikly A, Thieu T, Camacho Maldonado J, Ho PS, Ding M, Chan L, Rasch E. Broadening horizons: the case for capturing function and the role of health informatics in its use. BMC Public Health. 2019 Oct.19(1):1288. Pubmedid: 31615472. Pmcid: PMC6794808.
- Thieu T, Joshi S, Warren S, Korkin D. Literature mining of host-pathogen interactions: comparing feature-based supervised learning and language-based approaches. Bioinformatics. 2012 Mar.28(6):867-875. Pubmedid: 22285561.
Grants
- Title: Extending the Capabilities and Reach of EMERSE in Support of Cancer Research
Award Number: 5U24CA269315-02
Sponsor: National Institutes of Health (NIH)
Thieu, T. (PD/PI)
- Title: Applying Large Language Models to Accelerate Abstraction of Cancer Pathology Reports for Cancer Registry (LLMs for Unstructured Data Extraction)
Award Number: 3P30CA076292-25S4
Sponsor: National Cancer Institute (NCI)
Cleveland, J. (PD/PI), Thieu, T. (PD/PI)
- Title: Extending the Capabilities and Reach of EMERSE in Support of Cancer Research