Thanh Thieu, PhD

Program: Machine Learning

Research Program: Health Outcomes & Behavior Program

  • Overview

    Dr. Thieu’s research centers on using natural language processing (NLP) and multi-modal machine learning to advance cancer informatics. His work has produced methods for cancer registry, ICD code identification, functional status identification, disease progression, disease recurrence, and early-prediction modeling.

    Associations

    • Machine Learning
    • Health Outcomes & Behavior Program

    Education & Training

    Graduate:

    • University of Missouri, PhD - Computer Science

    Fellowship:

    • National Institutes of Health Clinical Center - Health Informatics
  • Publications

    • Zitu MM, Le TD, Duong T, Haddadan S, Garcia M, Amorrortu R, Zhao Y, Rollison DE, Thieu T. Large language models in cancer: potentials, risks, and safeguards. BJR Artif Intell. 2025 Jan;2(1):ubae019. PMID: 39777117. PMCID: PMC11703354.
    • Lu Y, Duong T, Miao Z, Thieu T, Lamichhane J, Ahmed A, Delen D. A novel hyperparameter search approach for accuracy and simplicity in disease prediction risk scoring. J Am Med Inform Assoc. 2024 Aug;31(8):1763-1773. PMID: 38899502. PMCID: PMC11258418.
    • Amorrortu R, Garcia M, Zhao Y, El Naqa I, Balagurunathan Y, Chen DT, Thieu T, Schabath MB, Rollison DE. Overview of approaches to estimate real-world disease progression in lung cancer. JNCI Cancer Spectr. 2023 Oct;7(6). PMID: 37738580. PMCID: PMC10637832.
    • Le TD, Nguyen PD, Korkin D, Thieu T. PHILM2Web: A high-throughput database of macromolecular host-pathogen interactions on the Web. Database (Oxford). 2022 Jun;2022. PMID: 35776535. PMCID: PMC9248916.
    • Thieu T, Maldonado JC, Ho PS, Ding M, Marr A, Brandt D, Newman-Griffis D, Zirikly A, Chan L, Rasch E. A comprehensive study of mobility functioning information in clinical notes: Entity hierarchy, corpus annotation, and sequence labeling. Int J Med Inform. 2021 Mar;147:104351. PMID: 33401169. PMCID: PMC8104034.
    • Newman-Griffis D, Porcino J, Zirikly A, Thieu T, Camacho Maldonado J, Ho PS, Ding M, Chan L, Rasch E. Broadening horizons: the case for capturing function and the role of health informatics in its use. BMC Public Health. 2019 Oct;19(1):1288. PMID: 31615472. PMCID: PMC6794808.
    • Thieu T, Joshi S, Warren S, Korkin D. Literature mining of host-pathogen interactions: comparing feature-based supervised learning and language-based approaches. Bioinformatics. 2012 Mar;28(6):867-875. PMID: 22285561.
  • Grants

    • Title: Extending the Capabilities and Reach of EMERSE in Support of Cancer Research
      Award Number: 5U24CA269315-02
      Sponsor: National Institutes of Health (NIH)
      Thieu, T. (PD/PI)
    • Title: Applying Large Language Models to Accelerate Abstraction of Cancer Pathology Reports for Cancer Registry (LLMs for Unstructured Data Extraction)
      Award Number: 3P30CA076292-25S4
      Sponsor: National Cancer Institute (NCI)
      Cleveland, J. (PD/PI), Thieu, T. (PD/PI)
