Screening and confirmatory machine learning for explainable modeling of non-cancer deaths in cancer patients

National Institutes of Health, National Cancer Institute Award Number: R37CA277812 (September 1, 2022 - August 31, 2026)

 

 

Contact Information

Lanjing Zhang, MD

Department of Chemical Biology
Ernest Mario School of Pharmacy

Rutgers University
Office Room #: 107, 164 Frelinghuysen Rd.

Piscataway, NJ 08854
E-mail: Lanjing.Zhang #at# rutgers.edu
URL: https://thezhanglab.github.io/

 

List of Contributors

·      Yongfeng Zhang, PhD, co-Investigator, School of Arts and Sciences, Rutgers University

·      Yong Lin, PhD, co-Investigator, School of Public Health, Rutgers University

·      Fei Deng, Postdoc research associate, School of Pharmacy, Rutgers University

·      Yunqi Li, PhD student, School of Arts and Sciences, Rutgers University

·      Mary (Nora) Disis, MD, co-Investigator, School of Medicine, Cancer Vaccine Institute, UW Medicine

·      Chao Cheng, PhD, co-Investigator, Baylor College of Medicine

·      Victoria VanUitert, PhD, Junior faculty mentee, Bowling Green State University, OH

·      Iris Shen, summer high school student, Winsor School, Boston, MA

Project Award Information

·       Award Number: R37CA277812

·       Duration: September 1, 2022 - August 31, 2026

·       Title: Screening and confirmatory machine learning for explainable modeling of non-cancer deaths in cancer patients

·       Keywords:  Machine learning, omics, cancer, survival

Project Summary

Because healthcare is high-stakes, the primary barrier to using machine learning (ML) in healthcare practice is the extremely low tolerance for errors, which demands extremely high sensitivity and specificity from any model. However, nearly all ML models focus on improving overall accuracy, and they cannot yet reach both extremely high sensitivity and extremely high specificity on healthcare data. Separate screening and confirmatory ML tools are therefore proposed to achieve very high sensitivity and specificity. Moreover, many ML algorithms, such as deep learning and neural networks, lack clear explanations and are unlikely to meet the FAIR criteria. Cancer is the second leading cause of death in the U.S. The number of cancer survivors continues to grow; unfortunately, so does the number of non-cancer deaths in cancer patients. However, nearly all 'omic and large population studies have focused on binary outcomes (cancer death or recurrence). Therefore, there is an urgent need to better understand and reduce non-cancer deaths in cancer patients using 'omic and population data. To address these problems, the project will develop screening and confirmatory ML to model cancer and non-cancer deaths in breast, colorectal, prostate and lung cancer patients using 'omic data and electronic health records (EHR). The proposed research will result in fundamental contributions to ML tools, workflows and methods that make novel use of 'omic and EHR data for cancer care. It addresses the urgent and timely need for precise reduction of non-cancer deaths. This project also uniquely addresses the Transformative Data Science research theme. The interdisciplinary collaboration in this project, as outlined in the Collaboration Plan, will offer a diverse basis for creative problem solving and validation. The proposal has 3 broader impacts: 1) The novel ML algorithms and technology developed will enable physicians to more precisely prognosticate and treat cancer patients based on their risk of multicategory deaths. 2) The research program will support and nurture undergraduate and graduate researchers. 3) The proposed research program will support high school and undergraduate students both in the conduct of research and in awareness of the usefulness of ML.

Relevance: The proposed research is relevant to public health because the development and better utilization of novel machine learning for classifying non-cancer deaths in cancer patients is expected to reduce morbidity and mortality in these patients. Thus, the proposed research is relevant to the part of the NIH's mission that pertains to developing fundamental knowledge that will help to lengthen human lives and reduce the burdens of human illness.
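The screening-then-confirmation idea in the Project Summary can be illustrated with a minimal Python sketch. This is not the project's algorithm: the function names, the two thresholds, and the toy scores are all hypothetical, and the sketch assumes that the screening model is tuned for high sensitivity (low threshold) while the confirmatory model is tuned for high specificity (high threshold).

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def two_stage_predict(
    screen_scores: Sequence[float],
    confirm_scores: Sequence[float],
    screen_threshold: float = 0.2,   # hypothetical: low cutoff keeps the screen sensitive
    confirm_threshold: float = 0.8,  # hypothetical: high cutoff keeps the confirmation specific
) -> List[int]:
    """Call a case positive only if it is flagged by the screen AND confirmed."""
    labels = []
    for s, c in zip(screen_scores, confirm_scores):
        flagged = s >= screen_threshold
        labels.append(1 if flagged and c >= confirm_threshold else 0)
    return labels


# Toy data (made up for illustration): 1 = non-cancer death, 0 = otherwise.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
screen_scores = [0.90, 0.60, 0.30, 0.50, 0.40, 0.10, 0.05, 0.15]
confirm_scores = [0.95, 0.85, 0.90, 0.30, 0.85, 0.90, 0.20, 0.10]

# Screening model alone, thresholded at the same low cutoff.
screen_only = [1 if s >= 0.2 else 0 for s in screen_scores]
two_stage = two_stage_predict(screen_scores, confirm_scores)

print("screen only:", sensitivity_specificity(y_true, screen_only))
print("two stage:  ", sensitivity_specificity(y_true, two_stage))
```

On this toy data the confirmatory step removes one of the two false positives that the screen lets through while keeping every true positive. In general, a rule that requires both stages to agree can never be more sensitive than the screen alone, so the screening threshold bounds the overall sensitivity.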

Publications and Products:

Note:  Full-text PDFs of all papers, where legally available, can be searched and downloaded from the PI's ResearchGate page.

Journal articles

·    Balasubramanian I, Bandyopadhyay S, Flores J, Smak JB, Lin X, Liu H, Sun S, Golovchenko NB, Liu Y, Wang D, Patel R, Joseph II, Suntornsaratoon P, Vargas J, Green PHR, Bhagat G, Lagana SM, Ying W, Zhang Y, Wang Z, Li WV, Singh S, Zhou Z, Kollias G, Farr LA, Moonah SN, Yu S, Wei Z, Ferraris R, Bonder EM, Zhang L, Kiela PR, Edelblum KL, Liu TL, Gao N. Infection and inflammation stimulate expansion of a CD74+ Paneth cell subset to regulate disease progression. EMBO J. 2023 Nov 2;42(21):e113975. doi: 10.15252/embj.2023113975. PMID: 37718683.

·    Hu K, Zhang L. Challenges and Opportunities Associated with Lifting the Zero COVID-19 Policy in China. Explor Res Hypothesis Med. 2024 Jan-Mar;9(1):71-75. doi: 10.14218/erhm.2023.00002. Epub 2023 Mar 8. PMID: 38572142; PMCID: PMC10989839.

·    Liang Y, Guo GL, Zhang L. Current and Emerging Molecular Markers of Liver Diseases: A Pathogenic Perspective. Gene Expression. 2022;21(1):9–19. doi: 10.14218/GEJLR.2022.00010. PMCID: PMC11192043.

·    Cui M, Deng F, Disis ML, Cheng C, Zhang L. Advances in the Clinical Application of High-throughput Proteomics. Explor Res Hypothesis Med (in press).

 

Project Impact

 

§  Education: We are training junior faculty, postdoctoral associates, and undergraduate, graduate and high school students. Each trainee is taught according to their level of research experience, background and interests. For junior faculty and postdoctoral research associates, we aim to help them obtain extramural grants and eventually become independent scientists. For high school, undergraduate and graduate students, we aim to help them gain experience and basic knowledge in machine learning and advance their careers in science, engineering and/or medicine. Most of the software and code developed in this project will be made publicly available (see below). All new progress will be added to the other research collections upon completion.

§  Collaborations: In this project we have established collaborations among several schools at Rutgers University, UW Medicine and Baylor College of Medicine. Through these collaborations we expect to explore many real-world applications and produce greater research impact.

 

Current and Future Activities

The following are some of the highlights of our ongoing work. 

1.      Develop highly sensitive and specific machine learning algorithms to classify non-cancer causes of death in cancer patients.

2.      Study effective and scalable methods for improving machine learning fairness.

Potential Related Project(s)

·       IIS-2128307 EAGER: Integration and analysis of high-dimensional dataset

 

Project Web site URL:  https://thezhanglab.github.io/R37.html

Online software:  Software developed in this project can be downloaded from https://github.com/FeiDeng-RUTGERS/.