Population Health Sciences

Supervised Pretraining through Contrastive Categorical Positive Samplings to Improve COVID-19 Mortality Prediction.

Title	Supervised Pretraining through Contrastive Categorical Positive Samplings to Improve COVID-19 Mortality Prediction.
Publication Type	Journal Article
Year of Publication	2022
Authors	Wanyan T, Lin M, Klang E, Menon KM, Gulamali FF, Azad A, Zhang Y, Ding Y, Wang Z, Wang F, Glicksberg B, Peng Y
Journal	ACM BCB
Volume	2022
Date Published	2022 Aug
Abstract	Clinical EHR data is naturally heterogeneous and contains abundant sub-phenotypes. Such diversity creates challenges for outcome prediction with machine learning models because it leads to high intra-class variance. To address this issue, we propose a supervised pre-training model with a unique embedded k-nearest-neighbor positive sampling strategy. We demonstrate the value of this framework theoretically and show that it yields highly competitive experimental results in predicting patient mortality on real-world COVID-19 EHR data from over 7,000 patients admitted to a large, urban health system. Our method achieves an AUROC of 0.872, outperforming alternative pre-training models and traditional machine learning methods. Additionally, our method performs much better when the training set is small (345 training instances).
DOI	10.1145/3535508.3545541
Alternate Journal	ACM BCB
PubMed ID	35960866
PubMed Central ID	PMC9365529
Grant List	R00 LM013001 / LM / NLM NIH HHS / United States

Division:

Institute of Artificial Intelligence for Digital Health

Category:

Faculty Publication