Risk Factors and Predictive Modeling for Post-Acute Sequelae of SARS-CoV-2 Infection: Findings from EHR Cohorts of the RECOVER Initiative.

Title: Risk Factors and Predictive Modeling for Post-Acute Sequelae of SARS-CoV-2 Infection: Findings from EHR Cohorts of the RECOVER Initiative.
Publication Type: Journal Article
Year of Publication: 2023
Authors: Zang C, Hou Y, Schenck E, Xu Z, Zhang Y, Xu J, Bian J, Morozyuk D, Khullar D, Nordvig A, Shenkman E, Rothman R, Block J, Lyman K, Zhang Y, Varma J, Weiner M, Carton T, Wang F, Kaushal R, Consortium TRecover
Journal: Res Sq
Date Published: 2023 Mar 08

Background: Patients infected with SARS-CoV-2 can develop newly incident conditions in the post-acute infection period. These conditions, denoted post-acute sequelae of SARS-CoV-2 infection (PASC), are highly heterogeneous and involve a diverse set of organ systems. Few studies have investigated the predictability of these conditions and their associated risk factors.
Method: In this retrospective cohort study, we investigated two large-scale PCORnet clinical research networks, INSIGHT and OneFlorida+, including 11 million patients in the New York City area and 16.8 million patients from Florida, to develop machine learning prediction models for those at risk of newly incident PASC and to identify factors associated with newly incident PASC conditions. Adult patients aged ≥20 with SARS-CoV-2 infection and those without recorded infection between March 1st, 2020, and November 30th, 2021, were used to identify factors associated with incident PASC after removing background associations. The predictive models were developed on infected adults.
Results: We found that several incident PASC conditions, e.g., malnutrition, COPD, dementia, and acute kidney failure, were associated with severe acute SARS-CoV-2 infection, defined by hospitalization and ICU stay. Older age and extremes of weight were also associated with these incident conditions. These conditions were the most predictable (C-index >0.8). Moderately predictable conditions included diabetes and thromboembolic disease (C-index 0.7-0.8); these were associated with a wider variety of baseline conditions. Less predictable conditions included fatigue, anxiety, sleep disorders, and depression (C-index around 0.6).
Conclusions: This observational study suggests that a set of likely risk factors for different PASC conditions was identifiable from EHRs, that the predictability of different PASC conditions was heterogeneous, and that machine learning-based predictive models might help identify patients at risk of developing incident PASC.
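The abstract reports predictability as a C-index (concordance index) but does not say how it was computed. As a rough illustration only, the sketch below implements a simplified form of Harrell's C for right-censored data: among comparable patient pairs, it measures how often the patient with the higher predicted risk is the one whose event came first. The function name, the tie handling, and the toy data are assumptions made for this example and are not taken from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, not the study's code).

    times       -- follow-up time for each patient
    events      -- 1 if the condition was observed, 0 if the patient was censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification: skip pairs with tied times, and pairs where the
        # earlier patient was censored (their true ordering is unknown).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four patients, the third one is censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
print(toy_c)  # 0.8: four of the five comparable pairs are ranked correctly
```

On this scale, 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the sense in which a C-index above 0.8 indicates a more predictable condition than one around 0.6.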

Alternate Journal: Res Sq
PubMed ID: 36945608
PubMed Central ID: PMC10029117
Institute of Artificial Intelligence for Digital Health
Faculty Publication