Research appearing in the ninth annual health information technology issue of The American Journal of Managed Care shows health systems can predict which patients will use the most healthcare just by collecting their age, gender, race, and address.
A “machine learning” model from Jvion Inc. predicted which patients were most likely to be admitted to the hospital or visit the emergency department (ED) over a 90-day period, based on an algorithm fueled by easily obtained socioeconomic data: each patient’s age, gender, race and address.
The study, appearing this week in the ninth annual health information technology issue of The American Journal of Managed Care, potentially offers health systems and policy makers a way to target groups of patients with “specific, individualized interventions to tackle detrimental social determinants of health,” at both the household and neighborhood levels.
The researchers found that among more than 20 demographic and socioeconomic determinants of health, air quality (both particles and ozone) was the strongest predictor of whether a person was likely to visit the ED or be hospitalized. Although the model’s developers expected air quality to be an important predictor of healthcare use, they did not expect it to be the most important one, outranking income, which was second.
As Medicare and other insurers move toward payment models that reward overall population health, systems are addressing social determinants of health directly, such as Kaiser Permanente’s investment in affordable housing in Oakland, Calif. However, recent reports say proposed changes to the National Environmental Policy Act could weaken requirements to assess air quality and other impacts of development.
The Jvion researchers say their results give health systems and government planners a tool to “predict hospital and ED utilization without data on clinical risk factors.”
“Instead,” they write, “predictive features are based on publicly available socioeconomic determinants of care and purchasable behavioral data.” Healthcare risk can be gleaned without having to engage the patient directly, which could ease burdens on staff.
Testing the Model
Authors from Jvion used deidentified data from health systems in Ohio, Georgia and Alabama, randomly divided into two groups: training data (about 70%) and testing data (about 30%). Overall, the data covered more than 138,000 patients who had at least one ambulatory, ED or inpatient visit to one of the health systems in November 2018.
A proprietary model was then trained to predict healthcare utilization on the training data set, so that its predictions could be confirmed against the testing data. The researchers used only age group, gender, race and address to make their predictions, combining the claims data with previously collected information from the US Census Bureau, the US Department of Agriculture and the National Oceanic and Atmospheric Administration.
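For readers who want to see what a random split of roughly 70% training and 30% testing data looks like in practice, here is a minimal Python sketch. It is an illustration only, not the Jvion method: the function name split_patients, the fixed seed and the made-up patient IDs are all assumptions introduced for this example.

```python
import random


def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient IDs into training and testing lists.

    Hypothetical helper for illustration; the study's actual
    splitting procedure is not described in this article.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)           # random order removes any ordering bias
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


# Made-up example: 1,000 placeholder patient IDs, not study data.
train_ids, test_ids = split_patients(range(1000))
```

Keeping the testing patients entirely separate from the training patients is what allows the testing results to serve as a check on the model's predictions.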
Results showed that the event rate for the primary outcome, inpatient admission or ED admission within 90 days, was 4.7%; for secondary outcomes, event rates were 3.2% for an ED visit within 90 days, 2.0% for any inpatient visit within 90 days and 1.5% for an avoidable inpatient visit in 90 days.
The research team found the area under the curve (AUC) was nearly identical for both the training and testing data sets (0.84 vs 0.83) for the primary outcome. Similar AUC alignment was found for the secondary measures. Sensitivity values for the primary and secondary outcome measures were 0.79, 0.82, 0.73 and 0.75, respectively.
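To make the two reported metrics concrete, the sketch below computes AUC and sensitivity from scratch in Python. AUC is calculated here as the share of patient pairs (one with the event, one without) in which the patient with the event received the higher risk score; sensitivity is the share of patients with the event whose score met a cutoff. The function names, the toy labels and scores, and the 0.5 cutoff are assumptions for illustration; the article does not state which cutoff the researchers used.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 if the patient had the event, 0 otherwise.
    scores: the model's predicted risk for each patient.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # event patient ranked higher
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def sensitivity(labels, scores, threshold):
    """Share of true events that the model flags at a given cutoff."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    if tp + fn == 0:
        raise ValueError("no patients with the event")
    return tp / (tp + fn)


# Toy data for illustration only; these are not study results.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.05]

toy_auc = auc(toy_labels, toy_scores)
toy_sensitivity = sensitivity(toy_labels, toy_scores, threshold=0.5)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect ranking, which is why nearly identical values on the training and testing sets suggest the model was not simply memorizing its training data.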