How EHRs and Machine Learning Can Identify PrEP Candidates

July 17, 2019

New tools can help identify patients who are at high risk for HIV but are not using PrEP.

A new modelling study used electronic health record (EHR) data and machine learning to identify candidates for HIV pre-exposure prophylaxis (PrEP).

The study, published in The Lancet, applied machine learning to EHR data from 3.7 million patients at Kaiser Permanente Northern California, a large integrated healthcare system, during 2007 to 2017 to identify patients who were at high risk of HIV but not using PrEP.

The researchers extracted demographic and clinical data from these patients’ EHRs on over 80 potential predictors of HIV risk. They split the sample by time period, developing an HIV prediction model with 2007 to 2014 data and validating it with 2015 to 2017 data. A machine learning algorithm automated the selection of important HIV risk predictors for the final model.
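For readers who want to see what a split by time period looks like in practice, here is a minimal sketch. It is not the researchers' code: the record fields (`patient_id`, `year`, `hiv`) and the function name `split_by_year` are hypothetical, and the toy records are invented purely for illustration.

```python
from typing import Dict, List, Tuple

Record = Dict[str, int]


def split_by_year(
    records: List[Record],
    dev_years: Tuple[int, int] = (2007, 2014),
    val_years: Tuple[int, int] = (2015, 2017),
) -> Tuple[List[Record], List[Record]]:
    """Split records into development and validation sets by calendar year.

    Year ranges are inclusive. Records outside both ranges are dropped.
    """
    development = [r for r in records if dev_years[0] <= r["year"] <= dev_years[1]]
    validation = [r for r in records if val_years[0] <= r["year"] <= val_years[1]]
    return development, validation


# Toy records (hypothetical field names, made-up values).
records = [
    {"patient_id": 1, "year": 2008, "hiv": 0},
    {"patient_id": 2, "year": 2014, "hiv": 1},
    {"patient_id": 3, "year": 2015, "hiv": 0},
    {"patient_id": 4, "year": 2017, "hiv": 1},
    {"patient_id": 5, "year": 2006, "hiv": 0},  # outside both windows
]

development, validation = split_by_year(records)
print(len(development), len(validation))
```

Validating on later years, rather than on a random subset, tests whether a model built on older data still holds up on newer patients.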

The machine learning algorithm retained 44 variables in the final model, including sex, race, living in a neighborhood with high HIV incidence, use of medications for erectile dysfunction, and sexually transmitted infection testing and positivity. The full model had excellent discrimination between HIV cases and non-cases, with a C-statistic of 0.84 in the validation dataset.
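A C-statistic can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with 0.5 meaning no better than chance and 1.0 meaning perfect separation. The sketch below computes it by comparing every case with every non-case. The function name `c_statistic` and the scores are illustrative assumptions, not data or code from the study.

```python
from typing import List


def c_statistic(scores: List[float], labels: List[int]) -> float:
    """Fraction of (case, non-case) pairs in which the case scores higher.

    Tied scores count as half a concordant pair. labels: 1 = case, 0 = non-case.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for non_case_score in non_cases:
            if case_score > non_case_score:
                concordant += 1.0
            elif case_score == non_case_score:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Made-up risk scores for five patients; three are cases.
scores = [0.9, 0.4, 0.7, 0.2, 0.1]
labels = [1, 0, 1, 1, 0]
result = c_statistic(scores, labels)
print(round(result, 3))  # 5 of 6 pairs are concordant -> 0.833
```

The pairwise loop is fine for a toy example; with millions of patients, a rank-based calculation would be the practical choice.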

“We found that predictive modeling can be used to identify patients who are at high risk of HIV acquisition and may be good candidates for PrEP,” says lead study author Julia Marcus, PhD, MPH, adjunct faculty at The Fenway Institute, in Boston, and an assistant professor in the department of population medicine at Harvard Medical School and Harvard Pilgrim Health Care Institute. “Our study suggests that HIV prediction models could be embedded into EHRs as an automated screening tool to help identify the subset of patients most likely to benefit from PrEP.”

By flagging 2.2% of the general patient population as potential PrEP candidates, the researchers’ model identified nearly half of male HIV cases.
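The trade-off described here, flagging a small share of patients and seeing what share of eventual cases falls inside that group, can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the function name `sensitivity_at_top_fraction` is hypothetical, the scores are invented, and the toy example flags the top 20% rather than 2.2% because it has only ten patients.

```python
import math
from typing import List


def sensitivity_at_top_fraction(
    scores: List[float], labels: List[int], fraction: float
) -> float:
    """Share of all cases found among the highest-scoring `fraction` of patients."""
    total_cases = sum(labels)
    if total_cases == 0:
        raise ValueError("need at least one case")

    # Number of patients to flag, rounded up so at least one is flagged.
    n_flag = math.ceil(fraction * len(scores))

    # Patient indices ordered from highest to lowest risk score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = ranked[:n_flag]

    cases_flagged = sum(labels[i] for i in flagged)
    return cases_flagged / total_cases


# Ten made-up patients, two of whom are cases.
scores = [0.95, 0.10, 0.80, 0.05, 0.20, 0.15, 0.60, 0.02, 0.30, 0.01]
labels = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
share = sensitivity_at_top_fraction(scores, labels, fraction=0.2)
print(share)  # the top 2 patients include 1 of the 2 cases -> 0.5
```

Raising the flagged fraction catches more cases but also sends more patients who will never acquire HIV to clinicians for review.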

“We also showed the added value of rich EHR data for identification of potential PrEP candidates, with a full model outperforming simpler models based on only sexual orientation and recent bacterial sexually transmitted infections,” Marcus tells Managed Healthcare Executive. “The full model performed equally well in predicting HIV risk for black and white patients, whereas simpler models did not perform as well for black patients.”


PrEP is more than 90% effective in preventing HIV infection, but of 1.1 million people in the U.S. who are eligible for PrEP, only 7% used it in 2016, according to Marcus. The federal government recently announced the “Ending the HIV Epidemic: A Plan for America” initiative, which aims to reduce new HIV infections by 90% by 2030.

Scaling up PrEP is a key pillar of that initiative, says Marcus.

“One barrier to PrEP prescribing is that providers have difficulty identifying patients who are at high risk of HIV acquisition,” she says. “Risk prediction tools are often used in other areas of medicine, such as cardiovascular disease, to help providers identify patients who might benefit from preventive care. However, existing HIV risk prediction tools have limitations, including racial bias, and are not routinely used. Automated prediction tools that use EHR data to identify potential PrEP candidates could shift the paradigm for how PrEP is prescribed in the U.S., ultimately improving PrEP uptake and reducing new HIV infections.”

Prediction models are routinely used in other areas of medicine, including cardiovascular disease, fracture risk, and sepsis, according to Marcus.

Algorithms, machine learning, and predictive models are all approaches that could help improve patient care and move clinicians and hospitals to be more proactive, according to Deborah (Deb) Pasko, PharmD, MHA, director, Performance Center and Optimization Services at Omnicell. “A proactive approach can identify patients at risk of illness to the point that the illness itself could be completely prevented or the symptoms lessened to lead to enhanced quality of life and decreased cost to the healthcare system,” Pasko says. “Pharmacogenomics and the use of machine learning can optimize medication prescribing that is truly tailored to a specific patient (precision medicine). The ultimate outcome would be optimizing population health outcomes.”

One could argue about which mathematical approaches are most predictive, but the bottom line is using data in a meaningful, systematic way, according to Pasko. “Machine learning can definitely make huge strides for the care of patients but we are still living in an age where clinical judgement and human interactions are still needed and valuable,” she says.