With big data too much is never enough

August 1, 2013

Meanwhile, the data collected isn't exactly in the right format for secondary use

Recently the Institute of Medicine of the National Academies made a point that in healthcare, you can never have too much data.

Michael Murray, executive director of the Regenstrief Center for Healthcare Effectiveness Research at the Regenstrief Institute, says the aim of the report, “Making the Case for Continuous Learning from Routinely Collected Data,” is to engage the public in the idea that secondary uses of their health data could benefit all types of patients. Not sharing information could be a missed opportunity.

IOM authors note that patients and the public can be effective advocates for resetting expectations and encouraging use of routinely collected data to improve care and learn from others' experiences. In fact, most patients believe that medical records are already being mined for continuous learning, even though such big data efforts are still in their infancy. 

“There are certain hazards out there that the only way you can understand them is by looking at large repositories of data,” Murray says.

For example, when the drug Vioxx was determined to increase the risk of heart attacks, researchers believed the risk could have been detected sooner if trial data and claims data had been analyzed.

In addition to finding methods for preventing harm, large data studies could:

• Improve disease monitoring and tracking;

• Better target medical services for improved health outcomes and cost savings;

• Help inform both patients and clinicians to improve how they make decisions during clinical visits;

• Avoid harm to patients and unnecessary costs associated with repeat testing and delivery of unsuccessful treatments; and

• Accelerate and improve the use of research in routine medical care to answer medical questions more effectively and efficiently.

Aggregated data, also known as big data, can be modeled to mimic real-world applications.

“There may be certain things we’re doing that we need to broadcast that are really helping people. If they’re done in one healthcare system, another healthcare system can benefit from using the same approach,” Murray explains.

Daryl Wansink, director of healthcare research and evaluation at Blue Cross and Blue Shield of North Carolina (BCBSNC), agrees with the potential benefits of shared healthcare information. But the devil is in the details, as well as in the execution, he says. Sharing data isn’t as simple as it seems, even beyond patients’ privacy concerns.

Many data systems still don’t talk to each other. Earlier this year, a Black Book Market Research survey showed that 23% of physician practices were frustrated enough with their EHRs to consider changing vendors. Dissatisfied users reported problems interfacing with other software, overly complex connectivity and concerns related to integration with mobile devices.

Meanwhile, the data that is currently collected often isn’t optimized for secondary purposes. On top of that, data can be incomplete or duplicated and skew the intelligence that results from the analysis.
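The skew caused by duplicated records can be illustrated with a minimal sketch. The record layout and the choice of deduplication key here are assumptions for illustration only, not any particular system’s schema:

```python
# Hypothetical claim-line records; a real feed might repeat a row
# when a claim is resubmitted or loaded twice.
rows = [
    {"member_id": "A001", "service_date": "2013-01-05", "code": "250.00"},
    {"member_id": "A001", "service_date": "2013-01-05", "code": "250.00"},  # duplicate
    {"member_id": "B002", "service_date": "2013-02-01", "code": "401.9"},
]

# Deduplicate on a composite key; duplicate rows would otherwise
# inflate utilization counts and skew any downstream statistics.
seen = set()
deduped = []
for row in rows:
    key = (row["member_id"], row["service_date"], row["code"])
    if key not in seen:
        seen.add(key)
        deduped.append(row)

print(len(rows), len(deduped))  # 3 2
```

Incompleteness is harder: a missing field can only be flagged, not reconstructed, which is why analyses on such data need explicit quality checks up front.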

Wansink notes that although BCBSNC fully supports the North Carolina Health Information Exchange and has executives on its board, progress toward leveraging data across the community has not been swift.

“It’s the reality of trying to integrate so many disparate systems,” he says. “It’s a very challenging goal to integrate the information, even regionally.”

One difficulty comes with using claims data for clinical purposes, according to Wansink, because it is transactional coding that lacks comprehensive clinical detail. For example, a person without diabetes can have a claim with a diabetes diagnosis code on it, possibly due to an error or a rule-out diagnosis.
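One common way analysts guard against rule-out and erroneous codes is to require diagnosis codes on multiple distinct service dates before treating a member as having the condition. The sketch below assumes that heuristic; the sample data, the `likely_diabetic` helper, and the two-date threshold are illustrative, not BCBSNC’s actual method. (ICD-9 codes beginning with 250 denote diabetes mellitus.)

```python
from datetime import date

# Hypothetical claim records: (member_id, service_date, diagnosis_code).
claims = [
    ("A001", date(2013, 1, 5), "250.00"),  # diabetes code
    ("A001", date(2013, 3, 9), "250.02"),  # second diabetes claim, later date
    ("B002", date(2013, 2, 1), "250.00"),  # single claim: may be a rule-out
    ("B002", date(2013, 2, 1), "401.9"),   # hypertension
]

def likely_diabetic(member_id, claims, min_distinct_dates=2):
    """Treat a member as diabetic only if diabetes codes (ICD-9 250.x)
    appear on at least `min_distinct_dates` separate service dates."""
    dates = {svc_date
             for mid, svc_date, dx in claims
             if mid == member_id and dx.startswith("250")}
    return len(dates) >= min_distinct_dates

print(likely_diabetic("A001", claims))  # True: codes on two distinct dates
print(likely_diabetic("B002", claims))  # False: one date, could be a rule-out
```

The threshold trades sensitivity for specificity: a stricter rule misses some true cases but filters out most one-off coding artifacts.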

 “From a claims perspective, there are a lot of ways we’re trying to ensure the accuracy of the data, and I would expect the same would need to happen on the side of electronic medical records,” says Wansink. “These are human beings entering information into these systems, and human beings are prone to error. You need to put processes in place to clean up that data entry.”

Another difficulty is combining electronic medical records with claims data. Wansink notes that many companies are striving to do this but finding it challenging because much of the information in electronic medical records is unstructured.

 “You might want to know whether there’s evidence of diabetes in a record. You’d need to mine the text for relevant words indicating the person has diabetes or find results of an HbA1c test,” Wansink says. “Building those rules on unstructured data is complex and resource intensive, and while you can learn some good rules about how to do it for one disease, they don’t always apply to another disease.”
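The kind of rule Wansink describes can be sketched as keyword matching plus pattern extraction. The note texts, term list, and regular expressions below are hypothetical; the 6.5% cutoff is the standard ADA diagnostic threshold for HbA1c:

```python
import re

# Hypothetical free-text clinical notes.
notes = [
    "Pt with type 2 diabetes mellitus, on metformin. HbA1c 8.2%.",
    "Annual physical. No chronic conditions noted. A1c 5.4%.",
    "Follow-up for hypertension; BP 138/86.",
]

# Simple keyword rule for diabetes mentions (illustrative term list).
DIABETES_TERMS = re.compile(r"\b(diabetes|diabetic|t2dm)\b", re.IGNORECASE)
# Capture an HbA1c result such as "HbA1c 8.2%" or "A1c 5.4".
HBA1C = re.compile(r"\b(?:hba1c|a1c)\s*:?\s*(\d+(?:\.\d+)?)\s*%?", re.IGNORECASE)

def evidence_of_diabetes(note, a1c_threshold=6.5):
    """Flag a note if it mentions diabetes or reports an HbA1c
    at or above the diagnostic threshold (6.5% per ADA criteria)."""
    if DIABETES_TERMS.search(note):
        return True
    m = HBA1C.search(note)
    return m is not None and float(m.group(1)) >= a1c_threshold

print([evidence_of_diabetes(n) for n in notes])  # [True, False, False]
```

Even this toy version shows why the rules don’t transfer between diseases: the vocabulary, the relevant lab test, and its threshold all have to be rebuilt for each condition.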

In fact, Wansink calls combining electronic medical records with claims data in a meaningful and coherent fashion the “holy grail” of health information exchange.

Regenstrief’s Murray says that assessment protocols must be in place to test the reliability of data being exchanged and that testing must take place at certain intervals.

“Most health systems have these routines in place, but if they don’t, they are sorely needed, because anything can happen with the reliability and accuracy of data,” he says.

Murray points out that hardware and software have advanced to the point where processing speed is no longer an issue, but healthcare continues to lag other industries in finding answers to thorny problems using big data. “In finance, we can analyze enormous amounts of information to understand what’s happening in markets and countries, etc.,” he says. “We need to get better at this in healthcare.”