Addition by Extraction

October 2013

By Elizabeth S. Roop
For The Record
Vol. 25 No. 14 P. 18

Health care organizations draw from rich EHR data to boost treatment and prevention plans.

With terabytes of information flowing through hospital systems, it makes sense that innovators are seeking ways to mine those data to unlock the secrets to healthier populations. Whether it be through treatments tailored to a patient’s genome, real-time surveillance, or population-based health and wellness programs, industry leaders are devising ways to leverage their hard-earned EHR data.

What’s even more impressive is that many of these care improvement projects extend well beyond the facility housing the EHR data.

“I think what is happening outside the four walls [of a hospital] is important—how to get data from disparate systems, put it in place, and make sure all parties involved can get access,” says Todd Stockard, president of Valence Health, which works with provider-centric organizations to develop population health and financial risk management solutions. “When you’re looking at population health, what’s more important is what happens outside the four walls to keep people from getting [admitted] inside the four walls.”

A Shift From Capture to Use
Achieving the better population health management required to succeed in an accountable care environment is just one of several trends driving the increased focus on data mining, according to Ben Loop, senior director of analytics and business intelligence for Siemens Healthcare.

Also at play is the need to comply with increasingly complex reporting requirements under various federal programs, including value-based purchasing, meaningful use, and readmission reduction efforts.

Data mining also helps health care organizations prepare for looming reimbursement cuts, which will require a detailed understanding of the true marginal cost per unit of service. This brings with it a renewed focus on cost accounting and data warehousing that harkens back to the late 1980s. “That has been reinvigorated because, if Medicare reimbursements are dropping, you need to understand what the cost is to care for a population, deliver a service, and service lines,” Loop says. “How can you reallocate activities to lower-cost resources while still delivering optimal outcomes? We’re seeing a shift in emphasis from core systems capturing information to the reuse of that information and data, whether through business intelligence or Big Data and care coordination. There are a lot of different focus areas or domains, but all are focused on structure and reuse of data.”

In terms of how all this translates to significant care improvement, Loop says there are three areas of importance:

• Application of data to real-time decision support: Real-time analytics is replacing retrospective analysis to support diagnosis and treatment decisions at the point of care.

• Real-time analytics: Identifying ways not only to exchange patient information between providers and care settings but also to apply it to specific patients and collaborative care.

• Information exchange: Bringing data together across multiple institutions to examine what is effective and what process changes are required to drive improvement.

“The categories aren’t crisply articulated yet, and the boundaries aren’t clear. It’s the Wild West right now, but most [organizations] are at least having the discussion about what or what not to do or getting their boards aligned,” Loop says.

As the potential for data mining comes into sharper focus, health care organizations must continue to struggle with the common foes of data integrity and standards. Loop notes that the move from data capture in core EHR technologies to intelligence and guidance systems, care coordination, and patient engagement requires data compilation from disparate systems with varying architecture and standards.

And while standards are evolving, the issues of stewardship and privacy are taking on greater importance “simply by virtue of the fact that information is more portable and providers more connected,” Loop says. “It’s also easier to reverse engineer deidentified data, so there are enhanced concerns about securing information so it’s not misused by other entities that are not directly involved in the care process. I don’t think there is any brief answer on how these issues can best be addressed. It’s complicated.”

Impact on Population Health
The complications perhaps are felt most sharply at the population health level. Indeed, Stockard notes that mining EHR data for population health purposes faces several challenges. First is that the data must be pulled together from disparate sources, which leads to consistency issues. It’s not uncommon to have to find a way to normalize and standardize as much as 90% of the data being integrated because of inconsistencies across platforms.

“The other thing we’ve found in dealing with clinical data and creating enterprisewide data warehouses is that while EMRs give a lot of information to help understand outcomes, the results of what you are trying to accomplish—predicting where your high-risk patients are—90% of what we need to do with population health resides in billing and administrative data,” Stockard says. “EMRs don’t even house CPT codes, so you really need both sources to have the full view of what happened, why, and what were the outcomes.”

In fact, he says defining what is sustainable on the clinical side enables the development of evidence-based guidelines based on claims data. For example, based on claims, it can be inferred that people with diabetes should have two office visits and referrals to an ophthalmologist and podiatrist each year. EMRs wind up playing a supplemental role in monitoring how well the guidelines are followed and their effectiveness at controlling the condition.
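The monitoring role described above can be pictured as a simple gap check. The sketch below is illustrative only: the service labels, the `DIABETES_GUIDELINE` table, and the `care_gaps` function are hypothetical names built from the diabetes example in the text, not anything Valence Health is reported to use, and real claims would carry billing codes rather than plain labels.

```python
from collections import Counter

# Hypothetical yearly targets taken from the diabetes example in the text:
# two office visits plus one ophthalmology and one podiatry referral.
DIABETES_GUIDELINE = {
    "office_visit": 2,
    "ophthalmology_referral": 1,
    "podiatry_referral": 1,
}


def care_gaps(claim_events, guideline=DIABETES_GUIDELINE):
    """Return how many of each guideline service a patient still needs.

    claim_events: iterable of service labels found in one patient's
    claims for the year (illustrative labels, not real billing codes).
    """
    seen = Counter(claim_events)  # missing services count as zero
    return {
        service: required - seen[service]
        for service, required in guideline.items()
        if seen[service] < required
    }


if __name__ == "__main__":
    # A patient with one visit and a podiatry referral so far this year.
    print(care_gaps(["office_visit", "podiatry_referral"]))
```

An empty result means the year's claims already satisfy the guideline. Any remaining entries are the services a care team might follow up on.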

The University of Michigan Health System (UMHS) also is focused on applying data mined from its EHR to broader population health, primarily to address ambulatory chronic disease management in an accountable care–type setting. The organization uses a model developed during its participation in the Centers for Medicare & Medicaid Services (CMS) Medicare Physician Group Practice Demonstration project. UMHS was one of just two participants to measurably improve outcomes, resulting in a statistically significant reduction in Medicare readmissions and saving the CMS $47 million over a five-year period.

“This was based significantly on our ability to mine the data from outside payers, claims data, and UMHS’ EMR data in order to have what we called actionable reports,” says Chief Medical Information Officer Andrew Rosenberg, MD. “One of the key elements of those reports … was the ability to construct highly valid registries to accurately identify patients with conditions” such as diabetes, COPD, and high blood pressure.

Rosenberg notes that success requires mining and combining data from multiple sources, including laboratory, EMR, diagnostics, ePrescribing, and claims systems, to eliminate false positives from the patient population. For example, to be considered a true diabetic, a patient must have lab values consistent with hemoglobin A1c levels for diabetes, have been prescribed a medication to lower blood glucose levels, and have clear clinician documentation of a diabetes diagnosis. In other words, it isn’t enough to identify patients where diabetes is suspected but not officially diagnosed. “With that kind of precise data, you can then create through mining the EMR an actionable report with interventions that are specific to those patients where they will have better utilization and the outcomes are more assured,” Rosenberg says.
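As a rough illustration of how such a registry filter might look, the sketch below combines the three kinds of evidence Rosenberg describes. It is not UMHS’ implementation. The `PatientRecord` fields and function names are hypothetical, the 6.5% A1c cutoff is an assumed threshold that the article does not state, and the sketch assumes all three criteria must hold, the reading that best fits the goal of excluding suspected but undiagnosed cases.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: 6.5% is used here as the A1c cutoff; the article gives no number.
A1C_THRESHOLD_PERCENT = 6.5


@dataclass
class PatientRecord:
    """Hypothetical merged view of one patient across source systems."""
    patient_id: str
    latest_a1c_percent: Optional[float]  # from the laboratory system
    on_glucose_lowering_med: bool        # from ePrescribing data
    documented_diagnosis: bool           # from clinician notes in the EMR


def qualifies_for_registry(p: PatientRecord) -> bool:
    """True only when lab, medication, and documentation evidence all agree."""
    lab_ok = (
        p.latest_a1c_percent is not None
        and p.latest_a1c_percent >= A1C_THRESHOLD_PERCENT
    )
    return lab_ok and p.on_glucose_lowering_med and p.documented_diagnosis


def build_registry(records: List[PatientRecord]) -> List[str]:
    """Return the IDs of patients who meet every criterion."""
    return [p.patient_id for p in records if qualifies_for_registry(p)]


if __name__ == "__main__":
    sample = [
        PatientRecord("a", 7.2, True, True),   # meets all three criteria
        PatientRecord("b", 7.0, False, True),  # no medication on record
        PatientRecord("c", None, True, True),  # no lab value available
    ]
    print(build_registry(sample))
```

Requiring agreement across sources trades some sensitivity for specificity, which matches the stated aim of keeping false positives out of the registry.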

The challenge is ensuring clinicians actually can use the generated report, particularly when claims and other nonclinical data are used to construct it. This involves standards, data merging, and the construction of new reports that flow into and out of the EMR.

“The technical and workflow challenges of linking disparate data to an individual in a format that a clinician or provider can actually really use when the patient shows up at the clinic and that is valid, updated, and reproducible is an ongoing technical challenge,” Rosenberg says. “The [idea] is to really increase the specificity or accuracy of exactly what you’re measuring because then it’s much more meaningful to providers to show true deviations to standards and quality of care. That is probably much more likely to improve care vs. broad interventions across heterogeneous populations. That’s the key to how we’re using EMR: to mine data in a highly specific and sensitive manner to yield more valid and actionable information.”

Predicting Cardiac Arrest
Data mining isn’t used only to drive big-picture care improvements through broad population health applications. The process also has a place inside the hospital focused on specific events. For example, the Parkland Center for Clinical Innovation (PCCI) has developed a real-time electronic predictive model designed to identify patients at high risk of cardiac arrest or death. It uses data within the EMR to provide active surveillance of all hospitalized patients in real time.

To predict a patient’s risk of cardiac arrest or death, the model utilizes 14 variables, including physiologic, laboratory, modified early warning score, high-risk floor assignment, and provider order data. It can detect with high accuracy the likelihood of a patient experiencing severe clinical deterioration an average of 16 hours prior to an event, which is six hours sooner than the average of the institutional rapid response team. “There are always going to be patients in the hospital who decompensate, and hospitals always want more time to improve patient outcomes,” says Holt Oliver, MD, PhD, vice president of clinical informatics at PCCI.

The automated electronic model, which was featured in the February issue of BMC Medical Informatics & Decision Making, was the result of a multidisciplinary effort involving IT, statisticians, and clinical researchers. The project started with a historic database of events, which was analyzed to cull out a list of what the team anticipated would be predictors of a future cardiac event. The next step is to test the model at Parkland Health & Hospital System.

Data validation poses a significant challenge to creating prototypes such as the cardiac intervention model. To defeat this hurdle, Oliver says PCCI conducts reviews “to ensure the elements we want are what we get at the back end. It requires knowing what is available and getting everyone in the same room to talk through issues.”

The implementation side, including governance of clinical decision support, faces a similar challenge, one that requires both simulation and testing to overcome. Ultimately, it’s worth the effort. “Physicians are trying to use every tool they can to improve performance,” Oliver says. “Based on the success we’ve had in reducing readmissions, leadership is now interested in seeing how it performs on more acute clinical events. They want to see how this extra lead time can lead to improved monitoring and patient outcomes.

“I anticipate there will be a growing interest in both the outpatient and inpatient worlds, and there will be more use cases for this general approach,” he adds.

Genome-Based Treatment
Another example of data mining’s vast potential to improve outcomes, and to change the way medicine is practiced, is the work under way by researchers in the Electronic Medical Records and Genomics (eMERGE) Network, a consortium of US medical research institutions.

Under a $25 million grant from the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health, the eMERGE Network seeks to demonstrate that genomic information linked to disease characteristics and symptoms in EMRs can be used to improve care. The first phase of research, which ran from September 2007 to July 2011, demonstrated that disease characteristics and genetic information found in EMRs can be useful in large genetic studies. Thus far, the eMERGE Network has identified genetic variants associated with dementia, cataracts, HDL cholesterol, peripheral artery disease, white blood cell count, type 2 diabetes, and cardiac conduction defects.

In the second phase, which is slated to be completed in July 2015, researchers are using genomewide association studies across the entire eMERGE Network to identify genetic variants associated with 40 more disease characteristics and symptoms. Information obtained from the DNA analysis of about 55,800 participants in each study will be used in clinical care. For example, with appropriate patient consent, researchers may use information about genetic variants involved in drug response to adjust patient medications. Or, for patients with genetic variants associated with diseases such as diabetes or cardiovascular disease, researchers will intervene to prevent, diagnose, and/or treat the condition.

Researchers also are identifying best practices for integrating privacy protections in their research and clinical care, including development of a publicly available model for consent language.

Rongling Li, MD, PhD, MPH, an epidemiologist in the division of genomic medicine at the NHGRI and the eMERGE Network director, notes that two workgroups have addressed privacy and consent issues during each phase of the project. Research also is being conducted on the reidentification of deidentified data and the development of a Web tool for risk analysis.

Much like other groups in the business of mining EHR data, the eMERGE Network has had to find ways to address the data integrity issues that can arise when information is compiled from multiple, disparate systems. “The diversity of electronic medical records systems in the eMERGE network results in significant challenges for accurately identifying clinical phenotypes,” Li says.

To resolve the challenge, the eMERGE Network established an informatics workgroup in phase 1 and a phenotyping workgroup in phase 2. The workgroups consist of medical informatics experts and representatives from each study site. Over the past six years, they have developed PheKB, a phenotyping tool, and eleMAP, a data harmonization tool, each of which is available free to the public.

Data and Best Practices
Ultimately, the EMR data mining being conducted at PCCI, UMHS, the eMERGE Network, and other initiatives will not only lower health care costs through improved population health but also resolve issues concerning standards, privacy, and consent that hinder broader use of patient data.

It’s a bit trial and error but, as Oliver says, “The most useful things will always float to the top over time.”

— Elizabeth S. Roop is a Tampa, Florida-based freelance writer specializing in health care and HIT.