March 28, 2011

Extraction Experts
By Selena Chavis
For The Record
Vol. 23 No. 6 P. 14

Healthcare organizations that probe below the surface can find a bounty of riches contained within their secondary EHR data.

The secondary use of patient data as a strategy to enhance the delivery of care is not a new concept. Many healthcare organizations have been using unidentifiable healthcare data to boost quality and patient care initiatives for years.

But with the uptick in EHR use across the nation, combined with expectations for the rapid evolution of HIT in the meaningful use era, experts suggest that the benefits of data mining and the secondary use of data will become more accessible as data are aggregated electronically. Many experts also believe this movement will broaden the potential scope of how the data can be used.

“The ability to mine data becomes much easier when the data is electronic,” explains Deven McGraw, director of the Health Privacy Project with the Center for Democracy and Technology. “Researchers are enthusiastic about getting clinical data in electronic form. The richness of the data is perceived to become greater.”

According to a white paper from the American Medical Informatics Association, the secondary use of health data refers to the nondirect use of personal health information (PHI), including but not limited to analysis, research, quality and safety measurement, public health, payment, provider certification or accreditation, and marketing and other business activities.

“Secondary use of health data can enhance healthcare experiences for individuals, expand knowledge about disease and appropriate treatments, strengthen understanding about the effectiveness and efficiency of our healthcare systems, support public health and security goals, and aid businesses in meeting the needs of their customers,” says David Dorr, MD, MS, an associate professor in the department of medical informatics and clinical epidemiology at Oregon Health & Science University (OHSU) School of Medicine.

For example, consider the following scenarios:

• OHSU has developed an Integrated Care Coordination Information System (ICCIS) in which data from six different primary care clinics, covering four different EHRs, are compared. Dorr says the system was designed to better understand how population management systems can improve care for at-risk individuals, especially those with multiple chronic illnesses.

“Organizations can look at aggregated data and drill down to their own,” Dorr says, noting that peer comparisons often produce a higher degree of quality. “They can actually figure out where improvements are possible.”

• John Tempesco, chief marketing officer with the Informatics Corporation of America, notes that Vanderbilt University Medical Center (VUMC) was able to increase compliance among patients with diabetes from 45% to more than 90% in eight months through the secondary use of health data. The organization developed a dashboard that identifies patient compliance efforts and monitors VUMC’s effectiveness at following up on patient behavior.

“It would show information such as a patient’s need to have a foot exam,” says Tempesco, noting how VUMC extracted important data out of its EHR and made them usable. “The question for organizations is ‘How do I make actionable information out of this plethora of data that can bring change to a patient?’”

• Kaiser Permanente (KP) started using data-mining strategies well before the advent of the EHR, according to Amy Compton Phillips, MD, associate executive director for quality for The Permanente Federation, who adds that electronic data have allowed the organization to implement strategies for improving quality that are much more complex.

For instance, she points out that when a new medical device is introduced to the market, KP can identify its effectiveness on different patient populations and diagnoses. “The data is effective for decision support,” she says. “It’s been revolutionary to our ability to more rapidly improve the care we provide to patients.”

While the benefits and potential of the secondary use of healthcare data are obvious to most healthcare professionals, the use of this information raises several ethical considerations, with privacy at the top of the list.

“There is always a struggle between privacy of information and the utility of information,” says Tempesco. “As unidentifiable as they make it, it’s still your data. When does patient privacy outweigh the overall healthcare of a community?”

Balancing the Scale
Much of the controversy surrounding the secondary use of EHR data revolves around how unidentifiable data are defined and who is accountable for their use. As defined by HIPAA, unidentifiable data are completely stripped of details that could point to an individual, including identifiers such as name, age, and birth date, as well as health information that could reasonably be expected to allow individual identification.

As long as healthcare organizations use data in a way that prevents reidentification, there is no legal requirement to obtain a patient’s permission. However, some watchdog groups believe accountability in this area is lacking at the federal level.

“There’s a lot of question about whether the standards for identifiable data are robust enough for appropriate protection,” McGraw explains, adding that while most organizations will say their data cannot be reidentified, there is not a strong enough legal foundation of consequences and penalties to enforce appropriate compliance. “The fact is that there is a market for identifiable data.”

McGraw points to numerous computer science studies that have been able to tie “unidentifiable data” back to individuals, exposing vulnerabilities in the system. A study conducted in the late 1990s by Latanya Sweeney, a computer science professor at Carnegie Mellon University, was able to pin “unidentified” health information back to William Weld, the governor of Massachusetts at that time.

While only gender, zip code, and birth date were provided, Sweeney found that this combination is enough to uniquely identify about 87% of the U.S. population. Coupling those three fields with another public data source, such as a voter registration database, readily tied the information back to the individual.
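The linkage described above can be illustrated with a minimal Python sketch. It is not Sweeney's actual method or data: every record, name, and field below is invented for illustration, and the matching rule (accept a match only when exactly one voter shares the three fields) is an assumption.

```python
# Fields left in the "unidentifiable" records that also appear in public records.
QUASI_IDENTIFIERS = ("gender", "zip", "birth_date")

# Hypothetical health records with names removed (all values are made up).
health_records = [
    {"gender": "M", "zip": "00001", "birth_date": "1950-01-01", "diagnosis": "condition A"},
    {"gender": "F", "zip": "00002", "birth_date": "1960-02-02", "diagnosis": "condition B"},
]

# Hypothetical public voter roll that carries names (all values are made up).
voter_roll = [
    {"name": "Voter One", "gender": "M", "zip": "00001", "birth_date": "1950-01-01"},
    {"name": "Voter Two", "gender": "F", "zip": "00002", "birth_date": "1960-02-02"},
    {"name": "Voter Three", "gender": "F", "zip": "00002", "birth_date": "1960-02-02"},
]


def key(record):
    """Build the (gender, zip, birth_date) tuple used to join the two sources."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(health_records, voter_roll):
    """Return (name, diagnosis) pairs where exactly one voter matches a record."""
    voters_by_key = {}
    for voter in voter_roll:
        voters_by_key.setdefault(key(voter), []).append(voter["name"])

    matches = []
    for record in health_records:
        names = voters_by_key.get(key(record), [])
        if len(names) == 1:  # a unique match ties the record to one person
            matches.append((names[0], record["diagnosis"]))
    return matches


print(reidentify(health_records, voter_roll))
```

In this toy data the first record is tied to a single voter, while the second stays ambiguous because two voters share the same gender, zip code, and birth date. The more often such a combination is unique in the population, the more records a join like this can reidentify.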

While not related to PHI, a similar study was conducted recently by two University of Texas at Austin scientists in relation to an “anonymous” contest conducted by Netflix. When it was determined that the anonymous information could, in fact, be reidentified, the company discontinued the contest after reaching an agreement with Federal Trade Commission investigators.

Dorr says the healthcare industry needs to protect consumer privacy rights while also acknowledging that the quality improvement activity produced by the secondary use of EHR data is a crucial component to improving care.

“The general tenor from the lay press is that unauthorized release of this information is a big risk. We need to be focused on policies and procedures that will safeguard data,” he says, pointing out that some uses going forward may be more controversial than others, including genomics, where genetic profiling takes place. “When we start to go out further, there are some things society should be concerned about.”

Key questions to ponder, according to McGraw, include: What sources of secondary uses should we be promoting? Who determines accessibility? To what extent can the public be involved in how those decisions are made?

“Secondary data use tends to get a black eye, but it’s not hard to think of some of the benefits and opportunities that are potentially out there,” McGraw notes, adding that a better system will help ensure the opportunities are realized. “There’s a growing recognition that we need some better rules. … Lots of advocacy groups are calling for a better way forward, but no one has stepped up to the plate yet.”

Data Reliability
For organizations moving ahead with data-mining initiatives to improve quality, one key area of concern is data reliability.

According to Dorr, the data from each clinic within OHSU have to be validated to ensure accuracy. He acknowledges that standard vocabularies have not been used consistently across the organization.

“However, use of standards is increasing,” he points out. “At the start, none of the clinics had medications in the RxNorm standard, and by the end, three of six will, mainly due to meaningful use requirements.”

Noting that a lot of data stored in an EHR are still found in text, he suggests there will need to be checks and balances in place to locate the source of the data and determine how they were obtained.

“It requires people to look back in a chart on at least a subset [of data components] to validate,” he says. “Until they [healthcare organizations] have gone through validation that looks at workflow … there’s a lag as to the benefit of the information.”

McGraw notes that data integrity often centers on the specificity or high-level nature of the question posed. In other words, sometimes the information required has to be very precise.

“They [healthcare organizations] have to have internal processes and feedback loops that go directly back to the patient to ensure the data is correct,” she says.

In fact, many healthcare professionals suggest that giving some of the responsibility back to the patient could go a long way toward minimizing errors. “Having patients play a role in correcting data could be monumental,” McGraw says.

Citing cancer patient Dave deBronkart, a well-known blogger and advocate for participatory medicine, as an example, McGraw suggests that the more patients are involved in their care, the better the process will be going forward. Specifically, deBronkart made the choice to transfer his personal health data into the Google Health PHR system from Beth Israel Deaconess Medical Center, the facility where he was being treated.

He found that the data contained many errors, including a false medication warning, diagnoses that were exaggerated, and erroneous conditions. Alongside errors, important information was missing, including lab results, radiology reports, and a list of allergies.

Technology can also play a crucial role in helping to flag potential errors, Compton Phillips notes, adding that KP’s EHR provides decision-support tools that help identify data that appear out of the normal range for a particular identifier.
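As a rough illustration of that kind of check, the Python sketch below flags values that fall outside a plausible range for a given measure. It does not describe KP's system: the measure names, the limits, and the sample observations are all hypothetical, and a real tool would rely on clinically validated ranges.

```python
# Hypothetical plausibility limits per measure; not clinical reference values.
PLAUSIBLE_RANGES = {
    "systolic_bp_mmhg": (50, 300),
    "heart_rate_bpm": (20, 250),
    "body_temp_c": (30.0, 45.0),
}


def flag_out_of_range(observations, ranges=PLAUSIBLE_RANGES):
    """Return the observations whose value falls outside the configured range."""
    flagged = []
    for obs in observations:
        limits = ranges.get(obs["measure"])
        if limits is None:
            continue  # no rule configured for this measure, so it is not checked
        low, high = limits
        if not (low <= obs["value"] <= high):
            flagged.append(obs)
    return flagged


# Made-up entries: one normal value and two likely data entry errors.
observations = [
    {"patient": "p1", "measure": "heart_rate_bpm", "value": 72},
    {"patient": "p2", "measure": "body_temp_c", "value": 98.6},       # Fahrenheit entered as Celsius?
    {"patient": "p3", "measure": "systolic_bp_mmhg", "value": 1200},  # extra digit typed?
]

for obs in flag_out_of_range(observations):
    print(obs["patient"], obs["measure"], obs["value"])
```

A flag like this only points a reviewer at a suspicious entry. Someone still has to go back to the chart, or to the patient, to find out what the correct value is.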

The Question of Resources
Of course, the ability to pull off a successful quality initiative using data-mining techniques requires resources. McGraw says the organizations with superior resources historically do a better job.

According to Dorr, even though OHSU has a full clinical informatics department with more than a dozen full-time employees dedicated to data validation, the process is still “not as good as it needs to be.”

“It takes a lot of resources to do it right,” he acknowledges, adding that the secondary use of EHR data often demands more data entry to be able to pull information from a more thorough record. “Clinician data entry is crucial. There’s often a tension that develops between clinicians [who don’t have time for extra data entry] and research groups who want better data.”

Tempesco notes that having an informatics-trained individual who also has acquired a solid clinical foundation is crucial to making certain the right information is captured. Determining what questions need to be asked in order to receive answers that will support a data-mining initiative will increase an organization’s success rate in making meaningful change that affects delivery of care.

The right technological structure has to be in place to afford an organization the ability to capture the right information. With so much information contained in an EHR, Tempesco points out that it’s not always easy to aggregate data that will make a difference.

“The important thing is to be able to mine data in multiple formats and get the right results,” he points out.

And there’s nothing like strong leadership from the top to support a system-level view of the process, according to Compton Phillips, who adds that capturing such a perspective is challenging in today’s fee-for-service environment.

“What sometimes gets lost in the conversation is that this process will really benefit patients,” she says. “It’s a set of unbiased knowledge that really helps generate data that will make personalized medicine more achievable.”

— Selena Chavis is a Florida-based freelance journalist whose writing appears regularly in various trade and consumer publications.