Dependable Data Key to Better Care
By William Hogan, MS, MD
For The Record
Vol. 26 No. 11 P. 10
Every day, the health care system feeds its computers volumes of data. It's a widely shared vision that the industry can learn from this information to improve care, lower costs, and increase patient satisfaction. As the data multiply in size and variety, terms such as Big Data come into vogue and the enthusiasm for tapping into this knowledge base grows.
However, this excitement is well placed only if it's accompanied by an equal desire for great data. We can only arrive at truths about what improves care, lowers costs, and benefits patients if the data are true and relevant to the most pressing issues.
The Power of Standards
Besides being accurate and complete, great data are also standardized data. If two data sets from two different institutions are accurate and complete but nevertheless require several full-time employees working six to 12 months to integrate them because they are nonstandard, they are of limited use in leveraging the power of Big Data. If the industry is to move forward in achieving the vision of an intelligent health care system informed by a wealth of data, it's essential that great data can be combined easily with other great data to advance learning.
Unfortunately, health care data contain—in part and to varying degrees—glaring omissions, half-truths, untruths, and outright lies. Much has been written in the last 20 to 30 years about this unfortunate state of affairs and its underlying factors. For example, researchers have found that diagnosis codes in administrative data are unreliable tools when determining whether a patient has a particular disease.
Instead, researchers can gain accurate assessments of disease status only through painstakingly validated algorithms that use surrogate data such as laboratory tests, prescriptions, and clinical notes. According to National Institutes of Health (NIH) data, one institution found that it could identify only 75% of its diabetic patients using diagnostic codes while another organization discovered its data were missing stroke risk factors at a rate of 25%.
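As a rough illustration of what such an algorithm might look like, the Python sketch below flags a patient as likely diabetic when at least two of three evidence sources agree: a diagnosis code, a laboratory result, and a prescription. The field names, the two-of-three rule, and the medication list are assumptions made for this example. It is not one of the validated algorithms referred to above, and a real one would need to be checked against chart review. The hemoglobin A1c cutoff of 6.5% is the commonly used diagnostic threshold.

```python
from dataclasses import dataclass, field

# Illustrative constants (assumptions for this sketch, not a validated rule set).
DIABETES_CODE_PREFIXES = ("250",)   # ICD-9-CM category for diabetes mellitus
DIABETES_DRUGS = {"metformin", "insulin", "glipizide"}  # assumed example list
HBA1C_THRESHOLD = 6.5               # percent


@dataclass
class Patient:
    """Hypothetical, simplified patient record."""
    diagnosis_codes: list = field(default_factory=list)
    hba1c_results: list = field(default_factory=list)  # percent values
    medications: list = field(default_factory=list)


def evidence(patient):
    """Report which of the three evidence sources point to diabetes."""
    return {
        "code": any(c.startswith(DIABETES_CODE_PREFIXES)
                    for c in patient.diagnosis_codes),
        "lab": any(v >= HBA1C_THRESHOLD for v in patient.hba1c_results),
        "drug": any(m.lower() in DIABETES_DRUGS for m in patient.medications),
    }


def likely_diabetic(patient, min_sources=2):
    """Flag the patient when at least min_sources kinds of evidence agree."""
    return sum(evidence(patient).values()) >= min_sources


# A patient with no diabetes code but a high lab value and a matching drug.
uncoded = Patient(hba1c_results=[7.9], medications=["Metformin"])
print(likely_diabetic(uncoded))  # True

# A patient with only a diagnosis code and no supporting evidence.
code_only = Patient(diagnosis_codes=["250.00"])
print(likely_diabetic(code_only))  # False
```

The first patient would be missed by a search on diagnosis codes alone, which is the kind of gap the 75% figure describes. The second shows the reverse problem: a code with nothing else to corroborate it.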
Another contributing factor to false and inaccurate data is the tendency of many patients and providers to withhold or alter data out of fear of losing insurance coverage or confidentiality.
Lack of Specificity
As the industry welcomes more patient data from which to learn, it must take steps to ensure the information is not just bigger but also better. To do so will require a realignment of incentives, policy changes, and greater investments in basic and applied informatics research.
Fee-for-service payment models incentivize rich data about the type of care delivered rather than how or why it's being administered. While documenting medical necessity is a must, the requirement is both a blessing and a curse. It accumulates data on patient conditions that required intervention, but it can lead health care organizations to distort coding to increase reimbursement and to lump patient conditions into diagnosis categories.
Coding diagnosis categories instead of actual diagnoses results in a massive loss of information about patient conditions. For example, ICD-9-CM code 729.1 indicates the patient has fibromyalgia or myositis or myofasciitis or sore muscles, or any of about 10 other conditions. This wide range of possibilities hinders attempts to learn which treatments work best for fibromyalgia. To start, how can patients with fibromyalgia be identified when the code may also be attached to weekend warriors who overexerted themselves at a family picnic?
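The information loss can be made concrete with a small Python sketch. The four patient records are invented for illustration, and the mapping contains only the conditions named above, not the full contents of the category.

```python
# Conditions named in the text as sharing ICD-9-CM code 729.1 (not exhaustive).
CODE_FOR_CONDITION = {
    "fibromyalgia": "729.1",
    "myositis": "729.1",
    "myofasciitis": "729.1",
    "sore muscles": "729.1",
}

# Invented example patients: (patient id, what the clinician actually found).
actual = [
    ("p1", "fibromyalgia"),
    ("p2", "sore muscles"),
    ("p3", "myositis"),
    ("p4", "fibromyalgia"),
]

# What reaches the administrative data set: only the category code survives.
coded = [(pid, CODE_FOR_CONDITION[dx]) for pid, dx in actual]


def cohort_by_code(records, code):
    """Select patient ids by code, the only handle the coded data offer."""
    return [pid for pid, c in records if c == code]


cohort = cohort_by_code(coded, "729.1")
true_cases = {pid for pid, dx in actual if dx == "fibromyalgia"}

# Share of the code-based cohort that really has fibromyalgia.
precision = len(true_cases & set(cohort)) / len(cohort)

print(cohort)     # ['p1', 'p2', 'p3', 'p4']
print(precision)  # 0.5
```

Once the records are coded, nothing in `coded` separates p1 from p2. The distinction exists only in the original clinical finding, which the category code discards.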
Although the switch to ICD-10-CM adds greater specificity to the categories, resulting in richer data, coding diagnosis categories still will not fully capture patient conditions. For example, ICD-10-CM has one code [E53.8] that covers patients who have a deficiency of vitamin B12, folate, biotin, or pantothenic acid. Under these circumstances, how can academic health care systems hoping to better understand biotin deficiency be expected to properly identify patients with the disorder?
The Impetus Behind Better Data
Despite these holes in the industry's coding systems, several trends, including translational science, patient engagement, and the learning health care system movement, are increasing the incentives for better data. Translational science seeks to move research findings into practice in close to real time as opposed to the oft-cited, decades-long time period presently required. To work well, translation requires better data from patient care venues.
Patient engagement, which emphasizes patient-centered outcomes, requires that health care consumers be allowed to create and correct their own data. Since patients have the largest stake in their data being accurate, they figure to be willing partners in the endeavor. In fact, there are anecdotes of patients rejoicing at merely being able to correct their name and address (which, by the way, they have been able to do for their credit card and bank accounts for more than a decade). Now imagine their joy if they can correct medical record errors regarding their symptoms or past medical history.
Meanwhile, the learning health care system movement seeks to learn pervasively and constantly from past experience to inform future action. Doing so requires "instrumenting" the entire health care enterprise to generate trustworthy data from which to learn.
For these trends to be successful, health care policy changes are necessary to fully realign incentives for better data. To date, the Affordable Care Act merely expands fee-for-service coverage to the uninsured; it does not significantly alter the payment model, which must change to reward quality of care over volume of care. For the health care system to report on quality, it needs vastly more and better data about patient conditions and treatment outcomes. Thus, health care reform is essential to moving the industry toward an environment constructed to produce richer data.
Lastly, achievement of this vision will require greater investments in basic and applied informatics research to discover the best methods to create and manage better data. For example, with respect to creating data, how can health care organizations produce better data at lower costs? According to the NIH, one report advocates for "self-documenting" encounters in which the technology witnesses the clinician-patient interaction and automatically creates all the necessary actionable data based on the exchange. Clearly, such futuristic visions are not possible with today's technology and know-how.
Short of this ideal, the industry must invest in informatics research to learn how best to standardize data, capture patient-reported outcomes reliably and accurately, and incorporate those data into clinical decision making. In addition, it must determine how the design of HIT systems, including EHRs, affects data capture and quality.
The goal of delivering high-quality care efficiently can be achieved by creating richer databases from which to learn. But it will take policy changes, incentive realignment, and basic and applied informatics research for this to become a reality. Ultimately, these investments will pay dividends in the form of patients living healthier, more productive lives.
— William Hogan, MS, MD, is director of biomedical informatics at the University of Florida's Clinical and Translational Science Institute and a professor in the department of health outcomes and policy in the College of Medicine.