E-News Exclusive

Realizing the Value of Unstructured Data: Five Ways to Leverage NLP to Boost Outcomes and the Bottom Line

By Brian Levy, MD

There is a lot of industry discussion underway about the promise of natural language processing (NLP). Increasingly recognized as a powerful tool for unlocking vital clinical data, NLP turns unstructured documentation into shareable data that can be analyzed and acted upon.

The reality is that a wealth of patient information that could significantly improve the completeness and accuracy of quality improvement initiatives currently resides in free text. However, much of this information is missing from present-day analytics strategies because infrastructures lack the capabilities needed to extract it. This missing link is problematic for quality improvement since, by some industry estimates, unstructured data account for as much as 80% of clinical documentation.

The industry has made notable inroads with structured data exchange through the introduction of standards such as HL7, SNOMED CT, LOINC, and RxNorm, as well as advances with ICD-10. The good news is that advances in NLP can now help round out these efforts by enabling the retrieval and sharing of critical unstructured patient information.

In truth, NLP has broad application. For many industry stakeholders, the question is, “How can I leverage NLP from a practical standpoint to impact outcomes and the bottom line?”

Much of reporting and analytics today centers around patient cohorts—essentially defined as groups of patients sharing specific characteristics. Diabetes, a focal point of industry quality initiatives, is a good example. The following patient characteristics may be used to define a diabetic patient cohort: elevated HbA1c lab value, use of medications such as metformin, and symptoms such as increased thirst and hunger.

Many reasons exist for defining patient cohorts, such as quality measures reporting, disease management and population health initiatives, and submitting patient information to disease registries. However, the success of any of these efforts rests with a health care organization’s ability to accurately and completely identify all patients with the predefined attributes within a cohort. Otherwise, strategies run the risk of falling short of their desired impact due to limited data.
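The diabetic cohort described above can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the `Patient` record, the 6.5% HbA1c cutoff, and the keyword pattern are all assumptions made for the example, and simple keyword matching stands in for a real NLP engine (it does not handle negation, misspellings, or abbreviations).

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set

# All thresholds and term lists below are illustrative assumptions.
HBA1C_THRESHOLD = 6.5  # percent; assumed cutoff for an "elevated" value
COHORT_MEDICATIONS = {"metformin"}
SYMPTOM_PATTERN = re.compile(r"\bincreased (thirst|hunger)\b", re.IGNORECASE)


@dataclass
class Patient:
    """Hypothetical patient record mixing structured fields and a free-text note."""
    patient_id: str
    hba1c: Optional[float] = None
    medications: List[str] = field(default_factory=list)
    note: str = ""


def criteria_met(patient: Patient, use_notes: bool = True) -> Set[str]:
    """Return the names of the cohort criteria this patient satisfies."""
    met = set()
    if patient.hba1c is not None and patient.hba1c >= HBA1C_THRESHOLD:
        met.add("elevated_hba1c")
    if any(m.lower() in COHORT_MEDICATIONS for m in patient.medications):
        met.add("cohort_medication")
    # Free-text symptoms are only visible when the notes are searched.
    if use_notes and SYMPTOM_PATTERN.search(patient.note):
        met.add("symptom_in_note")
    return met


def build_cohort(patients: List[Patient], use_notes: bool = True) -> List[str]:
    """Return IDs of patients meeting at least one criterion."""
    return [p.patient_id for p in patients if criteria_met(p, use_notes)]


patients = [
    Patient("A", hba1c=7.2),
    Patient("B", note="Reports increased thirst over the past month."),
    Patient("C", hba1c=5.4, note="Annual physical, no complaints."),
]
print(build_cohort(patients, use_notes=False))
print(build_cohort(patients, use_notes=True))
```

In this toy data set, patient B enters the cohort only when the free-text note is searched, which is the gap the article describes.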

Health care organizations can realize significant return on investment by applying NLP within patient cohort strategies across a variety of domains. The following are five practical tactics for leveraging NLP to improve both outcomes and the bottom line.

• Quality measures reporting. The accuracy of quality measures reporting plays a key role in reimbursement for providers and reputation for both providers and payers. Notably, one study found that EHR-derived quality measures—where only structured data are analyzed—can undercount practice performance when compared with a manual review of electronic charts.

One example of a missed reporting opportunity in the payer and provider segments is Physician Quality Reporting System (PQRS) measure 116 (NQF 58). Performance scores on this measure drop when patients receive antibiotics for acute bronchitis, since evidence suggests that this approach to treatment does not improve the condition and may cause harm. Yet the measure includes exclusion criteria for patients who have a secondary condition, such as cystic fibrosis or HIV. Often, documentation of these secondary diagnoses is found in free text as opposed to structured areas of the EHR. Without NLP-enabled data mining, providers have no way of identifying patients who fit these criteria short of manually combing charts. NLP, in this case, improves accuracy, helping a health care organization calculate higher scores and avoid negative payment adjustments.

A second example of a missed opportunity: Health insurers also report quality measures through the Healthcare Effectiveness Data and Information Set (HEDIS) program. The results impact the insurer’s public Centers for Medicare & Medicaid Services star ratings and, subsequently, its reimbursement. These quality measures also include inclusion and exclusion criteria for conditions such as acute bronchitis (discussed above for the PQRS measure). It is critical that payers find the optimum number of patients for reporting, which requires access to provider data from EHRs. This data pull must include information found in free text notes as well. NLP can reduce the amount of manual chart abstraction needed to identify appropriate patients as well as the exclusion criteria.
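The effect of free-text exclusions on a measure such as the acute bronchitis example can be shown with simple arithmetic. The sketch below is an assumption-laden illustration rather than the official measure logic: the encounter fields, the keyword pattern, and the sample data are hypothetical, and the keyword search is a stand-in for NLP that would also need to handle negation (for example, "no history of HIV").

```python
import re

# Hypothetical exclusion terms taken from the conditions named in the text.
EXCLUSION_PATTERN = re.compile(r"\b(cystic fibrosis|HIV)\b", re.IGNORECASE)


def is_excluded(encounter, use_notes):
    """An encounter is excluded if a qualifying condition is documented."""
    if encounter["structured_exclusion"]:
        return True
    return use_notes and bool(EXCLUSION_PATTERN.search(encounter["note"]))


def performance_rate(encounters, use_notes):
    """Share of eligible bronchitis encounters with no antibiotic prescribed."""
    eligible = [e for e in encounters if not is_excluded(e, use_notes)]
    if not eligible:
        return None
    met = sum(1 for e in eligible if not e["antibiotic_prescribed"])
    return met / len(eligible)


# Invented sample encounters, all for acute bronchitis.
encounters = [
    {"antibiotic_prescribed": False, "structured_exclusion": False, "note": ""},
    {"antibiotic_prescribed": True, "structured_exclusion": False,
     "note": "Long-standing cystic fibrosis, followed by pulmonology."},
    {"antibiotic_prescribed": True, "structured_exclusion": True, "note": ""},
    {"antibiotic_prescribed": False, "structured_exclusion": False, "note": ""},
]

print(performance_rate(encounters, use_notes=False))
print(performance_rate(encounters, use_notes=True))
```

With structured data alone, the second encounter stays in the denominator and the rate is 2 of 3. Once the note is searched, that encounter is excluded and the rate is 2 of 2.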

• Disease management/population health initiatives. Value-based care demands that providers and payers elevate care management strategies to better address chronic disease and population health. Yet, how can industry stakeholders adequately address a full patient cohort (eg, diabetes) if they cannot identify all patients that fall within these parameters?

Even with complex, well-documented diagnoses such as diabetes, key indicators are often missed. Eye and foot exams are prime examples. While these data are valuable for determining the severity of a diabetic patient’s condition, documentation of these exam orders does not always show up in structured EHR fields.

According to one survey, the scale of the problem increases with certain complex conditions. While the majority of practices participating in the study accurately recorded hypertension and diabetes more than 80% of the time, rates of appropriate documentation for dyslipidemia and ischemic cardiovascular disease were substantially lower. By aggregating critical unstructured patient data, NLP helps reflect performance across chronic conditions and complex diagnoses and produces more accurate representations of patient populations.

• Patient experience. Today’s patients want to engage more in their care and feel empowered by their choices. Providers and payers can help them make the best decisions for managing their health by offering targeted communication and education to meet their needs.

For instance, providers must already aggregate data related to smoking status and elevated body mass index for quality measures reporting. The same patient cohort defining these populations of patients can also be leveraged to ensure patients are notified of opportunities to improve their health through smoking cessation, weight loss, and nutritional programs. Similarly, providers can design a patient cohort for compassionate outreach offering hospice services to individuals needing end-of-life care. NLP ensures that patient cohorts addressing these areas are complete, so no one falls through the cracks.

• Clinical decision support. Providers can more fully leverage the promise of clinical decision support by proactively using NLP to extract information from free text to help trigger alerts. For example, identifying patients who are at risk for sepsis can dramatically reduce mortality rates.

Using NLP to extract information from the clinical notes—especially signs and symptoms such as fever, chills, and confusion along with lab values and vital signs—can help providers identify patients at risk for sepsis. Decision support can then be applied at the point of care to recommend the proper treatment, including fluid replacement, antibiotics, and blood pressure support.
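The sepsis example above combines symptoms found in notes with structured vital signs. A minimal sketch of that combination follows. The function names, the temperature and heart rate cutoffs, and the three-signal alert rule are assumptions chosen for illustration only, not clinical guidance, and the keyword search again stands in for a real NLP engine.

```python
import re

# Symptom terms named in the text; the matching approach is an assumption.
SYMPTOM_TERMS = ("fever", "chills", "confusion")


def symptoms_in_note(note):
    """Return the symptom terms that appear as whole words in the note."""
    return {
        term for term in SYMPTOM_TERMS
        if re.search(rf"\b{term}\b", note, re.IGNORECASE)
    }


def sepsis_alert(note, temp_c, heart_rate, min_signals=3):
    """Combine free-text symptoms with vital signs; alert on enough signals.

    The cutoffs (38.0 C, 90 beats per minute) and min_signals are
    illustrative assumptions, not a validated screening rule.
    """
    signals = set(symptoms_in_note(note))
    if temp_c > 38.0:
        signals.add("high_temperature")
    if heart_rate > 90:
        signals.add("high_heart_rate")
    return len(signals) >= min_signals, sorted(signals)


alert, signals = sepsis_alert(
    "Patient reports chills and new confusion overnight.",
    temp_c=38.6,
    heart_rate=112,
)
print(alert, signals)
```

In this example the vital signs alone contribute only two signals. The two symptoms pulled from the note push the total past the assumed threshold and trigger the alert.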

• Clinical documentation improvement. Hierarchical Condition Categories (HCCs) form the basis of risk adjustment models used to determine reimbursement for various Medicare plans. Clinical documentation is critical to calculating these scores: patients with more severe illnesses may qualify for additional reimbursement, and accurate documentation allows health insurance premiums to be adjusted correctly.

Providers naturally want to ensure all documentation related to severity is identified and used. NLP helps drive clinical documentation improvement efforts by fully identifying patient severity as well as gaps in clinical documentation. For example, rather than just documenting that the patient has “depression, unspecified,” documenting the level of severity (mild, moderate, or severe) will place the patient into an HCC.
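A documentation gap of the kind described for depression can be flagged with a simple check. The sketch below is hypothetical: it assumes that a severity word in the same sentence as "depression" counts as adequate documentation, which a real clinical documentation improvement tool would verify far more carefully.

```python
import re

DEPRESSION = re.compile(r"\bdepression\b", re.IGNORECASE)
SEVERITY = re.compile(r"\b(mild|moderate|severe)\b", re.IGNORECASE)


def has_severity_gap(note):
    """True if any sentence mentions depression without a severity term."""
    # Naive sentence split on whitespace that follows ., !, or ?
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        if DEPRESSION.search(sentence) and not SEVERITY.search(sentence):
            return True
    return False


print(has_severity_gap("Patient has depression, unspecified."))
print(has_severity_gap("Assessment: moderate depression. Follow up in 4 weeks."))
```

Notes flagged this way could be routed back to the provider for clarification before risk adjustment scores are calculated.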

Infrastructures That Best Deliver on the NLP Promise
Without the right infrastructure in place to address both structured and unstructured patient data, patients are often excluded from patient cohort analytics. Terminology and data management solutions exist where NLP works in tandem with structured data processes to extract, normalize, and map needed data to appropriate industry standards. This framework ensures analytics initiatives related to patient cohorts are accurate and complete.

— Brian Levy, MD, is vice president of global clinical operations and product management for Wolters Kluwer, Health Language.