Hidden Treasure: The Value of Unstructured Documentation

October 2018

By Selena Chavis
For The Record
Vol. 30 No. 9 P. 10

Natural language processing helps health care organizations locate unseen discrete data.

The wealth of information that resides in unstructured EHR documentation and its potential to positively impact care delivery is not lost on the health care industry. For years, the opportunities available through natural language processing (NLP) to uncover the treasure within free text, which currently makes up 80% of clinical documentation, have been fodder for much industry discussion.

NLP turns unstructured documentation into shareable data that can be analyzed and acted upon. A recent Chilmark Research report, "Natural Language Processing: Unlocking the Potential of a Digital Healthcare Era," examined the current state of the industry as it pertains to NLP and offered recommendations for advancing implementation of these technologies. The report concluded that current drivers are expanding NLP application from traditional use cases around documentation and claims submission to a much broader footprint across text and speech in support of population health and precision medicine.

"My personal experience with NLP in the past is that it's not a panacea. There are a lot of things it does really well and has for 20 years. It takes a long time for these things to develop to the level where they can be used in the clinical environment," says Brian Edwards, an associate analyst with Chilmark Research and one of the report's authors. Pointing to key NLP trends such as the use of speech and text in tandem with artificial intelligence (AI), he adds that convergence could be a game-changer: "The convergence of different types of NLP, speech recognition and clinical documentation improvement (CDI), could save a physician many hours of a day's work, which is really relevant to [HIM] professionals."

Simon Beaulah, senior director of health care with Linguamatics, notes that NLP has traditionally powered reimbursement-centric initiatives or those related to clinical research in large academic medical centers. "Proven products in computer-assisted coding (CAC) and CDI have established NLP as a viable AI technology and widened its application to other areas," he explains. "For example, organizations pursuing value-based care objectives are using NLP along with population management analytics to achieve a 360-degree view of each patient."

According to Beaulah, opportunities to take advantage of NLP often fall into two camps: those that directly impact patient care and those in nonclinical areas such as quality measures, quality improvement, and prior authorization. "In both cases, NLP provides intelligence augmentation capabilities that significantly improve productivity while maintaining a human element for decision making," he notes.

Going forward, Mike Dow, senior director of product development at Health Catalyst, suggests that NLP will continue to grow in importance as health care data become increasingly unstructured. "In health care today, much of the info about a patient is not in the EHR. As we look to other data sources such as claims, mobile devices, or patient-reported data that are not structured, being able to understand the overall profile of patients will increasingly rely on the use of NLP," he says.

Evolving Use Cases
The Chilmark report describes a dozen significant NLP health care use cases, including CAC, speech recognition, and data mining. The authors point out that five of the solutions "have proven ROI [return on investment] and are commercially available from numerous well-established vendors," while "another four are going through the initial phase of the adoption cycle and are primed to have an immediate impact under the new value-based care paradigm."

Clinithink, an NLP platform showcased in the report, has found its niche in the report's mainstay category of automated registry reporting as well as the emerging area of clinical trial recruitment and the next-generation category of computational phenotyping and biomarker discovery. The potential impact of NLP on clinical trial recruitment is demonstrated in a study conducted with Mount Sinai's Icahn School of Medicine, where the company's NLP engine generated 10 times more patient matches in one-quarter of the time previously required.

On the phenotyping front, Clinithink, in collaboration with San Diego's Rady Children's Hospital, was awarded a Guinness World Record for the fastest screening of a newborn infant for rare diseases, completing the full genomic and phenomic analysis in 19.5 hours.

"We've looked at over 3 billion words or synonyms or related pairings of information to build up a phenotype library," says Sarah Beeby, Clinithink's senior vice president of life sciences. "We're able to process that whole population set in 24 hours, whereas a manual search could take a clinician three weeks. There is a time-saving perspective there … much more rapid understanding of rare disease aspects."

Health Fidelity's MedLEE NLP engine incorporates a comprehensive health care–specific ontology and rules engine to address the emerging use of risk adjustment and hierarchical condition categories. The report details a 2016 proof point in which UPMC Health Plan credits the solution for powering a $62 million increase in annual revenue. The engine has also been validated by researchers in peer-reviewed journals.

A significant and mature player in the NLP space, Linguamatics is making inroads in next-generation use cases that address text mining for life sciences, population health management, and precision medicine. Customers, including 18 of the 20 largest pharmaceutical companies along with payers and providers, report saving as much as 85% of the time spent conducting cohort discovery, according to the report.

Beaulah notes that resource-challenged HIT and analytics groups with mature EHR deployments and a value-based care focus are increasingly looking to NLP to create efficiencies. "Enabling NLP to integrate with other technologies is key, such as seamlessly working with the EHR, big data capabilities such as Hadoop, and analytics tools," he says.

Also profiled in the report, Health Catalyst announced a partnership with Regenstrief Institute in 2017 to offer the nDepth NLP engine featuring a data operating system that enables real-time analytics and workflow application development in a single platform and data stream. nDepth can be applied to most text-based use cases. The combined functionality is expected to power many current and next-generation use cases such as extracting discrete quantitative values from unstructured text and feeding these data into predictive analytics models.
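As a rough illustration of what "extracting discrete quantitative values from unstructured text" can mean in practice, the following Python sketch pulls a few lab values out of a free-text note with a regular expression. It is not how nDepth or any commercial engine works; the pattern, the lab names, and the `extract_values` helper are all assumptions made for the example, and a real engine would need a far richer lexicon and grammar.

```python
import re

# Hypothetical pattern: a lab name, an optional linking word or colon,
# a number, and an optional unit. Real clinical text is far messier.
VALUE_PATTERN = re.compile(
    r"\b(?P<name>PSA|HbA1c|LDL)\b\s*(?:of|is|was|:)?\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>ng/mL|mg/dL|%)?",
    re.IGNORECASE,
)


def extract_values(note):
    """Return a list of (name, value, unit) tuples found in a free-text note."""
    results = []
    for match in VALUE_PATTERN.finditer(note):
        results.append(
            (match.group("name"), float(match.group("value")), match.group("unit"))
        )
    return results


# Made-up note text, for illustration only.
note = "Pt seen for follow-up. PSA was 6.2 ng/mL last month; HbA1c: 7.1% today."
print(extract_values(note))
```

Each tuple is a discrete data point that could, in principle, be stored in a structured field or passed to a predictive model.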

"As the barrier to incorporate NLP has been lowered in recent years, we are seeing more embedding of NLP into additional product areas," Dow says, pointing to the recent embedding of NLP into Health Catalyst's patient safety monitoring product to track adverse events such as patient falls through primarily discrete data points. "It's not a product that relies on NLP but can benefit from embedding it."

Beeby, who believes the pace of NLP adoption is advancing rapidly, says the industry will continue to see expanded use across a variety of initiatives. "It seems to have moved from 'I must understand why I need it' to being part of strategic plans and important projects," she says. "In six months, the speed of the knowledge base and awareness about where data come from has [increased significantly]. Everyone seems to have an awareness of what data are there and the fact that we all need to have a mechanism to work with them and share insight."

Edwards believes one of the most exciting developments on the horizon relates to the use of ambient scribes to improve the EHR experience. These NLP applications operate similarly to Amazon's Alexa or Google Assistant, enabling a physician to use a smart speaker during a patient encounter to capture the conversation and automatically parse it into a structured EHR format. "I think physicians would pay out of their pocket for the technology. Forget the IT department," Edwards says. "It's still a ways off, five years maybe. But [the industry] is definitely headed there."

Barriers to Successful Implementation
Anand Shroff, founder and chief development officer of Health Fidelity, says there are three key challenges organizations must address to pull off a successful NLP implementation. The first is the availability of data and the infrastructure to access them for processing through NLP. "Organizations get data from multiple data sources and many different EHR systems," he points out. "To be able to organize those data and then process them through NLP is not an easy task today."

Dow agrees, pointing out that the creation of the NLP data pipeline—the process of consuming text data, bringing it into an NLP solution, and providing results—is a complex endeavor for many health care organizations. "That data pipeline is a lot different than others we work with in health care," he says, adding that it's a larger dataset than most that must be pieced together from EHR systems. "It's a technical problem; it's solvable. It's not out of the realm of possibility."

Another challenge centers on organizational alignment around priority use cases and expected outcomes. "Unless there is agreement on which use cases deliver the most value to the organization, it is difficult to find long-term funding for the use of NLP in the enterprise," Shroff notes, pointing out that the most common initiatives focus on quality of care, documentation improvement, risk stratification and adjustment, and cohort selection for clinical trials and outcomes research.

The third challenge is incorporating results into the organizational workflow. To overcome this hurdle, Shroff suggests that organizations plan to integrate NLP-generated results into existing workflows rather than build separate ones.

He says a high-performance NLP engine can be measured by two main characteristics: precision and recall. "Precision relates to the ability of the NLP engine to detect clinical findings accurately. For example, if a patient record states that the patient suffers from diabetic neuropathy, and if the NLP engine only surfaces diabetes and not diabetic neuropathy, its precision is low," Shroff explains. "Recall relates to the ability of the NLP engine to detect as many clinical findings as possible. For example, an NLP engine may not be able to detect 25% of all clinical findings in a medical record, in which case its recall is low."
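For readers who want to see the two measures as numbers, here is a minimal Python sketch using the standard definitions: precision is the share of findings the engine surfaced that are correct, and recall is the share of findings in the record that the engine surfaced. The finding names and the `precision_recall` helper are hypothetical, and the sketch assumes exact-match scoring against a human-annotated record, which is only one of several ways such engines are evaluated.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for one record.

    extracted: set of findings the NLP engine surfaced
    gold:      set of findings a human annotator confirmed in the record
    """
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical record with four annotated findings.
gold = {"diabetic neuropathy", "hypertension", "asthma", "obesity"}
# The engine surfaces "diabetes" instead of "diabetic neuropathy".
extracted = {"diabetes", "hypertension", "asthma", "obesity"}

precision, recall = precision_recall(extracted, gold)
print(precision, recall)  # 0.75 0.75
```

Under this exact-match assumption, the less specific "diabetes" counts both as an incorrect finding, which lowers precision, and as a missed "diabetic neuropathy," which lowers recall. A recall of 0.75 corresponds to Shroff's example of an engine that fails to detect 25% of the findings in a record.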

The best NLP engines, Shroff says, have extensive lexicons that include commonly used acronyms, shorthand, and other jargon. They also include grammar and disambiguation modules that allow them to successfully navigate different writing styles. "Recently, machine learning has allowed NLP engines to learn language idiosyncrasies faster and perform well even when the input is unfamiliar," Shroff says.

Best Practice Implementation Considerations
The Chilmark Research report offers the following recommendations for adopting and implementing NLP:

Conduct a thorough data audit to establish a baseline that can be used to help prioritize use cases and set expectations. If the data don't exist, there is no viable use case for NLP.
Identify mainstay NLP use cases such as CDI and CAC that are not being fully utilized, and close those gaps with commercial tools from NLP and EHR vendors.
Depending on expertise and need, either employ or develop gold standard datasets using historical data annotated by cross-disciplinary teams.
Identify emerging and nascent use cases and look to partner with NLP platform vendors to more fully develop their solution capabilities.
Consider uses for NLP that are both retrospective (mining unstructured data sources) and more on demand/real time and voice based to provide immediate value and store information in more structured formats for later use.
Press analytics vendors for inclusion of NLP in their solutions to process unstructured data.
Establish the proper metrics to ensure that NLP systems are performing optimally and not deteriorating over time as needs and data change.
Always give intelligent and strategic thought to how NLP is applied to workflows to ensure substantial gains in operational efficiency, productivity, and overall workforce morale.

In addition, Beaulah suggests targeting high-value application areas such as clinical decision support, risk adjustment, and prior authorization to secure the right backing for NLP projects. "Technology and clinical buy-in is clearly vital for implementation success, as is having cross-functional teams to assess the impact of the initiative and plan its rollout," he says. "It's also beneficial to start small and show early success so teams can see how the technology affects workflows."

Offering an example of a high-value project, Beaulah points to NLP in patient safety net initiatives, such as identifying high-risk patients who need follow-up for pulmonary nodules, PSA (prostate-specific antigen) levels, or colonoscopies. "It's impossible to manually perform such large-scale screenings in real time, but with automated NLP processes, we can quickly identify the right patients so that clinicians can initiate care that drives better outcomes," he says.

Dow emphasizes the importance of balancing case finding with human analysis. For instance, in the case of CAC, providers are using NLP to find what they need to bill. Once identified, the information is presented to a human reviewer for quality control.

For a patient safety initiative such as falls, the goal is to use NLP to find notes that might suggest a patient fall has occurred. There is still the need for a human to review that note and confirm whether the patient actually fell. "People do get a little bit nervous about using technology for complex things like understanding language," Dow notes. "NLP does case finding very well. Then humans come in to follow up after it."
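The division of labor Dow describes can be sketched in a few lines of Python. This is a deliberately naive illustration, not Health Catalyst's product: the trigger phrases, the sample notes, and the `find_candidate_notes` helper are assumptions invented for the example. The code only flags notes that might describe a fall and places them in a queue for a human reviewer.

```python
import re

# Hypothetical trigger phrases; a production system would use a much
# richer lexicon and would handle negation and context.
FALL_PATTERN = re.compile(
    r"\b(fell|fall|fallen|found on (?:the )?floor)\b", re.IGNORECASE
)


def find_candidate_notes(notes):
    """Return (note_id, matched_phrase) pairs for notes that may describe a fall."""
    queue = []
    for note_id, text in notes.items():
        match = FALL_PATTERN.search(text)
        if match:
            queue.append((note_id, match.group(0)))
    return queue


# Made-up notes, for illustration only.
notes = {
    "n1": "Patient found on floor beside bed at 0300.",
    "n2": "Ambulating independently, no complaints.",
    "n3": "Patient denies any fall since last visit.",
}
review_queue = find_candidate_notes(notes)
print(review_queue)  # [('n1', 'found on floor'), ('n3', 'fall')]
```

Note n3 is flagged even though the patient denies falling. That kind of false positive is exactly why a human still reviews each flagged note before an event is recorded.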

The bottom line, according to Beeby, is that NLP is delivering a new level of understanding. "The insights of NLP against large data sets using real-world patient data are fascinating," she says. "We are seeing trends and interesting correlations that [were not previously available]. The capability is beyond anything we have seen to date."

— Selena Chavis is a Florida-based freelance journalist whose writing appears regularly in various trade and consumer publications, covering everything from corporate and managerial topics to health care and travel.