
September 2013

Algorithms and Clues
By Juliann Schaeffer
For The Record
Vol. 25 No. 12 P. 18

Natural language processing uses its interpretative powers to make sense of patient data.

Natural language processing (NLP), a technology that seeks to associate meaning with words through a combination of complex algorithms, has been working its way into the mainstream over the past 20 years (think of those shopping recommendations you see on Amazon.com or certain travel planning websites). Recently, improved accuracy has helped NLP make intriguing inroads into health care, from assisting physicians with decision support by culling information from EHRs to aiding researchers in finding patient cohorts for clinical trials.

“[NLP technologies] have come a long way since the early ’90s,” says Juergen Fritsch, chief scientist and a cofounder of M*Modal, noting that the technology’s improved accuracy rates are mainly due to the availability of massive datasets. “NLP products and technologies are based on statistical learning algorithms, so they basically get better with the more data they see. The more data you can throw at it, the more examples they can process, the more accurate it will get over time. Since we have been able to collect a lot of data with increased computing power and cloud-based deployments, etc, the technology has gotten much better over the years.”

In a sense, NLP is the ultimate detective. “The idea behind natural language processing is to find words in combination to understand how those words are used in a given thought, sentence, or paragraph and, from how those words are used, determine what the knowledge element is that the person who created the document is trying to convey,” says Reid Coleman, MD, chief medical informatics officer of evidence-based medicine for Nuance Communications.

Decipher Dynamics
Whereas any given clinician note may mention heart attack, Coleman explains that NLP must go beyond the meaning of those two words and decipher what the clinician meant by the notation. “Did they mean the patient had one, a family [member] had one, there’s a history of one, or there’s no evidence of one?” Coleman says. “What NLP does is look not for a word but for a series of words used in a context and actually determine the intent of the person who created the document and the meaning of the document in a way that could be used to support decisions, to support communication with other clinicians, and to support functions such as billing.”
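The four readings Coleman lists can be illustrated with a minimal Python sketch. It is not a description of Nuance's system: the trigger phrases, the `CONTEXT_TRIGGERS` table, and the `classify_mention` function are all hypothetical, and a production system would rely on far richer linguistic evidence than a few phrases that precede the term.

```python
import re

# Hypothetical trigger phrases, checked in order; a real system would need a
# much larger, clinically validated list.
CONTEXT_TRIGGERS = [
    ("negated", r"\b(no evidence of|denies|ruled out|negative for)\b"),
    ("family", r"\b(mother|father|sister|brother|family history of)\b"),
    ("historical", r"\b(history of|status post|prior)\b"),
]

def classify_mention(sentence, term="heart attack"):
    """Return a context label for `term` in `sentence`, or None if the term is absent."""
    lowered = sentence.lower()
    position = lowered.find(term)
    if position == -1:
        return None
    # Only the text that precedes the term is inspected for triggers.
    preceding = lowered[:position]
    for label, pattern in CONTEXT_TRIGGERS:
        if re.search(pattern, preceding):
            return label
    return "present"

for example in [
    "Patient admitted with heart attack.",
    "His mother had a heart attack.",
    "History of heart attack in 2010.",
    "No evidence of heart attack.",
]:
    print(classify_mention(example), "<-", example)
```

The same two words receive four different labels depending on the words around them, which is the distinction Coleman describes.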

Essentially, NLP is used in health care to automatically create structured data from free text, explains Danielle Mowery, MS, a graduate student in the department of biomedical informatics at the University of Pittsburgh who is working with a team of researchers to determine how NLP technology can best be utilized.

According to Mowery, NLP most commonly is used to identify relevant clinical events, such as problems or treatments, in any number of clinical free-text reports, including discharge summaries, radiology reports, and pathology notes. After identifying this type of information, the technology then can encode and/or extract the information to be used elsewhere in the system “to support medical and clinical tasks, such as computer-assisted hospital bill coding, summarizing medical record documentation into active problem lists, reconciling patient medication lists, and generating patient care plans following discharge,” Mowery says.

While NLP has yet to see widespread use throughout health care, it’s slowly becoming more commonplace, with the potential for tremendous growth. Whether by pulling information from EHRs or clinician notes to improve patient care or by mining information for population health analysis, some experts believe NLP could be a game-changer for individual care as well as population health as a whole.

New Avenues for NLP
One particular area of NLP research relates to EHR phenotyping. “[It] involves defining patterns of observable characteristics or traits from patients, including combinations of symptoms, medication, procedures, and diagnoses recorded in the EHR,” says Son Doan, PhD, a programmer analyst in the University of California, San Diego’s (UCSD) division of biomedical informatics.

Identifying such information may benefit both clinical research and patient care. “Developing an NLP algorithm to identify a particular disease phenotype—for example, diabetes—opens up the possibility of identifying patient cohorts for retrospective studies or clinical trials, performing decision support at the point of care, and linking phenotypes to genotypes to target genetic causes of disease,” Doan says. “NLP is critical to successful EHR phenotyping because many of the observable characteristics of interest are described in textual reports rather than as structured data.”

According to Wendy Chapman, PhD, an associate professor in UCSD’s division of biomedical informatics, NLP also can help extract various types of information from EHRs, which contain seemingly countless numbers of clinical notes with information that’s needed not just by physicians but also others in the care continuum.

Clinicians, researchers, and administrators all need to extract information from EHR notes for various tasks, including patient handoff and care, comparative effectiveness and health services research, and quality improvement and reporting, Chapman says. However, manually going through notes to cull this information is time consuming and limited in scope. This is where NLP can be invaluable.

“NLP can extract targeted information from the notes, such as a family history of colon cancer or the presence of a mass on a chest radiograph,” Chapman explains. “That information can then be applied, for example, to guide users in locating patients eligible for clinical trials, to alert clinicians that a patient may have a new infection, to highlight relevant information for retrospective chart review, to summarize important events during a hospital visit, or to identify anomalies in care or concerning patterns or outbreaks in populations.”

But Fritsch worries that extracting information from clinical notes, and thus removing it from the overall narrative (for example, to find particular disease patterns), may limit the information’s usability. “[Extracting information from an EHR is] like extracting a tooth: Once you extract a tooth, it’s useless—you can’t use it for anything anymore,” he says.

Fritsch would rather see the overall narrative remain intact and instead be augmented with annotations. “More recently, people have started to apply NLP much more broadly, and while they’ve been culling information from the notes, we are now focusing on annotating the information in the notes,” he says.

By using NLP to annotate, a clinical note essentially is left as a whole, with certain markers added to make a note of a particular disease or body part mention, for example. “This really augments a patient’s narrative and preserves the identified clinical facts so that future questions can be answered in context,” Fritsch says. “Extracting the information really limits its usefulness.”
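One common way to leave a note whole while marking it up is to store annotations separately as character offsets into the unchanged text. The Python sketch below illustrates that general idea only; the `Annotation` class, the `annotate` function, the sample note, and the two-entry lexicon are assumptions made for illustration and do not represent M*Modal's format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    start: int   # offset of the first character of the mention
    end: int     # offset just past the last character of the mention
    label: str   # e.g. "disease" or "body_part"

def annotate(note, lexicon):
    """Return annotations for every lexicon term found; the note itself is never modified."""
    annotations = []
    lowered = note.lower()  # offsets line up with `note` for plain ASCII text
    for term, label in lexicon.items():
        start = lowered.find(term)
        while start != -1:
            annotations.append(Annotation(start, start + len(term), label))
            start = lowered.find(term, start + 1)
    return sorted(annotations, key=lambda a: a.start)

# Made-up example note and lexicon.
note = "Chest radiograph shows pneumonia in the left lung."
lexicon = {"pneumonia": "disease", "lung": "body_part"}
annotations = annotate(note, lexicon)

for a in annotations:
    print(a.label, repr(note[a.start:a.end]))
```

Because each marker only points into the original text, a later question can always be answered by reading the surrounding narrative.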

Mike Conway, PhD, a research fellow in UCSD’s division of biomedical informatics, says annotating notes with NLP technology can help researchers learn more about certain conditions and treatments. “Various NLP techniques can be used to automatically identify specific conditions such as syndromes, diseases, and treatments,” he says, adding that this typically is done through one of two approaches: rule based or machine-learning based (or perhaps a hybrid of both).

“Rule-based approaches are built on lexicon/dictionary lookup and customized rules,” Conway says. “Supervised machine-learning approaches rely on training data. That is, data—in our case conditions and treatments—are first manually labeled by human annotators then a variety of algorithms can be applied to ‘learn’ from the examples how to assign labels to unseen examples.”
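The contrast between the two approaches can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the two-entry `LEXICON`, the four hand-labeled training phrases, and the choice of a naive Bayes classifier with add-one smoothing, which is only one of the many algorithms Conway alludes to.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

# Rule-based: dictionary lookup only (hypothetical two-entry lexicon).
LEXICON = {"pneumonia": "condition", "vancomycin": "treatment"}

def rule_based_label(text):
    """Return the label of the first lexicon word found, or None."""
    for word in tokenize(text):
        if word in LEXICON:
            return LEXICON[word]
    return None

# Supervised: learn word statistics from manually labeled examples.
def train(labeled_examples):
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def predict(text, model):
    """Pick the most probable label under naive Bayes with add-one smoothing."""
    word_counts, label_counts = model
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training phrases standing in for human annotation.
training_data = [
    ("started on vancomycin", "treatment"),
    ("prescribed amoxicillin daily", "treatment"),
    ("diagnosed with pneumonia", "condition"),
    ("presents with influenza", "condition"),
]
model = train(training_data)

print(rule_based_label("prescribed azithromycin"))  # drug is not in the lexicon
print(predict("prescribed azithromycin", model))    # learned cue: "prescribed"
```

The lexicon has no entry for the unseen drug name, while the trained model can still assign a label because it learned that "prescribed" tends to appear with treatments.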

As Conway explains, NLP can map, or encode, concepts from free-text clinical notes to standard terminologies such as ICD and RxNorm. From there, researchers, physicians, and pharmacists can use the encoded data from large datasets to learn about various pieces of information, including “symptoms that precede diagnosis of a particular disease or adverse events that occur from medications.”
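Encoding of this kind can be pictured as a lookup from the words clinicians actually write to a code in a standard terminology. The Python sketch below assumes a tiny synonym table; the ICD-10-CM codes shown are included for illustration and should be checked against an official release, and the `encode_concepts` function is hypothetical rather than part of any tool mentioned in this article.

```python
import re

# Illustrative synonym table (an assumption, not a complete or authoritative mapping).
SYNONYM_TO_ICD10CM = {
    "heart attack": "I21.9",           # acute myocardial infarction, unspecified
    "myocardial infarction": "I21.9",
    "pneumonia": "J18.9",              # pneumonia, unspecified organism
}

def encode_concepts(note):
    """Return (matched text, code) pairs in the order the mentions appear."""
    # Longer terms come first so a multiword term wins over any shorter overlap.
    terms = sorted(SYNONYM_TO_ICD10CM, key=len, reverse=True)
    pattern = "|".join(re.escape(term) for term in terms)
    return [
        (match.group(0), SYNONYM_TO_ICD10CM[match.group(0).lower()])
        for match in re.finditer(pattern, note, flags=re.IGNORECASE)
    ]

print(encode_concepts("Myocardial infarction in 2009; now admitted with pneumonia."))
```

Once different wordings share one code, records from many notes can be counted and compared as the same concept.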

Fritsch says NLP’s greatest potential may lie in Big Data analysis, specifically as it relates to population health. “Once you use NLP not just on one single note but across hundreds or hundreds of thousands of notes and then focus on certain conditions, then you can look at population health,” he says. For example, by examining a subset of patients with congestive heart failure then looking at how many of those were readmitted to the hospital within six months—and why—researchers can glean a wealth of information “just by looking at the distribution of the statistics over large amounts of data,” Fritsch says.

Previously, this type of analysis could be done only on structured EHR data, “but most of the information really sits in unstructured, narrative clinical notes,” Fritsch says. “But nowadays, with NLP technologies, you can really make sense out of these notes by performing these large-scale analytics tasks on them.”
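Once each note has been reduced to a few encoded facts, the population-level question Fritsch raises becomes simple counting. The following Python sketch assumes the NLP step has already produced one record per patient; the field names and the four records are invented for illustration.

```python
from collections import defaultdict

def readmission_rates(records):
    """Return the fraction of patients readmitted within six months, per condition."""
    totals = defaultdict(int)
    readmitted = defaultdict(int)
    for record in records:
        totals[record["condition"]] += 1
        if record["readmitted_within_6_months"]:
            readmitted[record["condition"]] += 1
    return {condition: readmitted[condition] / totals[condition] for condition in totals}

# Made-up records standing in for facts derived from narrative notes.
records = [
    {"condition": "congestive heart failure", "readmitted_within_6_months": True},
    {"condition": "congestive heart failure", "readmitted_within_6_months": True},
    {"condition": "congestive heart failure", "readmitted_within_6_months": False},
    {"condition": "pneumonia", "readmitted_within_6_months": False},
]

for condition, rate in readmission_rates(records).items():
    print(f"{condition}: {rate:.0%} readmitted")
```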

According to Coleman, NLP’s greatest strength is in allowing physicians to do what they do best: tell a patient’s story. And the better NLP can decipher free text (or speech) in clinical notes, the more likely physicians will be to keep creating those narratives.

“Clinicians do their best job when they document with narrative,” Coleman says. “The purpose from a clinician point of view of the patient’s chart—be it electronic, be it paper—is to tell the patient’s story. And the purpose of any given note in that chart is to fill in missing information. When we do this with narrative, we do a much better job, and [previous research has shown] that narrative communicates much more useful information to the other clinicians who are taking care of the patients than simply a list of facts, [such as in] EMRs that use structured notes.”

NLP can assist physicians with decision support in real time as they narrate a clinical note. After they dictate notes, such as a history and physical or a medication list, from a patient exam, the narrative goes through Nuance’s NLP system, where the information is encoded to produce feedback to the clinician. “For example, I’ve dictated a note about a lady in a nursing home who has a history of a skin infection called MRSA who now comes into the hospital with pneumonia,” Coleman says. “When I run that note through our system, it comes back with a prompt that [alerts me that] the odds are very high that this patient’s pneumonia is caused by that same organism, MRSA, because that’s a big problem with people who live in nursing homes.

“And if that’s the case, [the prompt will note that] the best treatment for this patient is a different antibiotic than what’s usually used for pneumonia,” he continues. “So it’s taken the information provided by the doctor, analyzed it, compared it with information in the computer, and gives feedback to the doctor about what the right way is to take care of the patient.”

The technology also can be used to analyze a particular clinical note and prompt the physician for more specificity, where necessary, which can make for a clearer patient record. “By giving physicians prompts to improve the treatment and to improve the record, we’re making the care of the patient better both in terms of providing the right treatment and communicating with the other caregivers so they’re making good decisions as well,” Coleman says.

How It Works
How does NLP technology put meaning to the words in clinical notes, and how reliable is the data? Factors such as phonetics, sentence structure, and pragmatics all play a role, but in essence, it ain’t easy.

“Clinical notes are recorded using a variety of input methods, including type and voice, and are constructed using a variety of formats, prose, and discourse,” Chapman says. “For each report type, an NLP system can be trained to process free text, leveraging structures at the document level, sentence level, and word level to aggregate semantic information like patient problems, treatments, and tests with promising accuracy.”

Chapman notes that NLP systems integrating lexical knowledge with syntax, semantics, and discourse generally work better than systems that rely on only one of these knowledge types. “As humans, we use a number of different types of knowledge to make sense of what we hear and read,” she says. “For complex tasks, NLP systems also need to leverage a variety of types of knowledge.”

While phonetics, sentence structure, and semantics are all important, Fritsch says context also must be taken into account. “Part of this is what people call pragmatics—when you try to take into account other things you know about the context of this patient, but it really goes beyond that, too,” he says, noting that physician documentation preferences must be considered as well. “All of these things need to come together. NLP technology typically does that by combining all of these different knowledge sources and then combines it through statistical models to come up with a presentation of a meaning.”

Each factor can play a role in how NLP deciphers any given clinical note. NLP can work off text alone, according to Fritsch, so sound, or phonetics, isn’t always a factor. “If it does involve sound, then a good example would be whether somebody raises their voice or lowers their voice at the end of a sentence,” he says. “This makes a big difference in the meaning. The same wording could mean a question or a statement, depending on whether you raise your voice at the end of a sentence. That’s a big cue you need to understand in terms of interpreting a sentence.”

For sentence structure, Fritsch says NLP algorithms must take grammar into account. “NLP algorithms put a lot of emphasis on finding the verbs and subjects [in a sentence] so that you know a statement was about the patient,” he explains. “If you’re in the family history section of a note, for instance, and the doctor would dictate that the patient’s mother died at the age of 70 of a myocardial infarction, it’s not sufficient to identify the myocardial infarction and say, ‘We’ve found a heart attack here.’ You also have to understand that it’s about the mother, not the patient. So sentence structure helps in detecting that.

“For semantics, you would then attach the meaning of there not just being a heart attack but a heart attack of the mother as a combined statement, which involves semantic reasoning,” he adds.
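Fritsch's family history example can be approximated with a short Python sketch that tracks which section a sentence belongs to and whether it names a relative. This is a crude stand-in for real grammatical parsing, and every name in it (`SECTION_HEADER`, `FAMILY_TERMS`, `find_statements`) as well as the assumed all-caps section headings are hypothetical.

```python
import re

# Assumes section headings are written in capitals and end with a colon.
SECTION_HEADER = re.compile(r"^(?P<name>[A-Z][A-Z ]+):\s*$")
FAMILY_TERMS = re.compile(
    r"\b(mother|father|sister|brother|grandmother|grandfather)\b", re.IGNORECASE
)
FINDING = "myocardial infarction"

def find_statements(note):
    """Return (finding, experiencer, section) for each line that mentions FINDING."""
    section = "UNKNOWN"
    results = []
    for line in note.splitlines():
        header = SECTION_HEADER.match(line.strip())
        if header:
            section = header.group("name")
            continue
        if FINDING in line.lower():
            relative = FAMILY_TERMS.search(line)
            if relative:
                experiencer = relative.group(1).lower()
            elif section == "FAMILY HISTORY":
                experiencer = "family member"
            else:
                experiencer = "patient"
            results.append((FINDING, experiencer, section))
    return results

# Made-up note.
note = """FAMILY HISTORY:
The patient's mother died at the age of 70 of a myocardial infarction.
ASSESSMENT:
Findings consistent with acute myocardial infarction."""

for statement in find_statements(note):
    print(statement)
```

Each result pairs the finding with the person it is about, which is the combined statement Fritsch describes rather than a bare mention of a heart attack.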

Possibly most important, the technology must factor in the context, or pragmatics, of a given patient situation or clinical note. “Information recorded within each report type is documented with an assumed world situational context and domain knowledge. For instance, ‘The patient drinks occasionally’ refers to the patient’s alcohol consumption, although alcohol is not explicitly mentioned,” says Mowery, adding that understanding this type of pragmatic context can be extremely difficult for most NLP systems and continues to be an active research area for the NLP community.

Next, NLP algorithms attempt to interpret words and sentences based on these characteristics. It’s not always easy, particularly when it comes to determining intent. “There are multiple difficulties in discerning communicative intent from text, depending on genre, context, characteristics of the speaker, and so on,” Conway says. “We are facing this challenge in some of our current work on extracting health-related information from Twitter data, where we are interested in identifying Twitter users currently experiencing influenza symptoms.

“The use of simple keyword-based approaches is inadequate due to a number of factors, including the use of the word ‘fever’ to indicate enthusiasm for some phenomenon—for example, Bieber fever—use of sickness-related words to indicate psychological states—‘I’m sick of work at the moment’—and the frequent use of irony, jokes, and exaggeration in informal texts,” he continues. “In order to achieve reliable results, NLP tools are required to address all these issues.”
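The failure Conway describes is easy to reproduce. In the Python sketch below, a bare keyword search flags all three made-up tweets, while a version with two handwritten exclusion patterns drops the figurative ones. The patterns and function names are assumptions made for illustration, and exclusions like these would still miss irony, jokes, and exaggeration.

```python
import re

SYMPTOM = re.compile(r"\b(fever|flu|sick)\b", re.IGNORECASE)

# Hypothetical exclusions covering only the two figurative uses quoted above.
FIGURATIVE = [
    re.compile(r"\bbieber fever\b", re.IGNORECASE),
    re.compile(r"\bsick of\b", re.IGNORECASE),
]

def naive_match(tweet):
    """Keyword-only approach: any symptom word counts as a report of illness."""
    return bool(SYMPTOM.search(tweet))

def filtered_match(tweet):
    """Keyword match that is rejected when a known figurative phrase is present."""
    return naive_match(tweet) and not any(p.search(tweet) for p in FIGURATIVE)

# Made-up tweets.
tweets = [
    "Home with a fever and chills",
    "Got Bieber fever!",
    "I'm sick of work at the moment",
]
for tweet in tweets:
    print(naive_match(tweet), filtered_match(tweet), tweet)
```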

As for reliability, Fritsch says NLP isn’t 100% accurate—and there’s a good chance it never will be. “There’s just too much ambiguity in the way we humans talk, sometimes intentionally so,” he says.

However, that doesn’t mean NLP can’t benefit the health care system. “That’s why a lot of the effort, at least in our company, is focused on self-assessment, which relates to using the technology to also assign a confidence measure to say how confident the technology is that it understood the meaning of a sentence,” Fritsch says. “If we assume that we’re not going to be 100% accurate, then the best we can do is at least indicate how confident we are so that in subsequent steps we can then look at the confidence level and say if the confidence is below a certain percent, then we’re not going to use the information.”
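The thresholding step Fritsch describes might look something like the Python sketch below. The 0.8 cutoff, the field names, and the example extractions are assumptions made for illustration; the article does not say what threshold M*Modal uses or how its confidence scores are computed.

```python
def filter_by_confidence(extractions, threshold=0.8):
    """Split extractions into those confident enough to use and those to withhold."""
    accepted = [e for e in extractions if e["confidence"] >= threshold]
    withheld = [e for e in extractions if e["confidence"] < threshold]
    return accepted, withheld

# Made-up NLP output with self-assessed confidence scores.
extractions = [
    {"fact": "pneumonia", "confidence": 0.95},
    {"fact": "history of MRSA", "confidence": 0.62},
    {"fact": "resides in nursing home", "confidence": 0.88},
]

accepted, withheld = filter_by_confidence(extractions)
print("used:", [e["fact"] for e in accepted])
print("withheld:", [e["fact"] for e in withheld])
```

Withheld items need not be discarded; they could instead be routed to a person for review.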

According to Chapman, NLP needs only to match human review to be considered effective. “A useful sanity check on the accuracy of an NLP application is comparison against human review; the NLP application may get only 55% accuracy, but if a human performing the same task performs with 57% accuracy, then the NLP application may be as good as one could expect,” she says. “Studies involving human review often demonstrate difficulty for humans in understanding the content of clinical reports.”
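Chapman's sanity check amounts to scoring both the NLP application and a human reviewer against the same reference labels. The Python sketch below shows the arithmetic with invented labels; the `accuracy` function and all of the data are hypothetical.

```python
def accuracy(predictions, reference):
    """Return the fraction of predictions that agree with the reference labels."""
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must be the same length")
    correct = sum(p == r for p, r in zip(predictions, reference))
    return correct / len(reference)

# Made-up labels for ten reports (1 = finding present, 0 = absent).
reference      = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
nlp_output     = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
human_reviewer = [1, 0, 1, 1, 1, 0, 0, 0, 1, 0]

print(f"NLP:   {accuracy(nlp_output, reference):.0%}")
print(f"Human: {accuracy(human_reviewer, reference):.0%}")
```

When the two figures are close, the human score serves as a practical ceiling for judging the NLP application.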

While Fritsch says there are certain functions, such as extracting information and acting on it blindly without verification or review, that NLP never should be used for (at least in its current state), that doesn’t mean it can’t provide value. “A good example would be population health analysis,” he says. “Even if you had some errors on the NLP side, overall, over hundreds of thousands of patients, you would still see trends—such as when a certain condition leads to readmission more than another one—and you could then use that information to improve care and provide better care to patients even though on an individual basis there might be some omissions in how patients were analyzed.

“There’s definitely a lot of benefit; you just have to make sure you use the technology adequately, never assuming that it’s 100% accurate,” he adds. “What I always tell people is NLP becomes really powerful if you combine it with human validation. Then it … can be used to improve the efficiency of existing health care workflows.”

Overall, developing accurate and sufficient NLP systems is just the first of many steps to come, according to Chapman. “Applying NLP systems to real problems requires integration of NLP output with other data, such as lab test results, and with other applications, such as decision support applications,” she says. “And sometimes graphical interfaces are necessary to help clinicians or researchers use the NLP output for purposes such as searching through a complex patient record or viewing patterns in populations of patients.”

— Juliann Schaeffer is a freelance writer and editor based in Allentown, Pennsylvania.