July 22, 2009
Vive La Voice
By Alice Shepherd
For The Record
Vol. 21 No. 14 P. 24
Proponents of the technology say clinical documentation via speech recognition builds a more complete patient record and helps drive EHR adoption.
Most physicians appreciate EHRs for their ability to present information in an easy-to-access format. However, the idea of inputting patient information themselves, effectively being turned into data entry clerks, doesn’t sit well. Another reason frequently given for poor EHR adoption rates is that templates and drop-down menus drive the patient interaction rather than serving as tools. Physicians report that the prestructured responses and choices actually change or limit how they question patients: their questioning comes to mimic the technology rather than flowing naturally to capture the most actionable information for diagnosis and treatment.
To get around these problems, some are singing the praises of speech recognition technology because it allows physicians to interview patients in their customary manner and then dictate reports in free-form narrative. Two basic types of speech recognition technology are available: real-time (front-end) and background (back-end) systems.
Front-end speech recognition occurs in real time during the documentation process. The physician logs into the EHR system or opens a Microsoft Word document and dictates into a microphone or a headset. As he or she speaks, the words the speech recognizer hears appear on the screen instantaneously. When the dictation is finished, the physician reviews the words on the screen, makes any corrections, signs the document, and files it in the EHR.
“Front-end speech recognition is a one-step process: dictate, review, sign,” says Keith Belton, senior director of product marketing for Nuance, maker of Dragon Medical and several other speech recognition technologies. “However, some physicians prefer to focus on patients rather than splitting their time between patient and computer. For them, there is background, or back-end, speech recognition. Physicians can see patients and then dictate into a traditional wall telephone, a digital recorder, or directly into the EHR, which captures the voice file and may put a marker in the EHR to alert the transcriptionist where the typed text should go. The voice file then runs through a speech recognizer, which produces a first-pass transcription. A transcriptionist edits the draft and sends it to the physician for review, correction, and signature before it is uploaded into the EHR. Background speech recognition improves the productivity of transcriptionists 50% to 100%.”
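The back-end workflow Belton describes can be sketched as a simple pipeline. All function and field names below are hypothetical, chosen only to illustrate the stages; real products such as Dragon Medical or eScription expose entirely different interfaces.

```python
# Illustrative sketch of the back-end (background) speech recognition
# workflow: record -> machine draft -> transcriptionist edit -> physician sign.
from dataclasses import dataclass

@dataclass
class Dictation:
    voice_file: str          # e.g., path to a recorded voice file
    draft: str = ""          # first-pass machine transcription
    status: str = "recorded" # recorded -> drafted -> edited -> signed

def run_speech_recognizer(d: Dictation) -> Dictation:
    # First pass: the recognizer turns the voice file into draft text.
    d.draft = f"[draft transcribed from {d.voice_file}]"
    d.status = "drafted"
    return d

def transcriptionist_edit(d: Dictation) -> Dictation:
    # A transcriptionist corrects recognition errors in the draft.
    d.status = "edited"
    return d

def physician_sign(d: Dictation) -> Dictation:
    # The physician reviews, corrects, and signs before EHR upload.
    d.status = "signed"
    return d

d = physician_sign(transcriptionist_edit(run_speech_recognizer(Dictation("visit_001.wav"))))
print(d.status)  # signed
```

The key point of the design is that the recognizer only produces a draft; a human edits it, and the physician remains the final gate before the document enters the record.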
Front-End Speech Recognition — Even for Busy EDs
Miami Valley Hospital and two other hospitals of Premier Health Partners in the Dayton, Ohio, area are currently using Dragon Medical front-end speech recognition technology. Brian Zimmerman, MD, a physician in Miami Valley’s level 1 trauma center, says when Dragon Medical was integrated with the organization’s EHR, emergency department (ED) transcription costs went from $1.4 million per year to zero.
Unfortunately, a number of transcriptionists lost their jobs, but others were retained as chart readers to review the accuracy of physician documentation. “Errors can occur when physicians don’t talk directly into the microphone, don’t pay attention to what they’re saying, or just don’t talk clearly,” explains Zimmerman. “Transcriptionists identify the errors and return the document to the physician to correct and sign off. Corrections are then made by the physician in the electronic health record.”
Speech recognition is so accurate that Zimmerman can dictate 30 to 40 charts with only one or two errors. “The technology is ready for prime time,” he says. “We’re a busy emergency department, approaching about 100,000 visits a year, and we haven’t missed a beat with Dragon Medical.”
The efficiency of speech recognition can be further enhanced through the use of macros, or subroutines within the software that extract information from the EHR. “They utilize information that already resides in EHRs as much as possible,” says Zimmerman. “For instance, when I use the command ‘insert back template’ for a patient with a back injury, the speech recognition software calls up the requested template from the EHR and inserts it into the dictation. Using macros, I can prepopulate my dictation with the patient’s medical history, social history, family history, and the like.”
Zimmerman and a colleague served as “physician champions” to facilitate the integration of Dragon Medical within the Epic EHR. They designed a large number of macros, each with synonyms (eg, “insert back template,” “insert back strain template,” “insert lower back template”). “Regardless of which synonym the physician uses, the speech engine recognizes the macro and inserts the proper template from the EHR,” Zimmerman says.
They also created reminder cards that list the basic template names and commands to serve as handy reference guides for physicians. A subset of macros are “normals,” which instantly import normal elements into the document, such as physical exams and the like. “I can say ‘insert normal review of systems,’ and the software will type out a complete normal review of systems that has been preconfigured in that particular Dragon Medical command,” says Zimmerman.
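The macro mechanism Zimmerman describes amounts to a two-step lookup: several spoken synonyms resolve to one canonical macro, and the macro expands to preconfigured template text pulled from the EHR. The sketch below uses invented command names and template text purely for illustration; real Dragon Medical commands are configured inside the product.

```python
# Toy model of spoken-command macros with synonyms and "normals."
SYNONYMS = {
    "insert back template": "back",
    "insert back strain template": "back",
    "insert lower back template": "back",
    "insert normal review of systems": "normal_ros",
}

TEMPLATES = {
    "back": "[back-injury exam template pulled from the EHR]",
    "normal_ros": "Review of systems: all systems reviewed and normal.",
}

def expand_macro(spoken_command: str) -> str:
    """Resolve a spoken command to its template text, or echo it back."""
    key = SYNONYMS.get(spoken_command.lower())
    return TEMPLATES[key] if key else spoken_command

# Any synonym yields the same template:
assert expand_macro("insert back strain template") == expand_macro("insert lower back template")
```

Registering every phrasing a physician is likely to say against the same canonical macro is what makes the commands feel natural rather than memorized.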
After the successful rollout, several other hospitals asked for implementation help. Eventually, while still practicing full time in the ED, Zimmerman and his colleague started a consulting company, Physician Dragon Consultants, which helps other hospital systems and medical practices integrate Dragon Medical into their EHR systems.
Speech recognition technologies accommodate different physician dictation preferences. Like many of his colleagues, Zimmerman prefers not to dictate in front of patients, while others believe it provides an opportunity for patients to correct information before it goes into the record. “Some physicians in our group come out of the room after seeing a patient and dictate just the history of present illness section,” Zimmerman explains. “That helps them remember the patient when they return at the end of the day to complete the dictation and chart.”
Zimmerman dictates directly into the EHR’s note window using a handheld microphone or Bluetooth headset. Another option available with Dragon Medical is using a digital voice recorder, which plugs into a PC to have the voice files transcribed using speech recognition software.
“Physicians love speech recognition technology,” says Zimmerman. “It preserves their traditional method of documentation while making it much more efficient. We’ve dictated close to 300,000 charts in the ED. Premier Health Partners’ Good Samaritan Hospital in Dayton and the Atrium Medical Center in Middletown, Ohio, have followed our example.”
Back-End Speech Recognition at Prevea Health
Prevea Health physicians utilize a multifaceted EHR that includes full dictation, partial dictation, and templating. Some physicians disliked EHR templates because they made all patient information look the same. “Most physicians like to have personalized information on their patients. Templates made reports look like a computer-generated program,” says Monica Van, Prevea’s health information manager.
At Prevea, physicians use eScription background speech recognition technology to dictate directly into the EHR system via headsets or, when they are off site, digital handheld recorders. Some dictate in front of patients and others from their offices. Each dictation then travels to the voice recognizer as two files: a WAV file containing the voice and a metadata file containing associated information. A transcriptionist edits the documentation, makes sure SmartLinks (similar to macros) are correctly linked to pertinent EHR information, and sends the document back to the physician, who receives a notification that the transcript is ready for review and approval. The physician checks it, makes any edits, and signs it. All physicians are required to review and approve documentation before it becomes part of the patient’s permanent health record.
Once the transcript is finalized and approved, eScription sends it to the EHR system using headers that tell the interface where to insert it. The final corrected transcript is also used to train the voice recognizer. “eScription has integrated wonderfully with our EHR,” says Van. “We are now exploring speech recognition for our diagnostic services.”
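The header-driven filing step can be pictured as a small routing function: the approved transcript carries headers identifying the patient and the note section, and the interface uses them to place the body in the right slot. The field names here are illustrative assumptions, not eScription's actual interface.

```python
# Hypothetical sketch of header-driven routing of an approved transcript
# into an EHR. The "EHR" is a toy dict keyed by medical record number.
ehr = {"MRN-1001": {}}

def file_transcript(ehr: dict, headers: dict, body: str) -> None:
    # Headers identify the destination; the body is the approved text.
    ehr[headers["patient_id"]][headers["section"]] = body

file_transcript(
    ehr,
    headers={"patient_id": "MRN-1001", "section": "History of Present Illness"},
    body="Patient presents with three days of lower back pain.",
)
assert "History of Present Illness" in ehr["MRN-1001"]
```

Separating the routing metadata from the narrative body is also what lets the final corrected text be fed back to retrain the recognizer without disturbing the filing logic.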
For transcriptionists, the workflow changes of automated transcription were significant. For one, typing and editing require two entirely different thought processes. At first, the transcriptionists were wary about working with the drafts generated by eScription, but now they have become “spoiled” as the documents have improved over time. “It’s amazing how the drafts come through with the correct punctuation, numbered lists, and paragraph breaks—things which are not dictated or spoken by the physicians,” says Van. “However, because the drafts never have a misspelled word, it’s up to the transcriptionists to catch words that are used incorrectly in context. Another challenge was for transcriptionists to feel productive without having their fingers flying over the keyboard. While their fingers are still placed on the keyboard, they only spring to action when an edit is necessary.”
eScription allowed Prevea Health to mimic its established workflow while increasing efficiency. Ninety-six percent of its work now goes through the speech system. “We used to have transcriptionists working 50 to 60 hours a week; now we have no overtime, and we no longer outsource any transcription services,” says Van. “In the one year since going live with eScription, we have increased our productivity by 133% and reduced our turnaround time from over five days to less than 24 hours. It’s been fabulous for morale, and our estimated three-year cost savings are going to be over $1 million.”
The Next Evolution
Speech recognition delivers significant productivity improvements and eliminates the problem of forcing physicians to think and question patients according to EHR templates and drop-down menus. However, the technology faces one major hurdle. “It cannot convert the free-form narrative produced by clinicians into structured information that can be data mined and queried by clinical systems,” says Nick van Terheyden, MD, chief medical officer at M*Modal. “This also means coding and billing cannot be done automatically. Coders and billers have to review and code charts in the traditional way.”
That’s where speech understanding comes in. It not only listens and transcribes, but it actually understands what the physician or clinician is saying and converts it into a structured document. It attaches semantically interoperable tags and values to the information so that computers can read it without human intervention.
“A traditional speech recognition engine chops the text into phonemes, the smallest distinctive constituents of language, and then reassembles it based on a grammar,” van Terheyden explains. “That’s not how people understand each other when they have a conversation. Actually, each person continually draws on additional information that is not included in the conversation but is needed for understanding. That’s essentially what speech understanding does. It doesn’t discard the information once it has understood the words but continues to concatenate the material. It integrates other information it ‘knows’ to contribute to a full understanding, such as previous reports on the patient, demographic information, and clinically relevant material.”
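The practical difference van Terheyden describes can be shown in miniature: recognition yields a flat string, while understanding yields structured, tagged values a computer can query. The tagging below is hard-coded for illustration; it stands in for what an understanding engine would produce, not how one works.

```python
# Toy contrast between recognition output (flat text) and
# understanding output (semantically tagged structure).
recognition_output = "Patient is a 54-year-old male with lower back pain."

understanding_output = {
    "age": {"value": 54, "unit": "years"},
    "sex": "male",
    "chief_complaint": {"text": "lower back pain", "body_site": "lower back"},
}

# The structured form can be queried directly by clinical systems;
# the flat string would first need human (or NLP) interpretation.
assert understanding_output["age"]["value"] == 54
```

It is this tagged form that downstream systems can data mine, and that coding engines can consume without a coder rereading the narrative.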
M*Modal’s speech understanding technology derives from the National Security Agency’s (NSA) work listening to radio signals and telephone conversations. “The NSA cannot influence the people to whom they’re listening to speak more clearly, use macros, or pronounce things a certain way,” says van Terheyden. “Our approach is based on the millennia of refinement that nature has applied to how we understand human language.”
While documents transcribed by speech recognition systems cannot be processed by coding engines, text produced with speech understanding is structured and tagged to facilitate coding. “Speech understanding bridges the chasm between the free-form narrative of dictation and the need for structured data,” says van Terheyden. “It generates output in clinical document architecture under the Healthstory [www.healthstory.com] standard for sharing information between multiple systems.”
A similar emerging technology is Northrop Grumman’s Natural Language Processing, which will review dictated or typed text and extract discrete elements. “When it becomes ready for the mainstream, it will revolutionize how coding and billing are done,” says Zimmerman. “It will negate the argument that speech recognition cannot produce documents with discrete elements.”
— Alice Shepherd is a southern California-based business-to-business journalist specializing in healthcare topics.
Don’t Scrimp on Sound Input Devices
Speech recognition technologies have the potential to provide huge savings in transcription costs, but cutting corners on microphones will decrease reliability and negate some of those benefits. To take full advantage of today’s accurate speech recognizers, experts say it is wise to invest in high-quality sound input devices. According to industry reports, healthcare organizations have had success with the following devices:
• Dragon Medical’s headset with noise cancellation technology;
• Nuance’s PowerMic2 digital handheld microphone, programmable to execute EHR commands;
• Olympus digital voice recorders and microphones;
• CyberAcoustics headsets and microphones; and
• Plantronics Bluetooth headsets.
For more information regarding the effectiveness and accuracy of microphones, visit http://support.nuance.com/compatibility/default.asp.