October 16, 2006
Physician Persuasion: How to Sell Speech Recognition
By Robbi Hess
For The Record
Vol. 18 No. 21 P. 18
Doctors are known to be creatures of habit, averse
to changes in their routines. However, it may be possible to convince
them that speech recognition can be beneficial.
Speech-recognition technology has the potential to reduce
the time physicians spend writing patient notes; more importantly, it
can shorten the turnaround time between dictating a note and having
the finished note in hand.
Doctors need to know that speech-recognition technology
can improve patient care because—through the use of front-end
speech recognition—records can be available almost immediately.
Back End, Front End, or a Marriage of the Two
According to Nuance Communications, Inc. Senior Advisor Don Fallati,
offering a solution that accommodates individual physician styles is
key to integrating speech recognition and winning its acceptance.
“A hybrid model for dictation is the way to go,”
he explains. “Giving the physician a choice on front end, back
end [background], or a combination of the two makes for an easier adoption
ratio. There will be a bell curve of users that aren’t so good
at the technology and those who may not be good candidates for speech
recognition in general.”
As with any technology, there will be those who readily
embrace it while others will balk at a change in routine.
“With some of Nuance’s core products, the
user can bypass transcription completely and that could be the primary
mode of dictation—they would edit their text themselves,”
Fallati says. “For others, it doesn’t make sense for them
to be doing their own editing and you would want that dictation sent
to the transcriptionist for completion.”
The choice is up to the physician, the facility, and
the HIM department. “When introducing a new technology, you need
to offer flexibility to suit various physician styles,” Fallati
says.
In front-end speech recognition, the provider dictates
into a system that displays the words shortly after they are spoken,
and the dictator is responsible for correcting misrecognitions and signing
off on the document. In this system, the document never goes through
an editor.
Back-end speech recognition is the process by which
the clinician dictates into a digital dictation system, the voice is
routed through a speech-recognition machine, and the draft document
is routed, along with the original voice file, to a medical
transcription editor who verifies the accuracy of the draft, finalizes
the report, and forwards it for signature.
Lee Stephen, a programmer at Custom Speech USA, says
back-end speech recognition should be easy for any physician with dictation
skills. “The only difference is that the audio file is now transcribed
first by a machine, not by a person,” he explains. “Front-end
SR [speech recognition] is more difficult for some physicians [whose]
word error rate is high and many corrections are required. For other
physicians, where there is low error rate, there should be less resistance
to adoption of speech recognition. However, it should be remembered
that even a 5% error rate means an error about once every two sentences.
If there is a short report with a dozen sentences, that still means
that the physician may have to spend several minutes correcting the
errors. For a busy physician committed to patient care, that is still
a distraction.”
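Stephen's arithmetic can be checked with a short calculation. The sketch below is illustrative only: it assumes an average of about 10 words per sentence, a figure the article does not state, and the function names are hypothetical.

```python
# Rough check of the "5% error rate" arithmetic quoted above.
# Assumption (not from the article): sentences average about 10 words.
WORDS_PER_SENTENCE = 10


def sentences_per_error(word_error_rate, words_per_sentence):
    """Average number of sentences dictated per misrecognized word."""
    return 1 / (word_error_rate * words_per_sentence)


def expected_errors(word_error_rate, words_per_sentence, sentences):
    """Expected number of misrecognized words in a whole report."""
    return word_error_rate * words_per_sentence * sentences


# A 5% word error rate: roughly one error every two sentences.
print(sentences_per_error(0.05, WORDS_PER_SENTENCE))

# A short report of a dozen sentences: roughly six corrections to make.
print(expected_errors(0.05, WORDS_PER_SENTENCE, 12))
```

Under that assumption, a dozen-sentence report carries about six misrecognitions, which is consistent with the "several minutes" of correction Stephen describes.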
Stephen believes physicians are most concerned that
editing speech recognition errors will take time away from dealing with
patients and other medical issues. “Consequently, a server-based
system, where the speech recognition prepares a ‘rough draft’
for editing and correction by a transcriptionist, makes a lot of
sense,” he says.
Front-end speech recognition, Stephen says, is useful
for completion of short reports in STAT settings where there is minimal
correction required by the physician and/or there is a need for rapid
turnaround time.
“Where the report is longer or required turnaround
time is longer, there is less need for front-end speech recognition,”
he says. “Back-end speech recognition can be used, relying upon
the transcriptionist editor to create a polished draft after correcting
misrecognitions.”
Fallati says that, from a physician’s perspective, front-end
speech recognition allows for self-editing, which may be the best
option to offer. “But, the easiest first step for implementation
is the back-end model. It really doesn’t change the everyday dictation
methods a physician is accustomed to and all a hospital has to do is
bring in good transcriptionists,” he says. “Many times with
the back-end programs, the physician isn’t even aware that he
is utilizing a speech-recognition program.”
Front-end and background speech recognition have different
strengths, according to eScription Director of Marketing Lauren Richman.
“Front-end is well-suited for certain specialties
where physicians wish to do the editing of their documents themselves,”
she explains. “Background speech recognition is well-suited for
[an] enterprisewide application where the organization and its physicians
wish to increase transcription productivity and reduce costs overall,
with minimal change to clinicians’ behavior.”
Making the Sale
While physicians are increasingly being sold on speech recognition,
they still need to see “what’s in it for them” because
they view the technology from a different perspective than that of the
healthcare facility.
There are those clinicians who are clamoring for the
technology because vendors have told them it will replace transcription.
But those in the industry know that if the technology demands even a
minute of extra time and effort from doctors, they will balk at it.
There has to be a seamless transition for the technology to be accepted.
Even with the technology, Fallati says, a two-minute
dictation is still a two-minute dictation.
“What needs to happen is that software needs to
accommodate narrative detail in a more streamlined, contemporary fashion
than plain old dictation and typing,” he explains. “We can
embed data mining tools that can read narrative text, extract info,
and that will reduce the amount of time a physician spends in documentation.”
There are technologies available that allow a clinician
to speak a “trigger” word that could potentially generate
a paragraph of narrative. “A lot of doctors say, ‘My documentation
is very consistently similar,’ and with the new technologies,
they may be able to append a narrative piece on top of an existing template,”
Fallati says.
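The “trigger” word idea can be pictured as a simple text substitution. The sketch below is a hypothetical illustration, not how any vendor's product works: the trigger phrase, the stored paragraph, and the function name are all invented placeholders.

```python
# Hypothetical sketch of trigger-phrase expansion.
# The template text below is a made-up placeholder, not clinical content.
TEMPLATES = {
    "insert standard follow-up": (
        "The patient was advised to return for follow-up as scheduled "
        "and to call the office with any questions."
    ),
}


def expand_triggers(dictated_text, templates):
    """Replace each known trigger phrase with its stored paragraph."""
    expanded = dictated_text
    for trigger, paragraph in templates.items():
        expanded = expanded.replace(trigger, paragraph)
    return expanded


note = "Examination was unremarkable. insert standard follow-up"
print(expand_triggers(note, TEMPLATES))
```

A real system would have to recognize the spoken trigger reliably and let the physician edit the inserted paragraph; this sketch only shows the substitution step.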
Discharge summaries are sometimes the bane of a physician’s
existence because they require redictating material that was already
dictated. New speech-recognition technologies could reprocess the
history, integrate it with the discharge summary, and potentially
save the physician hours of dictation time.
“You need to give flexibility on the user side
in order for adoption to be mastered,” Fallati explains. “You
can’t force-feed a one-size-fits-all system into a facility. The
physician has to be shown how these technologies benefit them and how
it will reduce their behind-the-scenes time when they are not with patients.”
Richman says physician reluctance often stems from a
concern for quality of care and an aversion to learning how to use a
new system. To sell physicians on the technology, she recommends demonstrating
accuracy, consistency, and speed when addressing quality-of-care issues.
“Physicians are often very busy and see the time
requirements of retraining unacceptable,” she says. “If
you are recommending a new system, develop a plan that makes learning
how to use the system as simple and as quick as possible.”
Back-end speech-recognition solutions can be installed
nearly transparently, Richman says. “Physicians can begin using
the new system without any retraining at all,” she notes.
Necessary Skills
According to Fallati, with the technology doctors are using today,
probably little restructuring would be needed.
What about those physicians who speak heavily accented
English? Fallati says that if their voice is difficult for a speech
program to recognize, it would likely be just as difficult for a transcriptionist
to understand. “Truly, though, a heavily accented voice is no longer
the disqualifier that it might have been in the past. The technologies
are more adaptable,” he says.
The obvious benefits of speech-recognition technologies
are the streamlining of turnaround time and the flexibility that can
be built into the software. “The programs can be tailored to mesh
with specific disciplines and key words can ‘cut and paste’
phrases or paragraphs from one document to another,” Fallati explains.
“If a physician is given some ‘at the elbow’ training
time, they will likely be comfortable with the new technology and adapt
rather quickly.”
According to Fallati, radiologists are heavy users of
front-end speech-recognition systems. “They are accustomed
to this technology,” he explains.
For back-end, server-based recognition, physicians can
employ standard dictation skills. Stephen says transcriptionist editors
need to be aware of the differences between machine and human transcription.
For example, a transcriptionist may make spelling errors, but the machine
never does. It always spells the word correctly, but the problem is
that it has recognized the wrong word; for example, transcribing “art”
instead of “heart.” In addition, these misrecognitions are
random.
“A back-end, server-based system needs a way to
display potential random errors and flag them for review by the editing
transcriptionist,” he explains. “One approach is to compare
synchronized output of different speech engines, making available the
audio-linked session files for each engine. The transcriptionist can
listen to the audio to determine whether there is a misrecognition that
needs correction. So the process emphasizes ‘word-check’
rather than ‘spell-check.’”
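The comparison Stephen describes can be sketched as a word-level diff between two drafts. This is a minimal illustration under stated assumptions: the two draft strings are invented, the function name is hypothetical, and the audio linking he mentions is not modeled. It uses Python's standard difflib module rather than any vendor's actual method.

```python
from difflib import SequenceMatcher


def flag_disagreements(engine_a_text, engine_b_text):
    """Return (word_index, engine_a_words, engine_b_words) for every
    place where the two engines' drafts differ."""
    a_words = engine_a_text.split()
    b_words = engine_b_text.split()
    matcher = SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    flags = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            flags.append((
                a_start,
                " ".join(a_words[a_start:a_end]),
                " ".join(b_words[b_start:b_end]),
            ))
    return flags


# Invented drafts of the same dictation from two hypothetical engines.
draft_a = "the patient has a history of art disease"
draft_b = "the patient has a history of heart disease"

for index, a_version, b_version in flag_disagreements(draft_a, draft_b):
    print(f"word {index}: '{a_version}' vs '{b_version}' (check the audio)")
```

A disagreement only shows that at least one engine is wrong, so the transcriptionist still has to listen to the audio. If both engines make the same mistake, this approach flags nothing.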
Richman says systems can be designed so physicians can
use them with little to no retraining.
“However, if a physician is looking to improve
his or her dictation—in order to improve the quality of the first
draft document produced by background speech recognition—the most
important skill that providers must learn is to organize their thoughts,”
she explains. “Speech recognition has advanced to the point where
as long as they don’t slur or mumble their words, physicians don’t
have to change the way they speak for the background speech recognizer
to understand them.”
Richman also says physicians should provide all the
demographic information at the beginning of the dictation and group
their information based on the formatting standards of the organization.
“Jumping back and forth between demographics,
diagnosis, medication, etc., can reduce the quality of the draft produced
through background speech recognition and also slow down the review/editing
process by MTs [medical transcriptionists],” she says. “To
continually improve, physicians should review the finished documents
to see how their dictations have been changed and ordered to best understand
the documentation standards of their facility.”
In the End
It all comes down to physician habits and the disciplines in which they
work.
“Some disciplines are more conducive to speech
recognition,” Fallati says. “Surgeons, for example, tend
to walk out of the operating room, pick up a telephone, and start dictating
the notes of the procedure. It may not work for them to have to walk
down a hall, sit at a computer, and dictate notes. But others, like
physical therapists or another practice that isn’t well-supported
by the hospital’s infrastructure, are really showing readiness
for this technology. They are the ones right now who are lugging home
paperwork, dictating, editing notes themselves—speech-recognition
software would be beneficial for them.”
— Robbi Hess, a journalist for more than 20
years, is a writer/editor for a weekly newspaper and a monthly business
magazine in western New York.