May 15, 2006
Speaking of Savings: Can Speech Recognition Deliver?
By Elizabeth S. Roop
For The Record
Vol. 18 No. 10 P. 24
Are the days
of wild expectations and unfulfilled guarantees in the past? Perhaps,
if organizations take the correct approach when implementing the technology.
When John Avedian first proposed implementing speech-recognition
technology to streamline the HIM department at Maine Medical Center
seven years ago, he had little by way of statistics or industry examples
to support his case. Instead, Avedian, the HIM director at the 606-bed
teaching facility and research center, needed to deploy a small pilot
project to support his claims that speech recognition would not only
work, but would result in significant cost savings for Maine’s
largest hospital.
“The goal of the pilot was, ‘Would it work?’
Would we see that the medical transcriptionist became more productive?
We had to validate that it would even work,” he says. Today, however,
“there are so many success stories with speech recognition at
this point, I don’t think a pilot is even necessary.”
Indeed, Maine Medical Center is one of those success
stories. In 2001, following the pilot program, Avedian began the implementation
of EditScript from eScription, Inc. to process dictations from the facility’s
1,100 physicians and to let in-house and remote transcriptionists, as well
as an outsourced transcription service, edit the resulting text files.
Within three years of the enterprisewide rollout, Maine
Medical Center had realized an estimated $1 million in savings, including
a 74% productivity gain for the in-house staff that resulted in a reduction
in the use of outside transcription services. Today, approximately 81%
of the facility’s dictation—which produces approximately
10 million lines per year—is speech recognized.
“The reason we started this was because we really
believed that there was technology that could make the medical transcriptionist
more productive. We didn’t set out to replace the medical transcriptionist,
we set out to explore ways to help them become more productive,”
says Avedian. “As staff becomes more productive, you need less
staff and you send less work out. We’ve converted some of our
resulting vacancies from attrition by moving them to other areas of
the department where we’ve needed help. In addition, we’ve
been less susceptible to and less pressured by the shortage [of transcriptionists].”
Best of all, Avedian adds, the improvements are “continuous
because the speech recognition engine becomes better and the transcriptionists
become better. It’s an evolving thing.”
“It is important to view speech recognition as
a productivity improvement tool rather than a total eliminator of human
labor,” notes Dale Kivi, vice president of business development
at CyMed, Inc. “We had reduced annual department expenses for
some of our clients by more than $1 million prior to having a speech
recognition option, so a portion of the savings experienced by providers
who switch from a large in-house direct labor approach to a speech-recognition
approach need to be credited to the inherent process improvements that
come with such end-to-end technology solutions.”
CyMed is the largest medical transcription service organization
(MTSO) to have incorporated the Nuance Dragon Speech Recognition engine
as an option under its outsourced service offerings. “In our
approach, it’s all about providing a solution that is better,
faster, and cheaper than what has been done—and speech-recognition
technology is clearly capable of delivering quantifiable results,”
Kivi says.
Significant Savings Potential
The market’s enthusiasm for speech-recognition technology is mirrored
by a growing portion of the HIM population. A 2004 KLAS survey of HIM
directors and managers showed that 26% were already using speech recognition.
Further, according to the 2005 HIMSS Leadership Survey, another 59%
of health information technology (HIT) executives plan to implement
the technology within the next two years.
The reason for speech recognition’s growing popularity
among HIT executives is simple: The potential for savings is enormous.
According to the Medical Transcription Industry Association
(MTIA), medical transcription is a $22 billion industry with more than
$6 billion spent on labor alone. “If speech recognition can improve
productivity by even a modest 25%, a $1.5 billion savings could be realized,”
says Kerry Waltrip, senior vice president of product and sales strategy
with SoftMed Systems, Inc., which offers a wide range of document creation,
management, and distribution solutions.
As an example, he points to one SoftMed client that
was able to eliminate the need for transcription in its radiology department.
“As a result, the facility is now saving approximately $28,000
a month in transcription costs,” Waltrip says.
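The arithmetic behind Waltrip's figures can be sketched in a few lines. This is an illustration only: the $6 billion labor figure, the 25% gain, and the $28,000 monthly saving come from the article, while the function names and the second reading of "25%" are assumptions added here. The quoted $1.5 billion treats the 25% as a direct cut in labor spend; if the 25% instead means output per hour rises by a quarter, the same volume of work needs 1 / 1.25 of the labor, which gives a somewhat smaller saving.

```python
LABOR_SPEND = 6_000_000_000  # annual labor spend attributed to MTIA, in dollars
GAIN = 0.25                  # the "modest 25%" productivity improvement


def savings_as_direct_cut(labor_spend, gain):
    """Reading 1: the gain is applied directly as a cut in labor spend."""
    return labor_spend * gain


def savings_from_higher_output(labor_spend, gain):
    """Reading 2 (assumption): output per hour rises by `gain`, so the
    same volume needs 1 / (1 + gain) of the original labor."""
    return labor_spend * (1 - 1 / (1 + gain))


direct = savings_as_direct_cut(LABOR_SPEND, GAIN)
output_based = savings_from_higher_output(LABOR_SPEND, GAIN)
print(f"direct cut:    ${direct / 1e9:.2f} billion")
print(f"higher output: ${output_based / 1e9:.2f} billion")

# The radiology example, annualized: $28,000 a month over 12 months.
radiology_annual = 28_000 * 12
print(f"radiology client: ${radiology_annual:,} a year")
```

Either reading leaves the savings in the billions, so the choice between them matters less for the headline than for a facility building its own budget.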
Savings aren’t just realized in reduced hard costs;
increased productivity also comes into play, according to Kulmeet Singh,
senior director of healthcare strategies for Nuance Communications,
Inc., which provides speech and imaging solutions and recently completed
the acquisition of Dictaphone Corporation.
Speech recognition “will erode transcription costs
when it is used as a tool to improve productivity,” he says. “For
some physicians, speech recognition can double or even triple productivity.
In general, a reasonable first-year target is that speech recognition
will improve productivity enough to erode 20% of transcription costs.
When speech recognition is used by physicians—granted only some
physicians will use it—the transcription costs can be reduced
more significantly.”
Hospitals typically realize savings from speech recognition
in three key areas:
• reduced or eliminated need for outsourced transcription
services;
• lower per-line payments to an outsourced transcription
service because it’s faster to edit than to type; and
• reduced in-house labor costs.
Other benefits include faster turnaround times, increased
consistency of medical documentation, and centralization and standardization
of medical records processes.
“An enterprisewide back-end speech-recognition
system increases the productivity of medical transcriptionists by producing
high-quality first draft documents that the medical transcriptionists
subsequently edit instead of typing them from scratch,” says eScription
CEO Ben Chigier.
It’s important to note that the savings potential
is also tied to the type of speech-recognition solution. Front-end solutions—which
are popular in radiology, pathology, cardiology, physical therapy, and
emergency departments—allow providers to create reports instantaneously
with transcription from audio to text in real time, with the physician
editing the results and completing the document in one step.
Back-end, or deferred, speech recognition creates reports
from audio that are then edited by medical transcriptionists. Back-end
solutions, which are transparent to the provider, deliver ready-for-edit
reports to the transcriptionists for validation and formatting.
“Front-end speech offers the biggest impact. Real-time
completion of reports by the dictating physician in one step can generate
an 80% or higher reduction in the cost and turnaround time of the report,”
says Joe Desiderio, chief strategy officer of Voicebrook, which provides
front-end speech recognition integrated directly with a number of information
systems in a variety of report-intensive specialties. “However,
this requires a change in process for the physician. Deferred speech
recognition delivers a lower payback [and] requires change in the transcription
process, but less change to the physician.”
When choosing which version of speech recognition best
fits their needs, healthcare organizations must keep in mind who will
be using the technology.
“Whether a client approaches us with an interest
in front-end or back-end speech recognition, our experience has been
that their ultimate decision is driven by the amount of change their
physicians are willing to accept,” notes Kivi. “To be effective,
front-end solutions require the physicians to dictate in the manner
that the speech recognition engine listens, then directly edit anything
that was not processed accurately. Some physicians love it and can completely
eliminate the transcriptionists from their process with this approach—as
long as the physicians also embrace the role of editor. Other physicians
simply won’t change their dictating habits to become efficient
front-end candidates, let alone take on the responsibility of manually
editing their own reports. Similarly with back-end solutions, it is
where the editing responsibility lands that drives the buying decision
and out-of-pocket process cost. Most of our physicians prefer that we
handle that end of things so they can remain focused on patient care.”
Evaluating the Need
Speech recognition is already prevalent in certain areas—particularly
radiology and pathology. In fact, Desiderio estimates that 10% to 15%
of radiology practices and 20% of pathology practices are currently
using speech recognition, with 25% and 20%, respectively, currently
evaluating a solution for implementation within the year.
However, any healthcare organization could potentially
benefit from implementing some form of speech recognition. As a rule
of thumb, Singh suggests that “if transcription costs are $3,000
per physician FTE [full-time equivalent] or more per year, speech recognition
must be considered.”
And, according to Waltrip, any facility with physicians/dictators
who record at least 60 minutes of dictation per month is a good candidate
for deferred speech recognition, as are “facilities that fail
to meet their documentation turnaround time targets, those that employ
medical transcriptionists but find themselves still needing to outsource,
facilities that forecast increasing dictation volume, and those that
have difficulty recruiting and retaining qualified transcriptionists.”
Specifically, when evaluating whether speech-recognition
technology is a worthwhile investment, organizations need to evaluate
their current direct costs, including transcription costs, time spent
handling charts, and storage costs, as well as indirect costs such as
the impact real-time information can have on patient care, according
to Desiderio.
“Aggressive focus on workflow can and should make
physicians more productive, not less, so factor physician productivity
into the investment analysis,” he adds.
Chigier suggests that the first thing a facility should
do when evaluating the potential of speech recognition is articulate
the organization’s goals so they can be communicated not only
internally, but also to potential vendors. Is the organization expecting
speech recognition to decrease turnaround time, improve quality, or
save costs? Is the goal to enable the transcription department to take
on additional work, or to streamline processes or reduce/increase the
number of outside medical transcription services?
“An enterprisewide speech-recognition system can
help an organization achieve all of these goals, but it is important
for both the organization and the vendor to understand the goals and
the priorities of those goals,” Chigier says. “It is also
helpful to look at other organizations that are using speech recognition
and examine their results. Understanding the benefits from peers can
provide good insight.”
Costs vs. Return
The actual cost of a speech-recognition system depends on a number of
factors, including facility size, number of users, vendors, and whether
the solution is hosted, client deployed, or a hybrid of the two. Training
costs also vary, particularly between front-end and back-end technologies
and with the approach to training.
“Expectations for front-end speech are not in
sync with what it takes to properly implement and support it. The speech
recognition engine is fairly inexpensive; however, the main drivers
of a solution’s cost are workflow, integration, training and support,
as well as management tools for larger environments,” says Desiderio.
“Organizations considering these solutions should make sure their
physicians’ needs are met, and should also budget for incremental
user support and software maintenance annually.”
For example, a single copy of one speech-recognition
software package for medical users is advertised for less than $900 out of the
box, or less than $3,000 in the first year including training, support,
and maintenance. Others place the cost at $4,000 to $10,000 per user
in the first year, with lower costs in subsequent years when only ongoing
support is needed.
“These up-front technology costs are the potential
Trojan horse for providers looking to deliver positive financial results
during their current fiscal year,” notes Kivi. “If the client
is going to replace their technology anyway, then it makes sense. On
the other hand, if your productivity improves by 25% or so across the
board during the first year—which many might consider optimistic—you
could easily end up spending more for the total process even if you
deliver substantial savings on the labor side. From our perspective,
that’s the advantage of relying on an outsourced agency to engage
speech recognition on a selective basis when you know it will deliver
positive results. This business model circumvents the provider’s
up-front technology costs since it is provided by the outsourced agency.
This approach also shifts the responsibility for delivering improved
savings down the road to the agency as the technology improves.”
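Kivi's warning about the first fiscal year can be made concrete with a rough sketch. The 25% improvement and the $4,000 to $10,000 per-user first-year cost range come from the article; the department size (a $1,000,000 annual transcription labor budget and 40 users) is purely hypothetical, and the model simply treats the 25% as a direct cut in labor spend.

```python
def first_year_net(labor_spend, labor_saving_rate, users, cost_per_user):
    """Net first-year result: labor savings minus up-front system cost."""
    savings = labor_spend * labor_saving_rate
    system_cost = users * cost_per_user
    return savings - system_cost


# Hypothetical department, not taken from the article.
LABOR_SPEND = 1_000_000
USERS = 40

for cost_per_user in (4_000, 10_000):  # per-user range quoted in the article
    net = first_year_net(LABOR_SPEND, 0.25, USERS, cost_per_user)
    outcome = "net gain" if net >= 0 else "net loss"
    print(f"${cost_per_user:,} per user: {outcome} of ${abs(net):,.0f}")
```

Under these assumed inputs the same labor savings produce a gain at the low end of the cost range and a loss at the high end, which is the scenario Kivi describes.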
However, the return on investment for a properly selected
and implemented system can be significant—some facilities report
annual savings of $250,000 and higher—and rapid, positive return
on investment (ROI) is often realized within the first year. In fact,
according to Waltrip, deferred speech-recognition systems can have an
almost immediate impact followed by steady increases throughout the
first six months.
“Factors impacting ROI include executive-level
sponsorship, success of change management, and a full understanding
of the cost of traditional transcription at the facility,” he
says. “Depending on the dictation volume, [deferred speech recognition]
results become available within 30 to 60 days. Instantaneous speech
recognition allows for the elimination of some transcription costs so
results will be achieved virtually immediately.”
Other factors that have a direct impact on ROI include
the following:
• transcription costs;
• number of FTE transcriptionists;
• pricing levels from outsourced transcription
services;
• productivity improvements; and
• percentage of all dictations that are edited.
The latter two are critical to measuring ROI, according
to Chigier. “For instance, if the productivity improvement is
100%, but this only happens to 5% of the documents, there is not much
gain,” he says. “Likewise, there is little gain if 100%
of the documentation can be edited, but the productivity improvement
is only 5%. So these numbers work closely together to paint a picture
of what the real ROI will be.”
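Chigier's point that the two numbers only matter together can be sketched with a simple model. It assumes, as an illustration rather than anything the article states, that a productivity improvement of g means an edited document takes 1 / (1 + g) of the original typing time, and that documents outside the edited share are still typed from scratch. The function name and the third scenario, which pairs the two Maine Medical Center figures quoted earlier, are hypothetical; the article does not say those figures combine this way.

```python
def labor_time_saved(productivity_gain, yield_fraction):
    """Fraction of total transcription labor time saved.

    productivity_gain: 1.0 means edited documents are produced twice as fast.
    yield_fraction: share of all dictations that go through editing.
    """
    if productivity_gain < 0 or not 0 <= yield_fraction <= 1:
        raise ValueError("gain must be >= 0 and yield must be in [0, 1]")
    # Edited documents take 1 / (1 + gain) of the original time;
    # the remaining documents are still typed from scratch.
    time_after = (1 - yield_fraction) + yield_fraction / (1 + productivity_gain)
    return 1 - time_after


scenarios = {
    "100% gain on 5% of documents": (1.00, 0.05),
    "5% gain on 100% of documents": (0.05, 1.00),
    "74% gain on 81% of documents (illustrative pairing)": (0.74, 0.81),
}
for label, (gain, share) in scenarios.items():
    print(f"{label}: {labor_time_saved(gain, share):.1%} of labor time saved")
```

In this model both of Chigier's extreme cases save only a few percent of total labor time, while a high gain applied to most of the workload saves far more.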
Finally, it’s important to include indirect costs
in ROI calculations, such as the number of days in accounts receivable
(AR). “If AR days are high due to delayed transcription, this
can be an important indirect cost to consider,” says Singh, adding
that it’s wise to factor the cost of both the speech-recognition
and transcription systems into the equation because “speech recognition
deployments are not successful if they’re not tied to a good transcription
system.”
Maximizing the Investment
Realizing the full financial benefits of speech recognition requires
a long-term commitment to technology and training from physicians,
when appropriate, and from the executive team to overcome early resistance
to the system.
“There’s a reason why nobody has an ‘I
heart change’ bumper sticker; nobody likes to change how they’ve
always done it,” Desiderio says. “However, with minimal
change, cost and turnaround improvements are compelling. There has been
some overpromising and underdelivering with speech recognition in the
past. However, for an organization that is interested, the technology
is now mature enough that it can be implemented in the right environments.
Besides the technology readiness, it’s critical to have strong
management and a set of representative users who can demonstrate the
effectiveness of a proper solution.”
It’s also important to evaluate the system’s
ongoing performance, says Chigier.
Among the metrics to measure are the following:
• decreased turnaround time;
• cost savings/cost avoidance;
• productivity gains;
• yield (the percentage of transcription work
that is processed by the system);
• document quality;
• document consistency in headings and body copy;
and
• clinician satisfaction.
Monitoring system performance provides ongoing validation
of the system’s effect on costs and productivity. Beyond that,
initial resistance can often be eliminated or reduced
by carefully matching the solution to all users, providing appropriate
ongoing training, and sharing success stories from other organizations.
For example, Maine Medical Center opted for a back-end
system because physicians would not be affected and also because the
goal was to enhance productivity, not replace medical transcriptionists,
says Avedian.
The facility also acknowledged that implementing speech
recognition would impact workflow and require a change in the way transcriptionists
performed their jobs. It would require not only initial training, but
also continuous retraining to ensure the system would gain maximum results.
“It’s a long-term investment. This isn’t
just an application; it revolutionizes how you do your medical transcription
work,” says Avedian. “You can’t just install and leave
it. It requires workflow analysis, training and retraining, and a commitment
from people to do their work differently.”
As long as the commitment is there from every level—including
the executive team and the IT department—implementing a speech-recognition
system can be successful.
“When I’m reducing my outside transcription
budget by half year after year, there isn’t a lot for anyone to
have difficulty with,” Avedian says. “Once you’ve
committed yourself financially, technologically, and emotionally to
do this, you’ll see the benefits. It’s here. It’s
a viable solution. Learn about it; embrace it. Don’t run away
from it.”
— Elizabeth
S. Roop is a Tampa, Fla.-based freelance writer specializing in healthcare
and health information technology.