For The Record: Transcription Technology Wars

November 23, 2009

Transcription Technology Wars — Navigating the ASP/ASR/ARRA Battlefield
By Dale Kivi, MBA
For The Record
Vol. 21 No. 22 P. 10

Choose an ill-fitting business model and healthcare providers may become economic casualties, according to an industry insider.

Why have new equipment sales by previously dominant transcription technology leaders such as Dictaphone and Lanier all but faded off the face of the planet in favor of thin-client application service provider (ASP) and automatic speech recognition (ASR) offerings? And potentially more momentous, how has this evolution away from server-based technology helped drive a growing shift toward outsourced services that many believe has finally passed the 50% mark of total transcribed volume for U.S. healthcare providers?

But before jumping on the proverbial virtual technology bandwagon, exactly what do these latest ASP/ASR transcription platforms bring to the table? Some of the new approaches carry inherent dangers that may actually threaten the integrity of the documentation process if not properly managed. Can they really help healthcare providers demonstrate “meaningful use” and therefore qualify for funding through the American Recovery and Reinvestment Act of 2009 (ARRA)? Everyone wants free government money for an IT upgrade, but those limited funds do no good if your ongoing costs explode and the quality of the documentation suffers.

Most of the sales pitches for technology-based solutions seem to begin and end with lower labor costs, faster turnaround times and, in some cases, a potential path to a slice of the ARRA pie. Are all ASP/ASR solutions really cheaper for the entire documentation production and distribution life cycle? And if some aspects of these schemes are indeed too good to be true, where do the problems show up, and how can the quality risks and complete life cycle costs be properly managed?

This article explores the ASP/ASR battle for transcription market supremacy and how it often pits HIM and transcription professionals against their HIT counterparts, driven by expectations of lower total solution costs and the prospect of ARRA funding.

Regardless of how similar the pitches may sound, not all strategies and business models are the same, and that’s where the true war is being waged. Place your bets on the wrong business model and you may indeed win the battle but lose the war. After all, once the government funding windows have closed, it will be how well your collective teams have come together to orchestrate sustainable cost, quality, and turnaround time improvements in support of patient care that will determine the true winners.

The Fall of Traditional Hardware Vendors
The widespread corporate mantra used to be, “No one ever got fired for buying from IBM.” Yet, the corporate giant abandoned its desktop computer product line (which, at least for a while, had been the overwhelming market share leader). Why? It wasn’t because IBM was no longer capable of providing reliable, technically competitive products but because its competitors were able to satisfy the market’s needs at a much lower price. IBM’s personal computer market share plummeted, and it eventually conceded the inevitable: Others could satisfy buyers at prices it simply could not match due to its inherent business model structure.

Server-based dictation/transcription technology vendors are now traveling a similar path. Vendors such as Dictaphone and Lanier have long offered competitive products with strong reputations. They firmly stand behind their equipment, introduce strong enhancements, boast dedicated and professional field representatives—the works. What could possibly threaten such a pair of entrenched market leaders?

Maybe it’s the proprietary hardware, software, service contracts, and planned obsolescence-based business models. These former dominant industry giants (along with a laundry list of lesser-known hardware-based vendors) are experiencing rapidly shrinking market shares in favor of competing ASP and ASR offerings. It’s not that the boxes no longer work; it’s that the server-based business model has become obsolete.

Case in point: Since the respective buyouts of Dictaphone and Lanier, their new parent companies have shifted their focus to ASP/ASR solutions. Nuance (owners of Dictaphone) features the eScription platform, while MedQuist (owners of Lanier) offers DocQment Enterprise. The most telling aspect of these new strategies is that both ASP/ASR platforms were purchased by the respective parent companies after they already owned the server-based businesses.

A possible reason behind such investments is that buyers grew tired of paying for field engineers to visit and install upgrades when those upgrades can now be delivered over the Internet through remote-access products such as pcAnywhere. (Not long ago, Dictaphone and Lanier both had hundreds of field reps across the country on their payrolls.) Or possibly it was the growing market rejection of proprietary, premium-priced branded hardware when generic Hewlett-Packard, Dell, or other servers are more than capable of equivalent processing power and data storage (at more affordable prices and with more qualified field service representatives available than any transcription firm could dream of maintaining).

Such market-changing dynamics have made it obvious to more buyers that it no longer makes sense to spend up-front capital equipment money when ASP/ASR models offer parallel technology for about the same cost as the traditional servers’ ongoing annual service contracts (usually 20% of the overall system purchase price). Owning the technology adds no value. The ASP pay-as-you-go business model is cheaper, is constantly maintained at the most current software version the vendor offers, and never requires additional capital equipment dollars from clients.
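
As a rough illustration of that comparison, the following Python sketch totals the two models over several years. The purchase price is a hypothetical figure, and the ASP fee is assumed to equal the 20% annual service contract, as the paragraph above suggests; actual quotes will differ.

```python
def server_total_cost(purchase_price, years, service_rate=0.20):
    """Capital purchase plus an annual service contract (share of purchase price)."""
    return purchase_price + purchase_price * service_rate * years


def asp_total_cost(annual_fee, years):
    """Pay-as-you-go: no capital outlay, only the recurring fee."""
    return annual_fee * years


# Hypothetical inputs, not taken from any vendor quote.
purchase_price = 500_000.0
annual_asp_fee = purchase_price * 0.20  # assumption: ASP fee ~ service contract

for years in (1, 3, 5):
    server = server_total_cost(purchase_price, years)
    asp = asp_total_cost(annual_asp_fee, years)
    print(f"{years} yr: server ${server:,.0f} vs ASP ${asp:,.0f}")
```

Under these assumptions the gap between the two models is always the up-front purchase price itself, which is the point the paragraph makes about capital spending.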

Rise of the ASP
ASP platforms for dictation/transcription applications first appeared on the market a little more than 10 years ago, with few direct healthcare providers or medical transcription service organizations (MTSOs) initially willing to sign on. Buyers were hesitant to jump on the virtual business model bandwagon for several reasons, including concerns over a lack of product maturity compared with the server-based options, fear of losing direct hands-on access to the servers that housed their data, and perceived Internet security risks.

At about the same time, a handful of transcription service vendors started experiencing tremendous business growth with expanding virtual transcription labor pools that were no longer restricted by local telephone calling zones for voice file access. In scaling these exponentially larger enterprises and consolidating the labor pools onto common platforms for operating efficiency, they soon realized their efforts to gain economies of scale did not apply to transcription server-based technology. In fact, it did not take long before such companies passed $1 million per year just for the service contract costs on their proprietary server-based products. (With traditional server-based configurations, this barrier is typically reached when labor pools approach 800 transcriptionists.)

As the growth of more MTSOs caused their owners to struggle with the economics of independent server configurations, it became clear that consolidating operations on integrated servers in a single data center would deliver the sought-after economies of scale and solve many of their daily labor problems (such as workload balancing for medical transcriptionists [MTs], enterprisewide reporting, etc). The ASP move also enabled some of them to displace their clients’ server-based technology for free (or nearly free), increasing their competitiveness and further reducing the server-based market share.

Consolidated data centers with enterprisewide workflow management and reporting capabilities in support of multiple accounts were the cornerstone of the ASP business model. And given the limited business growth of those early ASP vendors, some became acquisition targets for larger MTSOs (eg, MedQuist’s acquisition of SpeechMachines, now known as the DocQment Enterprise platform, and Spheris’ acquisition of Vianetta, now known as the Clarity platform, both of which are ASP solutions).

Consequently, over the past five years, the market share balance has dramatically shifted away from proprietary server-based configurations and toward the ASP approach. Perpetuating this momentum, the growing list of ASP users has found the technology much more mature, with cost savings that can no longer be ignored. Especially for healthcare providers that have been strapped for capital equipment cash while their traditional server-based products approached the end of their useful life cycles (or were sunsetted outright, with service contracts no longer even available), the ASP pay-as-you-go option has become the business model of the future.

ASR
Due to ASR’s promoted ability to reduce labor costs, demand for the technology is skyrocketing, while corresponding production cost expectations free-fall. Consequently, technology and outsourced vendors alike are engaged in a hotly contested battle for market share.

Front-end ASR (which typically requires dictators to adapt their speech patterns to how the engine listens in order to improve accuracy) and back-end ASR (which requires substantially greater processing power and file storage capacity but no dictator habit changes) are both available from a variety of well-established and start-up vendors. (The overwhelming majority of today’s ASR volume processed in the medical transcription industry is through back-end ASP applications.)

Regardless of the ASR flavor being used (front end, back end, server based, or ASP), it can easily be argued that all of today’s ASR software comes in essentially one of two versions: preliminary or obsolete. Long touted by technologists as the eventual replacement for all human labor associated with transcription, ASR has witnessed the pace of product development continue to increase as the competition for market share escalates.

Given the technology’s lofty goals and competitive landscape, stand-alone software purchases as capital equipment investments don’t make sense for most buyers. Any version of ASR bought today will be obsolete long before the purchase price can be amortized, even if only over a three-year period.

Accordingly, any sale of a stand-alone ASR product usually means that the buyer has not done its homework. There are scores of ASP vendors out there that can deliver the same technology for less than the price of the stand-alone version combined with the service contract of the server-based box it is installed on. The exceptions are simple front-end applications such as Nuance’s Dragon NaturallySpeaking, which can be installed on any PC. But, as noted earlier, such front-end applications by definition require dictators to edit their own work and adjust their dictating habits to improve accuracy levels, something most doctors refuse to do.

The ASR Pitch
Speech recognition will cut labor costs at least in half as the team becomes editors instead of manual transcriptionists. Even better, some offerings can eliminate transcription staff completely because physicians can instantaneously (or nearly so) access ASR-generated drafts, edit their own work, and approve documents in one sitting. References are available to illustrate high adoption rates, where only a few of the physicians are still processed through manual transcription because their dictating habits or accents prevent them from being acceptable ASR candidates.

The ASR Catch
Nowhere on the transcription market battlefield is there a greater discrepancy between vendors for total cost, quality, and turnaround time performance than with ASR. And while you would expect that you should get what you pay for, the reality is not that straightforward. The desktop front-end software is considered well worth the investment by those physicians willing to edit their own work, but how can you measure the total documentation process impact of the high-volume, back-end solutions?

This is where the different business models and the subsequent provider’s access paths to the technology get interesting. It starts with the choice between the two paths to back-end ASR: technology vendors or service vendors.

Technology-only sales are traditionally made to providers with in-house departments. When legacy server-based products near the end of their life cycle, replacement boxes or ASP applications with integrated ASR are sought to keep the basic operations model intact. These purchases are often viewed as an acceptable compromise between HIM directors looking to save department jobs and HIT directors looking to deliver labor savings through applied technology.

Depending on the vendor, there may be up-front implementation costs, software licensing fees, per-dictator annual fees and, of course, ongoing volume fees. Some vendors have options for documents that don’t make sense to run through the ASR engine due to the expectations of low accuracy scores, while others have everything run through the ASR engine (where transcriptionists delete poor quality drafts and retype the documents from scratch anyway). The worst-case scenario is to maintain two systems—one for ASR and one for manual processing—as some technology costs are duplicated and reporting becomes cumbersome as the totals from the two systems have to be manually combined.

Service vendor ASR offerings are typically presented as discounted line rates for those documents that make sense to process through ASR. Often, the percentage of reports run through ASR for a service vendor will be somewhat less than the percentages boasted by pure technology players, as the service vendors have to maintain some profit margin through the editing stage. (If the service vendor’s combined cost for the ASR draft and editing is more than its costs to perform it manually, it will reroute work for that dictator accordingly.) Since pure technology vendors don’t pay for editing, they are less motivated to enable their customers to reroute such volume as it means a loss in ASR processing volume.
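
The rerouting rule in the parenthetical above can be sketched in a few lines of Python. The function name and the per-line costs (in cents) are hypothetical; a real service vendor would feed in its own measured costs per dictator.

```python
def route_dictator(asr_draft_cost, editing_cost, manual_cost):
    """Return 'asr' when draft plus editing is no more costly than manual work."""
    if asr_draft_cost + editing_cost > manual_cost:
        return "manual"
    return "asr"


# Hypothetical per-line costs in cents: (ASR draft, editing, manual).
dictators = {
    "clear speaker": (2, 5, 10),
    "heavy accent": (2, 11, 10),
    "fast talker": (2, 8, 10),
}

for name, (draft, edit, manual) in dictators.items():
    print(name, "->", route_dictator(draft, edit, manual))
```

A pure technology vendor has no editing cost in this comparison, which is why it has less incentive to apply such a rule on the buyer’s behalf.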

ASR Cost
Regardless of how ASR technology is accessed, it is important to boil down all vendor charges to verifiable annual volume rates before making a decision. Just as there had been issues with service vendors creatively calculating lines to offer low line rates while increasing the per-document costs, some technology vendors have done the same. Buyers should provide each potential vendor with the same set of sample reports and ask for their volume calculations for those specific documents. Depending on how volume measurement is defined, buyers may be surprised at the range of calculations received for the same documents. Once all measurement schemes have been calibrated against one another, the start-up fees, licensing costs, per-user annual fees, interface charges, etc should be added together and then divided by annual volumes to convert costs to a single per-line or per-visible-black-character volume rate. Only then can you compare apples to apples.
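
The conversion described above might be sketched as follows. The `VendorQuote` fields, the three-year spread of the start-up fee, and every dollar figure are assumptions made for illustration; the only inputs taken from the text are the fee categories and the idea of calibrating each vendor’s volume count against a shared set of sample reports.

```python
from dataclasses import dataclass


@dataclass
class VendorQuote:
    startup_fee: float          # one-time charge
    annual_license_fee: float
    per_user_annual_fee: float
    users: int
    annual_interface_fee: float
    quoted_rate: float          # price per vendor-defined unit
    sample_units: float         # vendor's count for the shared sample reports


def effective_rate(quote, annual_lines, sample_lines, contract_years=3):
    """Convert a quote to one all-in rate per buyer-defined line."""
    # Calibrate the vendor's unit against the buyer's own count of the samples.
    units_per_line = quote.sample_units / sample_lines
    volume_charge = quote.quoted_rate * units_per_line * annual_lines
    fixed_charge = (
        quote.startup_fee / contract_years      # assumption: spread evenly
        + quote.annual_license_fee
        + quote.per_user_annual_fee * quote.users
        + quote.annual_interface_fee
    )
    return (volume_charge + fixed_charge) / annual_lines


# Hypothetical quote; every figure below is invented for illustration.
quote = VendorQuote(30_000, 20_000, 200, 100, 5_000, 0.04, 1_100)
rate = effective_rate(quote, annual_lines=2_000_000, sample_lines=1_000)
print(f"All-in rate: ${rate:.4f} per line")
```

Running each vendor’s quote through the same function, with the same sample reports and annual volume, yields figures that can be compared directly.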

For example, Melissa, an HIM director at a large university hospital, sought an ASR technology solution to reduce in-house transcription staff costs that, including technology and employee benefits, had been calculated at 19.5 cents per line ($0.195). The result: Her labor costs were indeed cut in half as promised; however, after converting the cost of the new technology, the annual per-physician licensing fees, etc to a per-line value based on annual volumes, the total processing costs actually increased to 22.5 cents per line ($0.225). The lesson: Review the contracts carefully for volume calculation language and all add-on costs. Insist on independent verification of volume measurements with third-party software.

ASR Quality
A typical acute care report has 300 to 350 words per page. Even if the drafts average 98% accuracy (a goal any ASR vendor would be happy to maintain for a large population of diverse dictators), there would still be an average of six to seven errors per page. And unlike manual transcription workflow, where MTs leave a blank or look for assistance when something does not sound quite right, the ASR engines are programmed to generate output in complete, grammatically correct sentences. This means staff have to play “Where’s Waldo?” to find the errors, which can be difficult if not impossible to find without 100% sight-and-sound editing, since the engine presents drafts that read well but may not perfectly match what was dictated.
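
The arithmetic behind the six-to-seven figure is simple enough to check directly. This sketch assumes accuracy is measured per word, which is how the paragraph above appears to use it.

```python
def expected_errors(words_per_page, accuracy):
    """Average number of wrong words on a page at a given per-word accuracy."""
    return words_per_page * (1.0 - accuracy)


for words in (300, 350):
    errors = expected_errors(words, 0.98)
    print(f"{words} words at 98% accuracy: about {errors:.0f} errors per page")
```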

Mike Wilson, president of Cardus, Inc, an IT-enabled transcription and billing service provider, had one of his employees read the same medical report script into one ASR vendor’s platform 500 times. The result: The document was 100% accurate twice. The lesson: Even though documents generated by an ASR engine may look acceptable and lead to reduced costs, without 100% sight-and-sound editing by a trained medical language specialist, you will inevitably pass errors downstream.
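
A back-of-the-envelope model suggests why so few drafts came out perfect. The sketch below assumes word errors are independent of one another, which is a simplification, and borrows the 98% accuracy and a mid-range page length of 325 words from the ASR Quality discussion; the length of the script and the accuracy of the engine in Wilson’s test are not reported.

```python
def perfect_document_probability(words, word_accuracy):
    """Chance that every word is right, assuming independent word errors."""
    return word_accuracy ** words


observed_rate = 2 / 500  # two fully accurate drafts out of 500 readings
modeled_rate = perfect_document_probability(325, 0.98)  # hypothetical inputs

print(f"observed: {observed_rate:.2%}, modeled: {modeled_rate:.2%}")
```

Both figures are well under 1%, so even a high per-word accuracy leaves an error-free draft as the rare exception.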

ASR Turnaround Time
ASR processing is capable of converting voice to text faster than manual transcription. When calibrating expectations for ASR turnaround time performance, however, the editing phase of the process cannot be ignored. With the typical industry expectation being that manual transcriptionists will double their productivity after being converted to editors of speech-recognized documents, realistic turnaround targets should be no shorter than one half of the previous time frames (unless the dictators accept editing responsibilities themselves). Such improvements can’t be achieved overnight, and even that target sets you up for failure where there is a mix of manually and ASR-processed documents, which is almost always the case.
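
To see how a mixed workload pulls the target away from the one-half mark, consider a simple weighted average. This is only a sketch: it assumes turnaround time scales directly with editing productivity, and the 24-hour baseline and the ASR shares are hypothetical.

```python
def blended_turnaround(previous_hours, asr_share, editor_speedup=2.0):
    """Weighted-average turnaround for a mix of ASR-edited and manual documents."""
    if not 0.0 <= asr_share <= 1.0:
        raise ValueError("asr_share must be between 0 and 1")
    asr_hours = previous_hours / editor_speedup  # editors assumed twice as fast
    return asr_share * asr_hours + (1.0 - asr_share) * previous_hours


for share in (0.0, 0.5, 1.0):
    hours = blended_turnaround(24.0, share)
    print(f"ASR share {share:.0%}: {hours:.1f} hours")
```

Only when every document runs through ASR does the blended figure reach half of the previous time frame; any manual share leaves it higher.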

ARRA and Meaningful Use
ARRA assigns $17 billion in incentives and $2 billion for the establishment of regional health information exchanges to ensure meaningful use and interoperability of EHRs nationwide. To claim a share of these stimulus dollars, which will be granted as Medicare incentives, hospitals must demonstrate yet-to-be-detailed meaningful use of an EHR.

When President Obama launched his public discussion on healthcare reform and the stimulus bill, he talked a great deal about the Mayo Clinic and how it provides world-class care for 20% to 30% less than other parts of the country. It’s important to note that the clinic was not featured for its specific technology vendors, for paying its physicians less (it doesn’t), or for compromising the quality of care (it certainly does not). It was used as an example of how to manage the business of healthcare more efficiently and achieve desirable medical results less expensively.

In referring to the Mayo Clinic, the president traces a big part of the “total cost of care” savings back to how seamlessly and efficiently electronic health information is shared between consulting caregivers. And it is clearly this “business of healthcare” efficiency and interoperability that will serve as the blueprint for meaningful use.

This does not mean that all you will need to do is sign up with the EHR vendors used at the Mayo Clinic. The key to meaningful use and stimulus dollars is interoperability and how the efficient sharing of electronic records has proven to eliminate duplicate tests and other unnecessary expenses. No single platform guarantees qualification; it’s what you do with the electronic records once you have them in your EHR that matters.

Accordingly, when looking at your transcription technology options (or any other healthcare technology for that matter), providers need to pay special attention to interoperability. Does the platform play well with others, or does the vendor’s business model or technology restrictions make it difficult (or expensive) to exchange information?

Conclusion
As healthcare providers act to secure Medicare incentives, the transcription technology war promises to stay red hot throughout the coming year. To quote Shakespeare, “The better part of valor is discretion.” In other words, caution is preferable to rash bravery. To emerge victorious on the ASP/ASR/ARRA battlefield, providers must stay focused on the big picture. The ARRA funding package is there to incentivize improved interoperability and reduce overall healthcare costs, not to stimulate the economy with technology sales. Securing ARRA funds is pointless if documentation costs, quality, and turnaround time suffer.

Such victories can be achieved with today’s ASP and ASR technology. If providers employ in-house transcription staff, they need to use an ASR vendor that reduces labor and technology expenses after all add-on costs are considered. If outsourcing is the choice, providers need to be certain they are receiving discounted rates for what goes through ASR and ensure that all documents passing through the engine receive 100% sight-and-sound editing by qualified medical language specialists. If current vendors can’t meet these objectives, now’s the time to make a change.

In the end, the victors will be those who actually deliver reduced labor and technology costs over the complete documentation life cycle and qualify for ARRA dollars. To succeed in all three areas, HIM and IT directors must work together to critically evaluate cost, quality, and turnaround time variables. Those who fail to collaborate may achieve partial victories … but that’s the same as a loss.

— Dale Kivi, MBA, is director of business development for FutureNet Technologies Corporation.