
April 23, 2012

Identifying Patients in HIEs
By Julie Knudson
For The Record
Vol. 24 No. 8 P. 10

Master patient indexes maintain accurate records and provide up-to-date information to providers across the exchange.

Ask any healthcare provider and they’ll be quick to tell you: Every patient is unique. Now consider creating new patient records on one of the busiest days of the year, comparing those new admissions to the records of all existing patients and removing records when duplicates are discovered. When you’re a health information exchange (HIE) and perhaps millions of patients stream through your providers’ doors every year, how do you keep track of all these unique individuals?

That’s where a master patient index (MPI) comes into play.

Many hospitals maintain their own MPIs, but an HIE handles patient identity management through a systemwide enterprise MPI (EMPI). “All of the patient’s identifying information, such as names, birth dates, Social Security numbers, addresses, and phone numbers, is stored in a central data table,” says Beth Just, MBA, RHIA, FHIMA, president and CEO of Just Associates, Inc, a healthcare data integration consulting firm.

The EMPI links all patient information, some of which is already connected to a patient identity number within an individual facility or system, to a unique identifier that spans the entire HIE. “The accuracy and completeness of that core information is the cornerstone of accurate patient identity,” Just says. A patient’s records, clinical information, and personal information must be linked to each individual’s unique identification number, and only one ID number can exist for each individual. The integrity and usefulness of the EMPI rest almost entirely on “whether or not records and clinical information is properly attached to the right patient’s medical record,” Just says.

Much of the accuracy of that information begins at check-in, says Karen Gallagher Grant, RHIA, CHP, enterprise director of health information services and chief privacy officer for Partners HealthCare in Wellesley, Massachusetts. Her team believes in a do-it-right-the-first-time approach, which led them to establish a process that gathers as much dependable information as possible when the patient initially enters the system.

“As we tried to identify patients enterprisewide, we came up with standards for how to register a patient,” Grant says. Because another goal is to get patients admitted quickly, a second layer exists to ferret out errors. “[Patients] are moving very fast,” Grant says, “so the other point is to make sure the data is reviewed from a quality improvement perspective.”

It may sound simple, but with critical patients being rushed from the emergency department to other areas of the hospital while others are being delivered via ambulance, finding the time and resources to gather the necessary information to create a new, accurate patient record or cross-check against existing records can be difficult.

According to Scott Afzal, program director at Maryland’s Chesapeake Regional Information System for Our Patients and principal at Audacious Inquiry LLC, a technology and management consulting firm, the primary challenge is that one of the main functions of an HIE is to enable access to clinical data across different facilities. “The core requirement is that there is an ability to match identities,” Afzal explains, adding that many of the facilities within an HIE likely don’t share the same medical record numbers.

The HIE must be able to match patient identities and link those unique medical numbers across the entire enterprise so that a medical record from one facility can be correlated to the record from another and everything can be accessed quickly. “This challenge grows as sources of information grow,” Afzal says.
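The cross-referencing Afzal describes can be pictured as a lookup table that ties each facility's local medical record number to one enterprise-wide identifier. The following is only a minimal sketch under stated assumptions: the class name, facility names, and record numbers are all invented for illustration, and the decision that two records belong to the same person is assumed to have been made elsewhere.

```python
from itertools import count


class EnterpriseIndex:
    """Toy cross-reference from (facility, local MRN) to one enterprise ID."""

    def __init__(self):
        self._next_id = count(1)      # hypothetical enterprise ID generator
        self._by_local = {}           # (facility, mrn) -> enterprise ID
        self._by_enterprise = {}      # enterprise ID -> set of (facility, mrn)

    def register(self, facility, mrn, same_as=None):
        """Link a local record. Pass same_as when the record has already
        been matched to an existing enterprise ID."""
        key = (facility, mrn)
        if key in self._by_local:
            return self._by_local[key]
        eid = same_as if same_as is not None else next(self._next_id)
        self._by_local[key] = eid
        self._by_enterprise.setdefault(eid, set()).add(key)
        return eid

    def linked_records(self, facility, mrn):
        """Return every local record that shares this patient's enterprise ID."""
        eid = self._by_local[(facility, mrn)]
        return sorted(self._by_enterprise[eid])


index = EnterpriseIndex()
eid = index.register("hospital_a", "A-1001")
index.register("clinic_b", "77-203", same_as=eid)
print(index.linked_records("clinic_b", "77-203"))
# [('clinic_b', '77-203'), ('hospital_a', 'A-1001')]
```

A query that starts from either facility's record number reaches the same set of linked records, which is the behavior the exchange depends on.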

Deterministic vs. Probabilistic
There are two basic approaches to accurately identifying patients and catching errors or duplicate records in HIEs.

An algorithm using a deterministic approach links two records only when their identifying information is identical. “If you use that deterministic approach, it’s a byte-to-byte comparison,” says Lorraine Fernandes, RHIA, global healthcare ambassador of information governance at IBM. “There is no tolerance for human errors, typographical errors, or data-capture errors.”

In a deterministic system, nonmatching information, such as nicknames, incorrect birth dates, new home addresses, and maiden vs. married names, is likely to cause a rejection, potentially resulting in increased false-negatives and more duplicate records across the enterprise.
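A few lines of code show how unforgiving an exact comparison is. This is an illustrative sketch, not any vendor's implementation; the field names and the sample patient are made up.

```python
# Hypothetical set of demographic fields compared for every pair of records.
FIELDS = ("first_name", "last_name", "birth_date", "address")


def deterministic_match(a, b, fields=FIELDS):
    """Return True only if every compared field is exactly identical."""
    return all(a[f] == b[f] for f in fields)


on_file = {
    "first_name": "Katherine",
    "last_name": "Miller",
    "birth_date": "1975-03-14",
    "address": "12 Oak St",
}
# Same person, but the registrar recorded a nickname this time.
new_visit = dict(on_file, first_name="Kate")

print(deterministic_match(on_file, on_file))    # True
print(deterministic_match(on_file, new_visit))  # False
```

The nickname alone is enough to reject the pair, so the second visit would create a duplicate record.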

The second approach is a probabilistic algorithm, where the various pieces of information within a patient record are each assigned a weight, says Michael Sawczyn, security and privacy officer at the Ohio Health Information Partnership. Those weights are then used to score the likelihood that two or more records are actually the same person.

“You’ve got first name, last name, address, date of birth, and maybe a Social Security number if available, but you can’t count on any of them being correct,” Sawczyn says. The EMPI first looks for exact matches, and then “it tries for near matches, and you assign weights to those matches to determine the probability that you have this person in your system already,” he explains.
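The weighting Sawczyn describes can be sketched as a weighted average of per-field similarity. This is a simplified illustration rather than the statistical model a commercial EMPI would use: the weights are invented, and string similarity comes from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Invented weights: fields assumed to be more reliable count for more.
WEIGHTS = {"first_name": 2, "last_name": 3, "birth_date": 3, "address": 2}


def similarity(x, y):
    """Similarity between two strings, from 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def match_score(a, b, weights=WEIGHTS):
    """Weighted average of field similarities, between 0.0 and 1.0."""
    total = sum(weights.values())
    return sum(w * similarity(a[f], b[f]) for f, w in weights.items()) / total


on_file = {"first_name": "Katherine", "last_name": "Miller",
           "birth_date": "1975-03-14", "address": "12 Oak St"}
nickname = dict(on_file, first_name="Kate")
stranger = {"first_name": "Robert", "last_name": "Nguyen",
            "birth_date": "1990-11-02", "address": "900 Pine Ave"}

print(round(match_score(on_file, nickname), 3))
print(round(match_score(on_file, stranger), 3))
```

The nickname record scores high but below 1.0, while the unrelated record scores much lower, so a near match is no longer thrown out the way it is in an exact comparison.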

Deterministic technology has been around for a decade or more, but experts say it’s fading away in favor of more comprehensive probabilistic methodologies. Even newer algorithms are available with varying levels of sophistication and flexibility. Experts urge caution when making a selection.

“If you use less sophisticated algorithms in your record matching, you’re going to have potentially more false-positives,” Just says, resulting in a scenario in which records from multiple individuals are being merged into a single HIE record. “You want to avoid that at all costs.”

However, Just points out that even though many of the less sophisticated systems don’t carry the price tag of their more elaborately structured brethren, they often have built-in safeguards to ensure that specific criteria match before linking a transaction to the wrong patient’s record.

Most HIEs land on one methodology or another, often determining their approach based on the amount of resources—either financial or staff—available within the system. Or the methodology could be built into whichever platform is selected to manage the organization.

When an HIE issues a request for proposal for an exchange technology partner, Afzal says, “You’ll get a bunch of companies responding that have master patient indexing solutions built into their overall solution.” At that point, the HIE must consider its options and determine if the platform’s algorithms and approach to indexing are in line with its needs or if an independent EMPI solution would suit it better. Setup speed is often a factor in these situations. “A lot of HIEs need to show value quickly to try to drive toward sustainability,” Afzal says, which often leads to a preference for a model “that’s preintegrated and already working.”

Mixing technologies to get the best solution may be the answer for some HIEs. “You have to use all of the tools that are available,” Sawczyn says. “You can’t just say we’re going to do this and not this.” The Ohio partnership utilizes a platform that first tries to create a deterministic match and then resorts to probabilistic algorithms if necessary. “We tell [the vendor] what we will accept as a match from a probability standpoint,” he says. That number will vary based on the HIE—some may be happy with 94% certainty while others lean toward 98% certainty—but ultimately the goal is to limit the false-negatives without incurring increased false-positives. “We’re in the process right now of investigating the algorithms and coming up with recommendations for our board’s HIE committee to determine what those thresholds are going to be.”
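The two-stage approach can be sketched as an exact check followed by a scored comparison against an acceptance threshold. Everything specific here is an assumption made for illustration: the weights, the similarity measure, and especially the lower "review" band that routes borderline pairs to a person, which is not something the Ohio partnership is described as using.

```python
from difflib import SequenceMatcher

# Invented field weights for the fallback scoring step.
WEIGHTS = {"first_name": 2, "last_name": 3, "birth_date": 3, "address": 2}


def score(a, b):
    """Weighted average of per-field string similarity, between 0.0 and 1.0."""
    total = sum(WEIGHTS.values())
    return sum(
        w * SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f, w in WEIGHTS.items()
    ) / total


def link_decision(a, b, accept=0.94, review=0.80):
    """Try an exact match first, then fall back to a scored comparison."""
    if all(a[f] == b[f] for f in WEIGHTS):
        return "match"
    s = score(a, b)
    if s >= accept:
        return "match"
    if s >= review:
        return "review"  # assumed band: queue the pair for a human decision
    return "no match"


on_file = {"first_name": "Katherine", "last_name": "Miller",
           "birth_date": "1975-03-14", "address": "12 Oak St"}
nickname = dict(on_file, first_name="Kate")

print(link_decision(on_file, nickname))               # review
print(link_decision(on_file, nickname, accept=0.90))  # match
```

Lowering the acceptance threshold turns the same pair into an automatic link, which is the trade-off between false-negatives and false-positives that each HIE has to settle for itself.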

Experts advise that it would be a mistake for HIEs to think that master patient indexing is an automated process or that any particular reconciliation method removes the human component. Healthcare professionals are responsible for evaluating the most difficult cases of mistaken identity and potential duplicate records and applying a higher level of thinking when the EMPI system hits a snag. “I’m always marching to the data integrity component,” says Grant, whose team continuously monitors matching rules to determine whether algorithms can be tweaked. “It’s like artificial intelligence that will help us in problem solving, but the devil’s in the details and you want to make sure you’re looking at data integrity at all times.”

As HIEs continue to add patients and providers to the mix, burgeoning data sets will make data integrity more of a struggle. “The bigger the database of records, typically the bigger the duplication problem,” Just says, adding that if an HIE starts with 400,000 records but that number grows to 4 million, the potential deluge of duplicates could be problematic unless it’s managed aggressively. And while algorithm technology is necessary as a baseline, she says, “It’s not the entire solution to the data governance and data stewardship problem.”

Just cautions that data-capture policies coupled with stringent monitoring of the information coursing through the HIE are needed to ensure data integrity. She forecasts a continuing need for close human oversight in resolving discrepancies “because algorithms are still only as good as the data.”

Patient Records Grow No Moss
Once a medical record has been created and compared against others in the system, it isn’t set aside. Life goes on, and people do things that continually affect the accuracy and completeness of the EMPI. They move, get married, adopt or drop a nickname, and sometimes really do forget their spouse’s birth date.

It’s this dynamic side of patient records that often causes duplicates. “The MPI is certainly a living thing,” says Kris Joshi, vice president of healthcare product strategy for Oracle. “It’s never static.” Any time a patient is added, there’s the potential that an existing record is overlooked. While duplicate records within an EMPI create a more cumbersome data set, Joshi says it’s important to remember the effect goes beyond ones and zeroes on a computer screen.

“If you have two different records for the same person, one representing the patient’s history and the other without that history, and a physician brings up the no-history record, it could have a consequence on the patient’s treatment,” he notes, adding that an EMPI should be viewed as an active service in which data quality is paramount.

Capturing information accurately during admission is the first line of defense, but people do sometimes make mistakes. Whether a preoccupied parent mistakenly gives the admissions desk their child’s nickname or a receptionist inadvertently transposes a birth date on a busy night, catching, correcting, and distributing updated information on a continual basis is critical.

As part of this ongoing battle, the Ohio partnership is in the process of creating a structure that sends reports back to the hospitals for clarification when potential inaccuracies are identified. “Once they make corrections, those corrections will then automatically flow back into the HIE,” Sawczyn says. “The MPI is then adjusted based upon that.”

The system maintains the accuracy of the data so that future queries will access the most up-to-date information for each patient.

— Julie Knudson is a freelance business writer based in Seattle.


Language Barriers
Maintaining an accurate master patient index can be challenging on several levels, so much so that even a health information exchange’s (HIE) location matters. “Language and localization are very big aspects of uncertainty,” says Kris Joshi, vice president of healthcare product strategy for Oracle. Depending on an HIE’s demographics, algorithms may be structured “around phonetic spelling and identification of names, not just based on the sound but sound plus spelling,” he says.

The details of the algorithm are highly dependent on which language(s) are involved, as names could sound different based on origin. Quirks such as an “e” at the end of a name could help determine if a patient is already in the system or if someone made a data-entry error.

Fine-tuning how an algorithm examines patient data could greatly limit or expand its effectiveness. “I’ll use myself as an example,” says Lorraine Fernandes, RHIA, global healthcare ambassador of information governance at IBM. “My last name ends in an ‘s.’ Ninety-nine percent of the world thinks my name ends in a ‘z’ and my husband is Spanish. But my husband is Portuguese, and Fernandes ends with an ‘s’ in the Portuguese culture. And my maiden name is Grunewaldt, which no one could spell. These are everyday challenges that sophisticated algorithms handle.”
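One long-established way to compare names by sound is Soundex, which reduces a name to its first letter plus three digits. The sketch below is a compact version of the standard American Soundex rules, offered only as an example of phonetic coding. It is tuned to English pronunciation, and nothing in the article says which phonetic method any particular EMPI product uses.

```python
# Soundex digit for each consonant group; vowels, H, W, and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the four-character Soundex code for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return ("".join(out) + "000")[:4]


print(soundex("Fernandes") == soundex("Fernandez"))   # True
print(soundex("Grunewaldt") == soundex("Grunewald"))  # True
```

Both spellings of each name collapse to the same code, so a misspelling at registration can still surface the existing record as a candidate. Pairing the code with a spelling comparison, as Joshi suggests, would help separate true variants from different names that merely sound alike.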

A strong, hands-on approach to managing how an HIE’s algorithms treat and weigh these types of language- and culture-related data sets is often a determining factor in how quickly duplicates are snagged.

— JK


Why Not Use Social Security Numbers?
To the uninformed, it may appear as if the answer to how to properly identify individual patients resides in wallets and pocketbooks: Social Security numbers. Not so fast, say experts.

“The core answer [to why they aren’t used] is because the Social Security number has become more synonymous with financial data, and tying it to other sensitive information about an individual could just pose a greater risk if that single number were compromised,” says Scott Afzal, program director at Maryland’s Chesapeake Regional Information System for Our Patients and principal at Audacious Inquiry LLC.

The law rarely requires patients to provide their Social Security numbers to hospitals, and when patients do submit one, it’s often not the entire number, Afzal says. “In the cases when we get only the last four [digits], we’ll still use it,” he notes. “We can compare it to a full nine if we have it on file, but we assign less weight to it.”
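The comparison Afzal describes, a last-four value checked against a full number with reduced weight, might look something like the following sketch. The function name and the two weight values are assumptions made for illustration, and the sample numbers are deliberately invalid.

```python
def ssn_evidence(a, b, full_weight=1.0, partial_weight=0.5):
    """Return how much weight an agreeing pair of SSN values contributes.

    Each value may be a full nine-digit number or only the last four digits.
    The weights are hypothetical; a real system would tune them.
    """
    da = "".join(c for c in a if c.isdigit())
    db = "".join(c for c in b if c.isdigit())
    if len(da) not in (4, 9) or len(db) not in (4, 9):
        return 0.0  # missing or malformed value: no evidence either way
    if len(da) == 9 and len(db) == 9:
        return full_weight if da == db else 0.0
    # At least one side has only the last four digits, so compare those
    # and give the agreement less weight than a full nine-digit match.
    return partial_weight if da[-4:] == db[-4:] else 0.0


print(ssn_evidence("000-00-1234", "000-00-1234"))  # 1.0
print(ssn_evidence("1234", "000-00-1234"))         # 0.5
print(ssn_evidence("9999", "000-00-1234"))         # 0.0
```

The returned value would be one input among several in a larger match score, never a decision on its own.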

Still, when at least a partial Social Security number is available, it can be a tremendous help in reconciling potential duplicate records as part of a larger data set, says Beth Just, MBA, RHIA, FHIMA, president and CEO of Just Associates, Inc. “The accuracy of the last four digits of the Social Security number coupled with the rest of the patient’s demographic data is huge,” she says, adding it can mean a win in a health information exchange’s quest for accurate identification while addressing patient privacy concerns. “It really increases the probability of getting the right patient, and the patient doesn’t have to give their full Social Security number.”

— JK