Challenge FAQs

Public Notice

Patient Matching Algorithm Challenge Winners

Best “F-score” (a measure of accuracy that factors in both precision and recall):

  • First Place ($25,000): Vynca
  • Second Place ($20,000): PICSURE
  • Third Place ($15,000): Information Softworks

Best First Run ($5,000): Information Softworks

Best Recall ($5,000): PICSURE

Best Precision ($5,000): Ocuvera

Each winner employed a markedly different approach. PICSURE used an algorithm based on the Fellegi-Sunter (1969) method for probabilistic record matching and performed a significant amount of manual review. Vynca used a stacked model that combined the predictions of eight different models, and reported manually reviewing less than .001 percent of the records. Information Softworks used a Fellegi-Sunter-based enterprise master patient index (EMPI) system with some additional tuning, and likewise reported extremely limited manual review.




1. What is the Challenge expected to accomplish?

The goal of the Patient Matching Algorithm Challenge is to increase transparency and data on the performance of existing patient matching algorithms, spur the adoption of performance metrics by patient matching algorithm vendors, and positively impact other aspects of patient matching such as deduplication and linking to clinical data.

2. How can interested parties learn more about the challenge?

The challenge website is the best source of information for this challenge; it will be updated frequently, and visitors can sign up there for email updates. ONC will also host informational webinars for prospective participants beginning at 4pm ET on May 10, May 17, and May 24.

3. Is this challenge part of a grant award program?

This competition is not a grant.  It is a prize competition run under the authority of the America COMPETES Act, which enables ONC to invest in innovation through research and development.  Challenges are a way for HHS employees to draw on external talent and ideas to solve critical problems.

4. I want to participate in this challenge. When can I register?

Registration will open on May 10th immediately following the informational webinar. Anyone interested in participating in this challenge is encouraged to attend one of the three informational webinars, where registration details will be covered. Participants will create a username and password upon registration.

5. When are project submissions due?

Submissions are due no later than 11:59pm Eastern Time on the last day of the submission period (date TBD). This date will be posted to the Challenge website. Those who have registered to participate in the challenge will also receive an email with this due date.

6. What is the amount of prize money available for this challenge?

The total prize purse for this challenge is $75,000. Up to 6 prizes will be awarded:

  • First Place: $25,000
  • Second Place: $20,000
  • Third Place: $15,000
  • Best in Category supplemental prizes ($5,000 each):
  1. Precision
  2. Recall
  3. Best first F-Score run

7. What is an “F-score” and how will it be scored for this challenge?

Matching algorithms can make two types of errors. The first is the failure to find a matching pair (often referred to as a “false negative”); the proportion of true matches that are found is measured by “recall” in the field of information retrieval. The second is a pair of records that is matched when it should not be (often referred to as a “false positive”); the proportion of reported matches that are correct is measured by a metric known as “precision.” The F-Score, the final metric pertinent to this Challenge, combines precision and recall as their weighted harmonic mean. For the purposes of scoring, all scores will be assessed to three places after the decimal.
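As a concrete sketch (not part of the official Challenge materials), here is how precision, recall, and the balanced F-score can be computed from a set of submitted pairs against a gold-standard set; the Challenge may use a weighted F variant, and the example IDs are hypothetical:

```python
# Sketch: scoring a submission of matched pairs against a gold standard.
# Assumes the balanced (F1) harmonic mean; the Challenge may weight differently.

def score(submitted, gold):
    """Return (precision, recall, f_score), each rounded to 3 decimals."""
    # Treat pairs as unordered: (a, b) and (b, a) are the same link.
    sub = {tuple(sorted(p)) for p in submitted}
    ref = {tuple(sorted(p)) for p in gold}
    true_pos = len(sub & ref)
    precision = true_pos / len(sub) if sub else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    f = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return round(precision, 3), round(recall, 3), round(f, 3)

gold = {("E1", "E4"), ("E2", "E9"), ("E3", "E7"), ("E5", "E6")}
submitted = {("E4", "E1"), ("E2", "E9"), ("E5", "E8")}
print(score(submitted, gold))  # 2 of 3 submitted pairs correct, 2 of 4 gold pairs found
```

Note that precision penalizes false positives and recall penalizes false negatives, so a submission cannot win simply by submitting many pairs or very few.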

8. When will winners be notified?

Winners will be notified one week after the submission deadline. This date will be posted to the Challenge website. Those who have registered to participate in the challenge will also receive an email with this date.

9. What is in scope for this Challenge?

The scope of this challenge is limited to the data set provided by ONC to challenge participants.


Answered During the Webinar

Will the MRN be specified on all records, and if so, then can we expect it to be unique or not ?
There will be some missing and incomplete values. I will have to get back with you to confirm. But it will be in the data dictionary. The Enterprise number will be unique.

Will we get a copy of these slides?
Slides will be posted to the website. We will work to move them to the front page of the challenge site.

How many people are allowed to be in the same team?
We have not specified a set limit. It is up to the team to figure out.

Can you estimate the length of time that it takes to submit an algorithm and get a score?
You are not submitting the algorithm; you are submitting your list of matches. When you submit your matches, scoring is done quickly, within minutes.

What if a later submission is worse than an earlier one - do you count the earlier one or the later one?
For the award, we will take the best F-Score, regardless of whether it came from an earlier submission. For the first-run prize, we are taking your first submission.

How many entries can a registrant submit?
You can submit up to 100 times.

What matching performance metrics will be used to evaluate match results, and how will those metrics be used to determine a winner?
The metrics are precision and recall, combined into an F-Score. The team with the highest F-Score is the winner.

How was the gold standard established for the test training data?
It is outlined on the slides. You start with a database and generate potential matches, which are then confirmed through human review by people who look to see whether the records are the same.

Can you give us any insight into how you are intending to use the confidence score in the submitted answer keys, if included?
It’s just an additional data point. It’s optional. It just gives us insight into how the different algorithms perform.

Will ONC describe the clinical context from which the test/training data derive? (e.g., newborns, cancer patients, public health case reporting, etc.)? Clinical context is important to inform algorithm selection.
These cases are de-duping an ambulatory or hospital database where there are multiple matches or individuals in a single database. Hopefully, it is a robust enough use case to serve as a starting point for this type of challenge.

Are the records derived from a _linkage_ use case (derived from two different datasets), or a _de-duplication_ use case (derived from a single dataset)?
It is de-duplication of a single dataset. There is no schema matching. You are just looking for the dupes.

Are there any constraints on the type of algorithm used?
No. Use your best, most creative solution.

Will the algorithms be published for public consumption? e.g., will the algorithms be shared under open source?

How large is the dataset?
One million records.

Is it possible the winner won't even show up on the leaderboard? It sounds like you are saying submitting to the leaderboard is optional.
No, that isn't possible. The leaderboard is updated automatically when submissions are scored.

What match fields, and how many match fields are included in the dataset?

Can you give us an example of what would be in the alias column?
Any previous name associated with a patient ... can have first and last names ... could be a legal name, nickname, previous married name, maiden name, VIP/Alias, etc.

So our submissions are the linked data only and not our developed algorithm?
Yes, submissions are a response file containing linkages produced by your algorithm(s). You are not submitting your algorithm.

Aren’t we in EDT not EST time zone in June? Suggest you just say Eastern Time Zone?
Thank you. We will correct that on the website and slide deck to just say Eastern Time Zone.

What about match time?
Although important, algorithm efficiency is not a metric in this challenge, which is focused on maximizing accuracy metrics. The use case in this instance does not focus on time.

How will the three metrics be used to determine a winner? (e.g., is F-score the ultimate tie breaker?)
The F-Score is a metric by itself. The highest scores win.

Will scores during the Beta period count towards the award/against the 100 allowed submissions?
We will follow up on that. But most likely they will.

Is this the 1st time for this challenge?

I am team lead and a US citizen. Other team members are UK citizens. Are they not eligible?
We will double check. But we believe that they can participate. However, ONC cannot award them money.

Is the challenge only based on the samples in the database provided?

Will a "complete" (non-null) record include a de-identified SSN *and* a synthetic SSN? Or can we expect to receive a value in both of the two fields for the same record?
Question is not understood.

Are you dictating how many fields should match?

Are there any rules against using existing algorithms to contribute to our own? For example using soundex for phonetic analysis?
Nothing would preclude building on previous work (assuming you have the right to use that work). So, to answer your direct question, nothing would stop you from using a publicly available algorithm like SOUNDEX or one of its variants as a building block.
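As a concrete illustration of such a building block (not part of the Challenge materials), a minimal American Soundex implementation in Python might look like this; it assumes a non-empty alphabetic name and omits edge-case handling a production phonetic library would include:

```python
# Sketch: American Soundex phonetic code, often used as a blocking key
# so that only names with the same code are compared in detail.

def soundex(name):
    """Return the 4-character Soundex code for a non-empty alphabetic name."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    name = name.lower()
    first = name[0].upper()
    digits = []
    prev = codes.get(name[0], "")  # first letter's code, for adjacency collapsing
    for ch in name[1:]:
        if ch in "hw":
            continue  # h and w are transparent: they do not reset the previous code
        code = codes.get(ch, "")   # vowels map to "" and reset prev
        if code and code != prev:
            digits.append(code)
        prev = code
    return (first + "".join(digits) + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))  # R163 R163 — phonetically similar names collide
```

In a matching pipeline, a key like this typically serves for candidate generation (blocking), with a finer comparison such as edit distance applied within each block.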

So are we going get the criteria being used?
No, not detailed criteria, but we can provide a high-level overview.

Does ONC have any rights to use of winning algorithm?
No. It is your algorithm. ONC wants to encourage continuous improvements through common metrics.

Will the gold standard data be released after the competition?

Are the synthesized data fields related to each other across categories and drawn from actual data? For example, will zip codes be located within the actual state listed or is it just a random number?
We will follow up on this. If the question is about dependencies between different fields, the answer is yes. It is not random; it is designed to look like real-world data.

Can the organizations involved participate in the challenge?

Can any more specific assumptions be inferred about the synthesized data other than that it is representative of the healthcare domain?
I wouldn’t read much into it. Rather, just try to understand what matches look like.

When is the dataset available to download?
June 12th.  We will email registrants and update the challenge website.

Any indication on the percentage of the data having links? Not to give away too much but is it 50%, 25% or less?
We don’t want to give any hints.

What about ties?
The website explains the rules for that.

Currently I'm a team of 1, can I add more team members after I've registered?
Yes. But you will have to add them to the roster.

So what will ultimately determine the truth about the linkages submitted?
The true linkages were determined by identifying duplicates through manual review.

Can we assume the synthesized data is from a single source system or should we consider the corpus to be merged from multiple sources?
In terms of the schema, that is not going to change. In terms of MRNs, I will have to check about that.

Can the same person with different organizational affiliations be a part of two different teams?
I believe you can only be on one team, but I will check on that.

In the match fields listed, why is SSN listed twice? (once after Gender and once at the end)?
It should only be once.

You mentioned you can use a pseudo name for your team, will team members names be published or just team name?
On the leaderboard, just the team name. The only time your real name will be made public is if you win the challenge.

Will there be a data dictionary?


Other questions not answered verbally (some answered online)

Audio gave out at a critical time: If I find ID1 and ID4 linked, should I also submit ID4 and ID1 as linked?

What is the formula for the f-score, it went by too quickly?
Slides will be available on the website, and the formula is there in case we didn't get to your question during the webinar.

Is there a max number of true positive (match) records for any one patient?
There is not a maximum number of true positives imposed on the data set. Participants should submit what they believe are the correct matches.

Does HHS reserve the right to publicize just the winning teams' scores or might they also publicize details about the underlying algorithm?

Does each row of the test data contain an identity of a source system?

The confidence score is optional for submission. So will you be using that score in any of the evaluation? It sounds like you will not.
We will not.

Can the detailed "strict" guidelines for what constitutes a matching record be shared (in particular for those "ambiguous" cases)?

Can you say how the enterprise ID is different from the MRN?

Does the dataset include historical attributes or just the current attributes? Or both?

So it looks like SOUNDEX is ok to use in the process since it's public. If we run our matching methodology through a third party software with their own proprietary algorithm (similar to SOUNDEX), is that ok?
Yes. You can use any algorithm you would like, whether it is open or proprietary.

Wouldn't it be valuable to understand what algorithms best identify matches?

Will there be any challenge process to accommodate revised scores for patient matches found after the beta period?

If there are 3 records that match to one individual, do we separate the id's by a comma on one line.
No; in this case, matches would be presented one pair per line.

Curiosity, how many teams are expected?

Will we know the highest f-score on a daily basis once the 1st submission is done?
Yes, the leaderboard will be updated dynamically as submissions are scored.

How should matching records be identified (by all fields or is Enterprise ID enough to identify a match)?
Matches are identified as enterprise ID pairs, along with a confidence score for the matching pair. See the documentation for details.

Is Enterprise ID unique across all records in the data set?

Have Just Associates published in any form or in any detail the nature of their synthetic data creation methodology?

Once the challenge is complete, what are the next goals? Will the winners be published? Will there be more challenges in the future?

If a confidence score is provided with the results, will that be used in any way during the scoring process? If not, what's the value in providing it?

In the synthetic data set does it follow real world statistics for data duplications?

The submission mentioned a linkage between 2 identifiers. Can there be more than 2 linkages in the answers? e.g., EID1 -> EID4 -> EID7, etc.
Yes, there can, but each line of a submission contains a single matching pair. So such groups of matches would have to be represented on separate lines of a submission.
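A minimal sketch of expanding such a transitive group into per-line pairs (the `EID` values and the comma-separated line format are illustrative assumptions, not the official submission spec):

```python
# Sketch: a group of IDs judged to be the same person must be expanded
# into one unordered pair per submission line.
from itertools import combinations

def group_to_pairs(group):
    """Expand a set of linked IDs into all unordered pairs, sorted for determinism."""
    return [tuple(sorted(pair)) for pair in combinations(sorted(group), 2)]

# A transitive group EID1 -> EID4 -> EID7 becomes three pair lines.
for a, b in group_to_pairs({"EID1", "EID4", "EID7"}):
    print(f"{a},{b}")
```

For a group of n records, this produces n(n-1)/2 lines, so large duplicate clusters expand quadratically in the submission file.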

Can we use algorithm + human decision on some "not so sure" cases to improve the performance?

How will you use the probability number within the link to determine the precision?

Can we use the data in future publications?

This may be more subjective... but is ONC's desire to have the winning submission have better match outcomes of the sample dataset than current leading algorithms would produce with the same data?