November 11, 2020: Our paper describing this year’s Challenge is now available at Physiological Measurement as part of the focus issue on multilead ECG classification. Please cite this paper to describe the Challenge, consider submitting your work to the focus issue, and stay tuned for the launch of next year’s Challenge!
September 28, 2020: Our paper about the Challenge is under review. Please cite Perez Alday EA, Gu A, Shah AJ, Robichaux C, Wong AI, Liu C, Liu F, Rad AB, Elola A, Seyedi S, Li Q, Sharma A, Clifford GD*, Reyna MA*. Classification of 12-lead ECGs: the PhysioNet/Computing in Cardiology Challenge 2020. Physiol Meas. 41 (2020). doi: 10.1088/1361-6579/abc960 to refer to this year’s Challenge.
September 24, 2020: The focus issue in Physiological Measurement is now open for submissions! To submit, create an account and choose “Special Issue Article” and then “Classification of Multilead ECGs”. Please do not use the phrases “PhysioNet Challenge”, “Computing in Cardiology”, or “classification of multilead ECGs” in your title, which should be specific to your contributions.
September 21, 2020: The winners of the 2020 Challenge were announced at CinC in Rimini, Italy on September 16. From 1395 entries (707 successful) by 217 teams, 110 abstracts were accepted for presentation and 41 teams were officially ranked. Results can be found here.
August 24, 2020: The official phase of the Challenge is now over. We will contact teams with scores for recent entries over the next few days, and we encourage teams to choose their favorite entry for evaluation on the full test set. Please see the full announcement on the Challenge forum for details, including important information about preparing for CinC.
July 1, 2020: We are now accepting entries for the official phase of the Challenge. For the first time, we are requiring teams to submit their pretrained models and code for training their models. See the full announcement on the Challenge forum and the previous two announcements for details.
June 8, 2020: We are releasing 4 new tranches of 12-lead ECGs with SNOMED-CT labels to complement the 2 previously released tranches. Altogether, 6 databases with 43,101 labeled recordings are now available. We will reopen the scoring system and release an updated scoring metric in the coming days. See the full announcement on the Challenge forum for details.
June 3, 2020: All abstract acceptances and rejections have been announced. Please check the Google Group announcement for more details.
May 26, 2020: Abstract reviews are now complete, and decisions will be announced within the next week. Please see the updated key dates/deadlines and details on the wild card entries below. See the full announcement on the Challenge Google Group here.
May 12, 2020: Your abstracts are under review, and we hope to release acceptances and rejections by early June. For those who missed the abstract deadline, we will provide an opportunity to qualify as a wild card participant over the summer, so please don’t give up!
May 11, 2020: We have begun the official phase of the Challenge. Please find a new tranche of data posted here with SNOMED-CT codes as diagnoses. Please note that there are some errors or debatable labels in some of the data. Part of the Challenge will be working out how to deal with these issues. In the next few weeks, we will release more data and reopen the scoring system with a new scoring metric.
April 25, 2020: We have reopened the submission system for 5 more days (until 30 April 2020 at 23:59 GMT) to help teams who are able to submit bug-free entries qualify for the Challenge.
March 31, 2020: The CinC abstract deadline has been extended to May 1.
March 16, 2020: The leaderboard is now live.
February 7, 2020: The 2020 Challenge is now open! See below for details.
The electrocardiogram (ECG) is a non-invasive representation of the electrical activity of the heart from electrodes placed on the surface of the torso. The standard 12-lead ECG has been widely used to diagnose a variety of cardiac abnormalities, such as cardiac arrhythmias, and to predict cardiovascular morbidity and mortality. The early and correct diagnosis of cardiac abnormalities can increase the chances of successful treatment. However, manual interpretation of the electrocardiogram is time-consuming and requires skilled personnel with a high degree of training.
Automatic detection and classification of cardiac abnormalities can assist physicians in the diagnosis of the growing number of ECGs recorded. Over the last decade, there have been increasing numbers of attempts to automate 12-lead ECG classification. Many of these algorithms seem to have the potential for accurate identification of cardiac abnormalities. However, most of these methods have only been tested or developed on single, small, or relatively homogeneous datasets. The PhysioNet/Computing in Cardiology Challenge 2020 provides an opportunity to address this problem by providing data from a wide set of sources.
The goal of the 2020 Challenge is to identify clinical diagnoses from 12-lead ECG recordings.
We ask participants to design and implement a working, open-source algorithm that can, based only on the clinical data provided, automatically identify the cardiac abnormality or abnormalities present in each 12-lead ECG recording. The winner of the Challenge will be the team whose algorithm achieves the highest score on recordings in the hidden test set.
The data for this Challenge are from multiple sources:
The first source is the public (CPSC Database) and unused data (CPSC-Extra Database) from the China Physiological Signal Challenge 2018 (CPSC2018), held during the 7th International Conference on Biomedical Engineering and Biotechnology in Nanjing, China. The unused data from CPSC2018 is NOT the test data from CPSC2018; the CPSC2018 test data is included in the final private database that has been sequestered. This training set consists of two sets of 6,877 (male: 3,699; female: 3,178) and 3,453 (male: 1,843; female: 1,610) 12-lead ECG recordings lasting from 6 to 60 seconds. Each recording was sampled at 500 Hz.
The second source is the public St Petersburg INCART 12-lead Arrhythmia Database. This database consists of 75 annotated recordings extracted from 32 Holter records. Each record is 30 minutes long and contains 12 standard leads, each sampled at 257 Hz.
The third source, from the Physikalisch-Technische Bundesanstalt (PTB), comprises two public databases: the PTB Diagnostic ECG Database and PTB-XL, a large publicly available electrocardiography dataset. The first PTB database contains 549 records (male: 377; female: 139), each sampled at 1000 Hz. PTB-XL contains 21,837 clinical 12-lead ECGs (male: 11,379; female: 10,458) of 10 seconds in length, sampled at 500 Hz.
The fourth source is a Georgia database, which represents a unique demographic of the Southeastern United States. This training set contains 10,344 12-lead ECGs (male: 5,551; female: 4,793) of 10 seconds in length, sampled at 500 Hz.
The fifth source is an undisclosed American database, geographically distinct from the Georgia database, containing 10,000 ECGs (all retained as test data).
All data is provided in WFDB format. Each ECG recording has a binary MATLAB v4 file (see page 27) for the ECG signal data and a text file in WFDB header format describing the recording and patient attributes, including the diagnoses (the labels for the recording). The binary files can be read using the load function in MATLAB and the scipy.io.loadmat function in Python; please see our baseline models for examples of loading the data. The first line of the header provides information about the total number of leads and the total number of samples or points per lead. The following lines describe how each lead was saved, and the last lines provide information on demographics and diagnoses. Below is an example header file:
```
A0001 12 500 7500 05-Feb-2020 11:39:16
A0001.mat 16+24 1000/mV 16 0 28 -1716 0 I
A0001.mat 16+24 1000/mV 16 0 7 2029 0 II
A0001.mat 16+24 1000/mV 16 0 -21 3745 0 III
A0001.mat 16+24 1000/mV 16 0 -17 3680 0 aVR
A0001.mat 16+24 1000/mV 16 0 24 -2664 0 aVL
A0001.mat 16+24 1000/mV 16 0 -7 -1499 0 aVF
A0001.mat 16+24 1000/mV 16 0 -290 390 0 V1
A0001.mat 16+24 1000/mV 16 0 -204 157 0 V2
A0001.mat 16+24 1000/mV 16 0 -96 -2555 0 V3
A0001.mat 16+24 1000/mV 16 0 -112 49 0 V4
A0001.mat 16+24 1000/mV 16 0 -596 -321 0 V5
A0001.mat 16+24 1000/mV 16 0 -16 -3112 0 V6
#Age: 74
#Sex: Male
#Dx: 426783006
#Rx: Unknown
#Hx: Unknown
#Sx: Unknown
```
From the first line, we see that the record name is A0001 and that the recording file is A0001.mat. The recording has 12 leads, each sampled at 500 Hz, and contains 7500 samples per lead. From the next 12 lines, we see that each signal was written in 16-bit format with a byte offset of 24 bytes into the .mat file, the amplitude gain is 1000 ADC units per mV, the resolution of the analog-to-digital converter (ADC) used to digitize the signal is 16 bits, and the baseline value corresponding to 0 physical units is 0. The first value of the signal, the checksum, and the lead name are also given for each signal. From the final 6 lines, we see that the patient is a 74-year-old male with a diagnosis (Dx) of 426783006. The medical prescription (Rx), history (Hx), and symptoms or surgery (Sx) are unknown.
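For illustration, here is a minimal Python sketch of loading one recording and its header along the lines described above; it assumes the A0001 files are in the working directory, and the official baseline models remain the reference implementation.

```python
# A minimal sketch of loading one Challenge recording in Python; assumes
# A0001.mat and A0001.hea are in the working directory. See the baseline
# models for the reference implementation.
import numpy as np
from scipy.io import loadmat

def load_recording(record_name):
    # The MATLAB v4 file stores the num_leads x num_samples signal under 'val'.
    signal = np.asarray(loadmat(record_name + '.mat')['val'], dtype=np.float64)

    with open(record_name + '.hea', 'r') as f:
        lines = [line.strip() for line in f]

    # First line: record name, number of leads, sampling frequency, samples per lead.
    fields = lines[0].split()
    num_leads, fs = int(fields[1]), int(fields[2])
    assert signal.shape[0] == num_leads

    # The final comment lines carry the demographics and SNOMED-CT diagnoses.
    dx = []
    for line in lines:
        if line.startswith('#Dx'):
            dx = [code.strip() for code in line.split(':', 1)[1].split(',')]
    return signal, fs, dx

signal, fs, dx = load_recording('A0001')
print(signal.shape, fs, dx)  # e.g. (12, 7500) 500 ['426783006']
```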
Each ECG recording has one or more labels describing different types of abnormalities, given as SNOMED-CT codes. The full list of diagnoses for the Challenge has been posted here as a three-column CSV file: long-form description, the corresponding SNOMED-CT code, and abbreviation. Although these descriptions apply to all of the training data, there may be fewer classes in the test data, and in different proportions. However, every class in the test data is represented in the training data.
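As a small illustration, the mapping file can be read with a few lines of Python; the filename dx_mapping.csv below is an assumption, so use whatever name you saved the posted CSV under.

```python
# A short sketch of reading the posted diagnosis list with the csv module;
# the filename dx_mapping.csv is an assumption -- use whatever name you
# saved the posted CSV under, and drop next(reader) if it has no header row.
import csv

with open('dx_mapping.csv', 'r') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    # Map each SNOMED-CT code to its (long-form description, abbreviation).
    dx_map = {row[1].strip(): (row[0].strip(), row[2].strip()) for row in reader}

print(dx_map.get('426783006'))  # e.g. ('sinus rhythm', 'NSR')
```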
The training data can be downloaded from these links. You can use the MD5 hash to verify the integrity of each download.
If you are unable to use these links to access the data, or if you want to use a command-line tool to access the data through Google Colab, then you can download the training data with these commands:
```
wget -O PhysioNetChallenge2020_Training_CPSC.tar.gz \
  https://cloudypipeline.com:9555/api/download/physionet2020training/PhysioNetChallenge2020_Training_CPSC.tar.gz/
wget -O PhysioNetChallenge2020_Training_2.tar.gz \
  https://cloudypipeline.com:9555/api/download/physionet2020training/PhysioNetChallenge2020_Training_2.tar.gz/
wget -O PhysioNetChallenge2020_Training_StPetersburg.tar.gz \
  https://cloudypipeline.com:9555/api/download/physionet2020training/PhysioNetChallenge2020_Training_StPetersburg.tar.gz/
wget -O PhysioNetChallenge2020_Training_PTB.tar.gz \
  https://cloudypipeline.com:9555/api/download/physionet2020training/PhysioNetChallenge2020_Training_PTB.tar.gz/
wget -O PhysioNetChallenge2020_Training_PTB-XL.tar.gz \
  https://cloudypipeline.com:9555/api/download/physionet2020training/PhysioNetChallenge2020_PTB-XL.tar.gz/
wget -O PhysioNetChallenge2020_Training_E.tar.gz \
  https://cloudypipeline.com:9555/api/download/physionet2020training/PhysioNetChallenge2020_Training_E.tar.gz/
```
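If you prefer to verify the MD5 hashes programmatically, here is a minimal Python sketch; the expected value below is a placeholder, so copy the real hash from the download page.

```python
# A minimal sketch of checking a downloaded tranche against its posted MD5
# hash; the expected value below is a placeholder, not the real checksum.
import hashlib

def md5sum(path, chunk_size=1 << 20):
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

expected = '0123456789abcdef0123456789abcdef'  # placeholder: copy from the download page
actual = md5sum('PhysioNetChallenge2020_Training_CPSC.tar.gz')
print('OK' if actual == expected else 'checksum mismatch: ' + actual)
```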
The test set comprises data from the same sources as some of the training sets, as well as one entirely new database recorded at an institution geographically distinct from those in the training data. Therefore, while there may be a small number of ECGs from patients who appear in both the training and test data, there is at least one test database for which the likelihood of any patient in the training data being represented in the test data is vanishingly small (but not zero).
We are not planning to release the test data at any point, including after the end of the Challenge. Requests for the test data will not receive a response. We do not release test data to prevent overfitting on the test data and claims or publications of inflated performances. We will entertain requests to run code on the test data after the Challenge on a limited basis based on publication necessity and capacity. (The Challenge is largely staged by volunteers.)
Please note that there are bound to be some errors or debatable labels in each database. Although we have updated some of the data and labels from the unofficial period of the Challenge, many errors will persist. Part of the Challenge is working out how to deal with these issues. Some databases have machine-generated labels that were overread by humans, and some have single or multiple human labels, so label quality will vary, as will the demographics and diagnoses. There will be no more updates to the training data from this point onwards.
To participate in the Challenge, you must register here, providing the full names, affiliations and official email addresses of your entire team. The details of all authors must be exactly the same as the details you use to submit your abstract to Computing in Cardiology. You may add (but not subtract) authors later by emailing challenge [at] physionet.org.
For each 12-lead ECG recording, your algorithm must identify a set of one or more classes as well as a probability or confidence score for each class. For example, suppose that your classifier identifies atrial fibrillation (164889003) and a first-degree atrioventricular block (270492004) with probabilities of 90% and 60%, respectively, for a particular 12-lead ECG sample, but it does not identify any other rhythm types. Your code might produce the following output for a single recording (not for each lead):
```
#Record ID
164889003, 270492004, 164909002, 426783006, 59118001, 284470004, 164884008, 429622005, 164931005
1, 1, 0, 0, 0, 0, 0, 0, 0
0.9, 0.6, 0.2, 0.05, 0.2, 0.35, 0.35, 0.1, 0.1
```
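A minimal Python sketch of writing this output for one recording follows; the function name and the values are illustrative, and the official evaluation code remains the authoritative specification of the format.

```python
# A minimal sketch of writing the example output above for one recording;
# the classes, labels, and probabilities are illustrative, and the official
# evaluation code remains the authoritative specification of the format.
def save_output(record_name, classes, labels, probabilities):
    with open(record_name + '.csv', 'w') as f:
        f.write('#' + record_name + '\n')
        f.write(', '.join(classes) + '\n')
        f.write(', '.join(str(int(x)) for x in labels) + '\n')
        f.write(', '.join(str(p) for p in probabilities) + '\n')

save_output('A0001',
            classes=['164889003', '270492004', '426783006'],
            labels=[1, 1, 0],
            probabilities=[0.9, 0.6, 0.05])
```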
The baseline classifiers are simple logistic regression models. The Python classifier uses statistical moments of RR intervals, computed with the Python Online and Offline ECG QRS Detector based on the Pan-Tompkins algorithm, and demographic data taken directly from the WFDB header file (the .hea file) as predictors. The MATLAB classifier uses the PhysioNet Cardiovascular Signal Toolbox and ECGKit to compute global electrical heterogeneity (GEH) features from XYZ median beats, together with demographic data taken directly from the WFDB header file (the .hea file), as predictors.
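To make the recipe concrete, here is a rough Python sketch of the RR-interval-moment idea behind the Python baseline; the QRS detector is abstracted away and all data below is synthetic, so treat it as an outline rather than the baseline itself.

```python
# A rough sketch of the Python baseline's recipe: statistical moments of the
# RR intervals plus demographics feeding a logistic regression. The
# Pan-Tompkins-based QRS detector is abstracted away, and all data below is
# synthetic and purely illustrative.
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.linear_model import LogisticRegression

def rr_moments(qrs_samples, fs):
    rr = np.diff(qrs_samples) / fs  # RR intervals in seconds
    return np.array([np.mean(rr), np.std(rr), skew(rr), kurtosis(rr)])

rng = np.random.default_rng(0)
X, y = [], rng.integers(0, 2, size=100)  # synthetic binary labels for one class
for _ in range(100):
    qrs = np.cumsum(rng.integers(300, 600, size=20))  # fake detections at 500 Hz
    age, sex = rng.integers(20, 90), rng.integers(0, 2)
    X.append(np.concatenate([rr_moments(qrs, fs=500), [age, sex]]))

model = LogisticRegression(max_iter=1000).fit(np.vstack(X), y)
print(model.predict_proba(np.vstack(X))[:3, 1])  # per-recording probabilities
```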
Please use the code for these baseline models as a template for your submissions. Please see the submission instructions for detailed information about how to submit a successful Challenge entry, and submit here when ready. We will provide feedback on your entry as soon as possible, so please wait at least 72 hours before contacting us about the status of your entry.
For the first time in any public competition, we will require code both for your trained model and for training your model. If we cannot reproduce your model from the training code, then you will not be eligible for ranking or a prize. Submissions of training code will begin during the official phase of the Challenge.
For this year’s Challenge, we developed a new scoring metric that awards partial credit to misdiagnoses that result in similar treatments or outcomes as the true diagnosis, as judged by our cardiologists. This scoring metric reflects the clinical reality that some misdiagnoses are more harmful than others and should be scored accordingly. Moreover, it reflects the fact that confusing some classes is much less harmful than confusing other classes. It is defined as follows:
Let C = [c_i] be a collection of diagnoses. We compute a multi-class confusion matrix A = [a_ij], where a_ij is the number of recordings in a database that were classified as belonging to class c_i but actually belong to class c_j. We assign different weights W = [w_ij] to different entries in this matrix based on the similarity of treatments or differences in risks. The score s is given by s = Σ_ij w_ij a_ij, which is a generalized version of the traditional accuracy metric. The score s is then normalized so that a classifier that always outputs the true class(es) receives a score of 1 and an inactive classifier that always outputs the normal class receives a score of 0.
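As a worked illustration, here is a minimal Python sketch of this generalized score and its normalization; the identity weight matrix and the confusion matrices are stand-ins, not the cardiologist-assigned weights, and the official evaluation code handles the full multilabel case.

```python
# A worked toy example of the generalized score and its normalization; the
# identity weight matrix and the confusion matrix are stand-ins, not the
# cardiologist-assigned weights.
import numpy as np

def score(A, W):
    # A[i, j]: recordings classified as class i that actually belong to class j.
    return float(np.sum(W * A))

W = np.eye(3)                    # stand-in weights; the real W gives partial credit
A_model = np.array([[8, 1, 0],
                    [2, 7, 1],
                    [0, 2, 9]])  # a hypothetical classifier's confusion matrix
counts = A_model.sum(axis=0)     # recordings per true class
A_true = np.diag(counts)         # classifier that always outputs the true class
A_normal = np.zeros_like(A_model)
A_normal[0] = counts             # classifier that always outputs the normal class (index 0 here)

s = (score(A_model, W) - score(A_normal, W)) / (score(A_true, W) - score(A_normal, W))
print(round(s, 3))  # 1 = perfect classifier, 0 = always-normal classifier
```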
The scoring metric is designed to award full credit to correct diagnoses and partial credit to misdiagnoses with similar risks or outcomes as the true diagnosis. Therefore, true positives are rewarded, false negatives are partially rewarded, and false positives are effectively penalized by receiving no credit at all, or, equivalently, by reducing the credit for true positives and false negatives. (True negatives are technically neither rewarded nor penalized by this metric.) A classifier that returns only positive outputs should now receive a negative score, i.e., a lower score than a classifier that returns only negative outputs.
See the leaderboard for the current scores from the Challenge.
There are two phases for the Challenge: an unofficial phase and an official phase. The unofficial phase allows us to introduce and “beta test” the data, scores, and submission system before the official phase. Participation in the unofficial phase is required to participate in the official phase of the Challenge because it helps us to improve the official phase.
Entrants may have an overall total of up to 15 scored entries over both the unofficial and official phases of the competition (see the table below). All deadlines occur at 11:59pm GMT (UTC) on the dates mentioned below, and all dates are in 2020 unless indicated otherwise. If you do not know the difference between GMT and your local time, then find out before the deadline! Please do not wait until the deadline to submit your entries, because you will be unable to resubmit them if there are unexpected errors or issues with your submissions.
| Phase/event | Start | End | Entries |
|---|---|---|---|
| Unofficial phase | 7 February 2020 | 30 April 2020 | 1-5 scored entries (*) |
| Hiatus | 1 May 2020 | 10 May 2020 | N/A |
| Abstract deadline | 1 May 2020 | 1 May 2020 | 1 abstract |
| Official phase | 11 May 2020 | 23 August 2020 | 1-10 scored entries (*) |
| Challengers notified of abstract accept/reject decisions | 1 June 2020 | 3 June 2020 | N/A |
| Wild card deadline | 28 July 2020 | 28 July 2020 | 1-10 scored entries (*) |
| Wild card eligibility notification | 29 July 2020 | 29 July 2020 | N/A |
| Wild card abstract submission deadline | 5 August 2020 | 5 August 2020 | 1 abstract |
| Hiatus | 24 August 2020 | 12 September 2020 | N/A |
| Preprint deadline | 6 September 2020 | 6 September 2020 | One 4-page paper (**) |
| Conference | 13 September 2020 | 16 September 2020 | One 4-page paper (***) |
| Final scores released | 17 September 2020 | 24 September 2020 | N/A |
| Final paper submitted | 25 September 2020 | 30 September 2020 | One 4-page paper (***) |
(* Entries that fail to score do not count against limits.)
(** Must include preliminary scores.)
(*** Must include your final score, your ranking in the Challenge, and any updates to your work as a result of feedback after presenting at CinC.)
To be eligible for the open-source award, you must do all the following:
If your abstract is rejected, or you failed to qualify during the unofficial period, then there is still a chance to win the Challenge. Two “wild card” entries have been reserved in the conference program (poster presentations only) for this purpose. On 28 July (23:59 UTC), we will select the two top-scoring entries, using the official scoring metric at that time, and offer those two teams the opportunity to submit an abstract directly to the Challenge organizers. The abstract will be reviewed as thoroughly as any other abstract accepted for the conference, and the team must submit an abstract of an acceptable standard. See Advice on Writing an Abstract.
To improve your chances of having your abstract accepted, we offer the following advice:
- Make sure all the authors match your registration information and use the same email addresses.
- Stick to the word limit (check the conference website for updates, but it is usually between 250 and 300 words).
- Make sure all your co-authors agree on the abstract.
- Submit your abstract well before the deadline, leaving time for errors, internet outages, etc.
- When submitting, you will be asked for the topic; please select “PhysioNet/CinC Challenge” so that the abstract review committee can identify it easily. However, do not include the words “PhysioNet”, “PhysioNet/CinC”, or “Challenge” in the title, which creates confusion with the hundreds of other articles and with the main descriptor of the Challenge.
- Although your work is bound to change, the quality of your abstract is a good indicator of the final quality of your work. Spell check, write in full sentences, and be specific about your approaches.
- Include your cross-validated training performance (using the Challenge metrics) and the score provided by the Challenge submission system. If you omit or inflate the latter score, then your abstract will be rejected. If you are unable to get the scoring system working, then you can still submit, but the work should be of very high quality.
- Your title, abstract, and author list (collaborators) can be modified in September when you submit the final paper, so do not be embarrassed by any low scores. We do not expect high scores at this stage; we are focused on the thoughtfulness of the approach and the quality of the abstract.
You will be notified by email from CinC in June whether your abstract has been accepted. You may not enter more than one abstract describing your work in the Challenge. We know you may have multiple ideas, and the actual abstract will evolve over the course of the Challenge; this is OK. More information, particularly on discounts and scholarships, can be found here. We are sorry, but the Challenge Organizers do not have extra funds to provide discounts or funding to attend the conference.
Given the extended deadline for the unofficial phase of the Challenge, we would like to emphasize the following points.
We cannot guarantee that your code will be run in time for the cinc.org abstract deadline, especially if you submit your code immediately before the deadline.
It is much more important to focus on writing a high-quality abstract describing your work and to submit it to the conference by the abstract deadline. Please follow the instructions here carefully.
Please make sure that the author list on your abstract matches your team registration details. If you need to add or subtract authors, do this at least a week before the abstract deadline (i.e., now). Asking us to alter team membership near or after the deadline will lead to confusion that could affect your score during review. It is better to be more inclusive on the abstract in terms of authorship, though: if we find that authors have moved between abstracts/teams without permission, this is likely to lead to disqualification. As noted above, you may change the authors/team members later in the Challenge.
Please make sure that you include your cross-validated training performance and your official score as it appears on the leaderboard in your abstract for this year’s Challenge (especially if you are unable to receive a score or are scoring poorly). Your score will not affect acceptance; it is the novelty of your approach and the rigor of your research that matters at this point. Please describe your technique and any novelty very specifically. General statements such as “a 1D CNN was used” are uninformative and will score poorly in review.
The organizers of the Challenge have no ability to help with any problems with the abstract submission system. We do not operate it. Please do not email us with issues related to the abstract submission system.
We encourage the use of open-source licenses for your entries.
Entries with non-open-source licenses will be scored but not ranked in the official competition. All scores will be made public. At the end of the competition, all entries will be posted publicly, and therefore automatically mirrored on several sites around the world. We have no control over these sites, so we cannot remove your code even on request. Before the end of the competition, your code is not publicly available, and you may withdraw it at any time until the end of the Challenge in August. However, the Organizers reserve the right to retain and use a copy of the code for non-commercial purposes. This allows us to re-score entries if definitions change and to validate any claims made by competitors.
If no license is specified by the participant, the organizers will assume a BSD 3-Clause license.
To maintain the scientific impact of the Challenges, it is important that all Challengers contribute truly independent ideas. For this reason, we impose the following rules on team composition/collaboration:
If we discover evidence of the contravention of these rules, then you will be ineligible for a prize and your entry will be publicly marked as possibly associated with another entry. Although we will contact the team(s) in question, time and resources are limited, and the Organizers must use their best judgement on the matter in a short period of time. The Organizers’ decision on rule violations will be final.
CinC 2020 will take place from 13-16 September 2020 in Rimini, Italy. You must attend the whole conference to be eligible for prizes; for 2020, remote attendance is acceptable. If you send someone in your place who is not a team member or co-author, you will be disqualified and your abstract will be removed from the proceedings. It is vital that the presenter (oral or poster) can defend your work and has an in-depth knowledge of all decisions made during the development of your algorithm. No exceptions will be made. If you require a visa to attend the conference, we strongly suggest that you apply as soon as possible. Please contact the local conference organizing committee (not the Challenge Organizers) for visa sponsorship letters or any questions concerning the conference.
Due to the uncertainties around travel, we have unfortunately decided not to run the Hackathon this year.
This year’s Challenge is generously co-sponsored by Google, MathWorks, and the Gordon and Betty Moore Foundation.
MathWorks has generously decided to sponsor this Challenge by providing complimentary licenses to all teams that wish to use MATLAB. Users can apply for a license and learn more about MATLAB support by visiting the PhysioNet Challenge page from MathWorks. If you have questions or need technical support, then please contact MathWorks at email@example.com.
Google has generously agreed to provide Google Cloud Platform (GCP) credits for up to 40 teams for this Challenge. We will award these to the top performing teams each month. These credits should provide an added incentive to submit more entries earlier on, and give teams the maximum opportunity to learn before spending money in the cloud.
At the time of launching this Challenge, Google Cloud offers multiple services for free on a one-year trial basis and $300 in cloud credits. Additionally, if teams are based at an educational institution in selected countries, then they can access free GCP training online.
Google Cloud credits of $500 per team will be made available to teams (that requested credits when registering for the Challenge) with both a successful entry to the official phase of the Challenge and an accepted abstract to CinC. Only one credit of exactly $500 will be provided to one email address associated with each team. An upper limit of $20,000 in credits will initially be made available to teams based on Challenge scores.
The Challenge Organizers, their employers, PhysioNet and Computing in Cardiology accept no responsibility for the loss of credits, or failure to issue credits for any reason. Please note, by requesting credits, you are granting us permission to forward your details to Google for the distribution of credits. You can register for these credits during the Challenge registration process.
Supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant R01EB030362.