  • Deep Learning Algorithms Applied to ECG Could Form New Type of Valvular Heart Disease Screening

    Machine learning analysis of 12-lead electrocardiography (ECG) could provide a new type of valvular heart disease (VHD) screening program, according to researchers whose new study found that “deep learning” algorithms could accurately detect aortic stenosis (AS), aortic regurgitation (AR) and mitral regurgitation (MR).

    The study – which included an analysis of 77,156 patients undergoing digitally stored 12-lead ECG before an echocardiogram – was published Monday online, ahead of the Aug. 9 issue of the Journal of the American College of Cardiology.

    Despite well-recognized physical examination findings of VHD, a substantial portion of these conditions go undiagnosed, said the authors, led by Columbia University Irving Medical Center and New York-Presbyterian Hospital’s Pierre Elias, MD, and Timothy J. Poterucha, MD.

    They cited epidemiological estimates that moderate or severe VHD is present in as many as 11% to 13% of people over the age of 75, half of whom are undiagnosed.

    “Given the increasing prevalence of VHD and improvements in transcatheter, device, and surgical treatment options, early detection and treatment of VHD represents a pressing clinical need,” the researchers stressed, adding that widespread digitization of electronic health record (EHR) data over the last two decades makes deep learning feasible.  

    Deep learning refers to a class of machine learning algorithms that leverage multiple computation layers of neural networks, the researchers added, noting it has been “particularly successful when applied to complex medical data such as radiologic studies or [ECGs].”

    The current study therefore set out to develop deep learning algorithms and apply them to large-scale data from patients 18 years or older who underwent digitally stored 12-lead ECG before an echocardiogram at Columbia University Irving Medical Center, Morgan Stanley Children’s Hospital of New York or New York-Presbyterian Allen Hospital from 2005 to 2021.

    The 77,156 patients were split into a “train” set (43,165), used to train the machine learning technology to recognize signs of the VHDs; a validation set (12,950); and a test cohort (21,048, of whom 7.8% were diagnosed with AS, AR or MR), in which the developed algorithms were used to seek out the conditions.

    For the validation and test sets, only the newest ECG per patient was included.
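
    The "newest ECG per patient" rule can be sketched as a simple deduplication step. This is a minimal illustration only: the record fields (patient_id, ecg_id, acquired) and the sample rows are hypothetical and do not come from the study.

```python
from datetime import date

def newest_ecg_per_patient(records):
    """Keep only the most recent ECG record for each patient.

    `records` is a list of dicts with hypothetical keys:
    'patient_id', 'ecg_id' and 'acquired' (a datetime.date).
    """
    latest = {}
    for rec in records:
        pid = rec["patient_id"]
        # Replace the stored record only if this one is more recent.
        if pid not in latest or rec["acquired"] > latest[pid]["acquired"]:
            latest[pid] = rec
    return list(latest.values())

# Hypothetical sample data, not taken from the study.
records = [
    {"patient_id": "A", "ecg_id": "e1", "acquired": date(2015, 3, 1)},
    {"patient_id": "A", "ecg_id": "e2", "acquired": date(2019, 7, 12)},
    {"patient_id": "B", "ecg_id": "e3", "acquired": date(2012, 1, 5)},
]
deduplicated = newest_ecg_per_patient(records)
kept_ids = sorted(rec["ecg_id"] for rec in deduplicated)
```

    Keeping one ECG per patient in the evaluation sets means each patient contributes a single prediction to the reported metrics.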

    The model was also tested on an independent ECG-echocardiographic data set at New York-Presbyterian Lawrence Hospital. Patients in the three study cohorts were of similar mean age (train: 62.6 years; validation: 63.6 years; test: 61.5 years), though those in the New York-Presbyterian Lawrence independent cohort were older, with a mean age of 70.7 years.

    In all groups, the majority of patients were male (49.8%, 49.5%, 49.8% and 45.8% of patients, respectively, were female).

    Most patients had no/trace AS (train: 83.1%; validation: 87.7%; test: 90.2%; and independent: 89.4%) and no/trace AR (73.6%; 81.3%; 87.9%; and 71.6%, respectively).

    In the train group, 40.1% of patients had no/trace MR while 38.3% had mild MR. In the other groups, the majority of patients had no/trace MR (validation: 59.1%; test: 73.2%; and independent: 52.5%).


    “When the model is applied to an ECG, the model output is a number that ranges from 0-1, with a value closer to 1 indicating a higher model confidence that the ECG is from a patient with moderate or severe VHD,” the researchers said.

    Model performance was assessed using the area under the receiver-operating characteristic curve (AU-ROC). The deep learning model's performance was: AS (AU-ROC: 0.88; 95% confidence interval [CI]: 0.87-0.90), AR (AU-ROC: 0.77; 95% CI: 0.72-0.81) and MR (AU-ROC: 0.83; 95% CI: 0.81-0.85).
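
    For readers unfamiliar with the metric, the area under the ROC curve can be read as the probability that a randomly chosen patient with disease receives a higher model score than a randomly chosen patient without it. The sketch below computes it directly from that definition. The labels and scores are hypothetical toy values, not study data, and this is not the researchers' code.

```python
def auroc(labels, scores):
    """Area under the ROC curve from pairwise comparisons.

    labels: 1 for disease present, 0 for absent.
    scores: model outputs between 0 and 1 (higher = more confident).
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 3 patients with disease, 5 without.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.91, 0.62, 0.35, 0.70, 0.30, 0.22, 0.10, 0.05]
toy_auroc = auroc(labels, scores)
print(f"toy AU-ROC: {toy_auroc:.2f}")
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for cohorts of the size described here.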

    For the composite of any of AS, AR, or MR, the AU-ROC was 0.84 (sensitivity 78%, specificity 73%), with similar accuracy in external validation.

    The model was tested in subgroups defined by sex, age, race, ethnicity, and QRS duration and morphology. It remained accurate in both sexes, but its accuracy decreased with increasing age, the researchers said.

    For the composite of diseases, model accuracy did not differ significantly by race or ethnicity: there was no significant difference in performance between Hispanic and non-Hispanic patients or between Black and White patients. However, the researchers did note a numeric trend toward worse performance in Black patients (Black patient AU-ROC: 0.78; 95% CI: 0.73-0.84; vs. White patient AU-ROC: 0.84; 95% CI: 0.82-0.86; vs. Hispanic patient AU-ROC: 0.82; 95% CI: 0.78-0.86).

    The composite model for any of AS, AR, or MR was also less accurate in patients with longer QRS duration. “Specifically, the presence of a left bundle branch block (LBBB) resulted in marked impairment in detection of AR and MR,” said the researchers.

    Using the test data, the researchers generated precision-recall curves for each condition at disease prevalence levels ranging from 1% to 10%, with test characteristics calculated at varying recall (sensitivity) levels.

    “At an AS, AR, or MR prevalence level of 5% and a sensitivity of 50%, the positive and negative predictive values were 22.8% and 97.2%, respectively. At a prevalence level of 10%, the positive predictive value increased to 36.7%, whereas at a prevalence of 1%, the positive predictive value decreased to 5.1%,” they added.
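
    The dependence of predictive values on prevalence follows from Bayes' theorem, and can be sketched as below. One assumption needs flagging: the article does not state the specificity at the 50% sensitivity operating point quoted above, so this illustration instead plugs in the composite sensitivity (78%) and specificity (73%) reported earlier. Its outputs are therefore not expected to reproduce the quoted percentages, which the researchers derived from precision-recall curves.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All arguments are proportions strictly between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Assumed operating point: the composite sensitivity/specificity from the
# article, NOT the 50%-sensitivity point used for the quoted PPV/NPV.
SENSITIVITY, SPECIFICITY = 0.78, 0.73

results = {}
for prevalence in (0.01, 0.05, 0.10):
    results[prevalence] = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    ppv, npv = results[prevalence]
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

    The pattern matches the one the researchers describe: with sensitivity and specificity held fixed, positive predictive value rises as the condition becomes more common in the screened population, while negative predictive value falls slightly.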

    Deep learning’s place in VHD diagnosis

    “Deep learning analysis of the 12-lead ECG can accurately detect moderate or severe AS, AR, and MR in this multiple center cohort,” the researchers concluded.

    “Based on the observed performance characteristics of our model, this approach may serve as the basis for the development of a valvular heart disease screening program.”

    They added that the use of deep learning in ECG analysis is already developing rapidly, citing other studies that used similar deep learning model architectures to analyze ECG waveform data to detect AS, MR and AR.

    “Implementation studies are needed to assess the cost-effectiveness of deep learning technology to identify patients with structural heart disease who benefit from referral for longitudinal surveillance and/or intervention,” the authors continued.

    Wearables and portable devices

    In an accompanying editorial, Ambarish Pandey, MD, MSCS, from UT Southwestern Medical Center, Dallas, and Demilade Adedinsewo, MD, MPH, from Mayo Clinic, Jacksonville, Florida, added that digital health data are growing exponentially thanks to remote ECG capture devices such as implantable loop recorders.

    Wearables and other portable devices can “further augment the already promising role of [artificial intelligence]-based cardiovascular disease screening in the future,” they said, adding that: “Although using ECG to predict disease is a good start, there is no reason that data input should be limited to ECG in future models.”


    Elias P, Poterucha TJ, Rajaram V, et al. Deep Learning Electrocardiographic Analysis for Detection of Left-Sided Valvular Heart Disease. J Am Coll Cardiol 2022;80:613-626.

    Pandey A, Adedinsewo D. The Future of AI-Enhanced ECG Interpretation for Valvular Heart Disease Screening. J Am Coll Cardiol 2022;80:627-630.

