Although Apple, based on its internal research, claims that the Apple Watch (AW) ECG has a 98% sensitivity and a 99% specificity for detection of atrial fibrillation, doubts have been raised about its accuracy in the real world.
I have recently reported on Apple Watch’s inability to diagnose atrial fibrillation (AF) when the heart rate is >120 beats per minute. This inherent limitation means AW has a built-in reduced sensitivity (which was not present in the testing group).
In a Research Letter published online Feb. 24th in Circulation, Dr. Marc Gillinov reports on the accuracy of Apple Watch in a population of patients who were post cardiac surgery and therefore on cardiac telemetry, with a high risk of going in and out of AF.
Rhythm assessments using the Apple Watch ECG were performed 3 times per day over 2 days on 50 patients. Comparison was made between the watch reading (Sinus rhythm, AF, or inconclusive) and an expert human interpretation of the PDF from the watch and simultaneously obtained telemetry rhythm strip.
The results were disappointing for the AW.
The AW4 notification correctly identified AF in 34 of 90 instances, yielding a sensitivity of 41%. Of 25 patients with at least 1 episode of AF, AF was identified in 19. Among patients in SR, none was designated as AF (ie, no false positives); however, rhythm was deemed inconclusive in 31% of patients, and there was no additional attempt to assess rhythm. Overall agreement between AW4 notification and telemetry was 61% (κ statistic = 0.33 [95% CI, 0.24–0.41]).
This confirms my prediction that AW would identify less than half of AF cases.
I have to believe that the 29 cases diagnosed as “inconclusive” were due to the AW’s inherent blindness to AF at rapid heart rates. If we presume these would all have been correctly identified as AF (if the AW had not been hamstrung), then the sensitivity increases to 70%.
The authors of this article don’t seem to understand the difference between unreadable (meaning too much artifact to make a diagnosis) and inconclusive (which Apple only uses when the heart rate is > 120 BPM). They conclude by saying:
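The arithmetic behind that 70% figure can be sketched in a few lines. This is only an illustration of my what-if calculation, not anything from the letter itself: the function name and constants are my own, and the key assumption (every inconclusive reading was AF that would otherwise have been detected) is labeled in the code. Note that 34 of 90 by itself works out to about 38%, a little below the 41% quoted, so the letter may have used a different denominator for that figure.

```python
def sensitivity(true_positives: int, total_positives: int) -> float:
    """Fraction of true AF instances that were flagged as AF."""
    if total_positives <= 0:
        raise ValueError("total_positives must be positive")
    return true_positives / total_positives


AF_INSTANCES = 90   # AF instances, as quoted from the letter
DETECTED = 34       # instances the watch flagged as AF, as quoted
INCONCLUSIVE = 29   # inconclusive readings discussed in the text

# Unadjusted proportion using the quoted counts.
raw = sensitivity(DETECTED, AF_INSTANCES)

# Assumption (mine, not the letter's): every inconclusive reading was AF
# that the watch would have detected without the heart-rate cutoff.
adjusted = sensitivity(DETECTED + INCONCLUSIVE, AF_INSTANCES)

print(f"raw: {raw:.1%}, adjusted: {adjusted:.0%}")
```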
The unreadable (ie, inconclusive) rate reported in that study was 6% compared with 31% in this pilot study.
They have muddled together unreadable and inconclusive.
I do strongly agree with their final conclusions:
Variations in sensitivity between these 2 studies suggest the need for further validation before this technology is adopted by the public for AF detection. Physicians should exercise caution before undertaking action based on electrocardiographic diagnoses generated by this wrist-worn monitor.
Indeed, any diagnosis from the Apple Watch itself should be confirmed by a cardiologist who is an expert at interpreting these single-lead ECG recordings.
The Apple Heart Study received great fanfare at last year’s AHA meetings and was subsequently published in the NEJM. Many Apple Watch (AW) wearers, having heard of this study, may have concluded the device will reliably identify atrial fibrillation (AF).
In my commentary on the Apple Heart Study I pointed out several issues with relying on Apple Watch for AF diagnosis, most significantly false positive notifications. Recent patient experiences have, in addition, made me concerned about false negative notifications and a lack of sensitivity.
AW ECG is inherently limited in diagnosing AF above 120 BPM. This guarantees a substantial number (possibly the majority) of AF episodes will not be recognized. Such false negative notifications may falsely reassure patients that they don’t have AF and delay them seeking medical attention.
Recently, I saw a patient who was referred to me for an abnormal 12-lead ECG. While reviewing his symptoms we discovered that his AW had registered high heart rates, sometimes up to 150 beats per minute, which lasted for several hours.
Although the AW had recorded this high heart rate, it had not notified him of the possibility that he had atrial fibrillation or even that he had a high heart rate.
He had made the ECG recording below using the AW and the results came back inconclusive.
The AW ECG recording clearly shows atrial fibrillation going at a rapid rate, over 150 beats per minute, but the accompanying interpretation gives no hint that the patient had AF.
Based on the combination of an absence of any irregular heart rate/AF warnings from his AW and the absence of a diagnosis of AF when he made AW ECG recordings of the fast rates, the patient assumed that he did not have atrial fibrillation.
Why is this? Apparently Apple has decided not to check for AF if the heart rate is over 120 BPM.
Given that most patients with new-onset AF will have heart rates over 120 BPM (assuming they are not on a rate-slowing drug like a beta-blocker), it appears likely that Apple Watch ECG will fail to diagnose most cases of AF.
I asked my patient to record an ECG with his watch every time he felt his heart racing after our office visit. A few days later he was sitting in an easy chair after Thanksgiving watching TV and had another spell of racing heart. This time the heart rate was less than 120 BPM and the AW was able to analyze and make the diagnosis.
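To make the behavior I am describing concrete, here is a toy model of a rhythm check that is skipped above a rate cutoff. It is a hypothetical sketch based only on what I observed, not Apple's actual algorithm: the function name, the `rhythm_is_af` flag (standing in for the true rhythm), and the example rate of 110 BPM are all my own inventions. Only the 120 BPM cutoff comes from the text.

```python
from typing import Literal

Reading = Literal["atrial fibrillation", "sinus rhythm", "inconclusive"]

RATE_CUTOFF_BPM = 120  # cutoff described in the text


def watch_style_reading(heart_rate_bpm: float, rhythm_is_af: bool) -> Reading:
    """Hypothetical model: no AF check is attempted above the rate cutoff."""
    if heart_rate_bpm > RATE_CUTOFF_BPM:
        # The rhythm is never examined, so AF at fast rates goes unreported.
        return "inconclusive"
    return "atrial fibrillation" if rhythm_is_af else "sinus rhythm"


# First episode: AF at about 150 BPM, as in my patient's recording.
print(watch_style_reading(150, rhythm_is_af=True))
# Later episode: AF below the cutoff (110 BPM is an illustrative value).
print(watch_style_reading(110, rhythm_is_af=True))
```

In this model the same rhythm yields two different readings depending only on the rate, which is exactly what my patient experienced.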
The inability of AW ECG to diagnose AF when the rate is >120 BPM further adds to my concerns about widespread unsupervised use of the device. When we combine inconclusive high heart rate analyses with the unknown sensitivity of the irregular heartbeat notification algorithm, the AW may be providing many patients who have atrial fibrillation with a false sense of security.
The skeptical cardiologist has been testing the comparative accuracy of two hand-held mobile ECG devices in his office over the last month. I’ve written extensively about my experience with the AliveCor/Kardia (ACK) device here and here. Most recently I described my experience with the Afib Alert (AFA) device here.
Over several days I had my office patients utilize both devices to record their cardiac rhythm and I compared the device diagnosis to the patient’s true cardiac rhythm.
In 14 patients both devices correctly identified normal sinus rhythm. AFA does this by displaying a green check mark, ACK by displaying the actual recording on a smartphone screen along with the word Normal.
The AFA ECG can subsequently be uploaded via USB connection to a PC and reviewed in PDF format. The ACK PDF can be viewed instantaneously and saved or emailed as PDF.
Normal by AFA/Unreadable or Unclassified by AliveCor
In 5 patients in normal rhythm (NSR), AFA correctly identified the rhythm but ACK was either unreadable (3) or unclassified (2). In the not infrequent case of a poor ACK tracing I will spend extra time adjusting the patient’s hand position on the electrodes or stabilizing the hands. With AFA this is rarely necessary.
In this 70 year old man the AFA device recording was very good and the device immediately identified the rhythm as normal.
The ACK recording was good quality but its algorithm could not classify the rhythm.
A 68 year old man who had had bypass surgery and aortic valve replacement had a very good quality AFA recording with correct classification as NSR.
AliveCor/Kardia recordings on the same patient, despite considerable and prolonged efforts to improve the recording, were poor and were classified as “unreadable”.
There were 3 cases where AFA diagnosed atrial fibrillation (AF) and the rhythm was not AF. These are considered false positives and can lead to unnecessary concern when the device is being used by patients at home. In 2 of these ACK was unreadable or unclassified and in one ACK also diagnosed AF.
A 90 year old woman with right bundle branch block (RBBB) in NSR was classified by AFA as being in AF.
The ACK algorithm is clearly more conservative than AFA. The ACK manual states:
If you have been diagnosed with a condition that affects the shape of your EKG (e.g., intraventricular conduction delay, left or right bundle branch block, Wolff-Parkinson-White Syndrome, etc.), experience a large number of premature ventricular or atrial contractions (PVC and PAC), are experiencing an arrhythmia, or took a poor quality recording it is unlikely that you will be notified that your EKG is normal.
One man’s rhythm confounded both AFA and ACK. This gentleman has had atrial flutter in the past and records his rhythm at home daily using his own AliveCor device, which he uses in conjunction with an iPad.
During our office visits we review the recordings he has made. He was quite bothered by the fact that he had several that were identified by AliveCor as AF but in fact were normal.
A recording he made on May 2nd at 8:45 pm was read as unclassified but with a heart rate of 149 BPM. The rhythm is actually atrial flutter with 2:1 block.
Sure enough, when I recorded his rhythm with ACK, although it was NSR (with APCs), it was read as unclassified.
AFA classified Lawrence’s rhythm as AF when it was in fact normal sinus with APCs.
In one patient, a 50 year old woman who has a chronic sinus tachycardia and typically has a heart rate in the 130s, both devices failed.
We could have anticipated that ACK would call her rhythm unclassified due to a HR over 100. Worse than unclassified, the tracing obtained on her by ACK (on the right) was terrible and unreadable until the last few seconds. On the other hand, the AFA tracing was rock solid throughout and clearly shows p waves and a regular tachycardia. For unclear reasons, however, the AFA device diagnosed this as AF.
Accuracy in Patients In Atrial Fibrillation
In 2/4 patients with AF, both devices correctly classified the rhythm.
In one patient AFA correctly diagnosed AF whereas ACK called it unclassified.
This patient was in afib with HR over 100. AFA correctly identified it whereas ACK called it unclassified. The ACK tracing was noisy in the beginning but towards the end one can clearly diagnose AF.
In one 90 year old man AFA could not make the diagnosis (yellow).
ACK correctly identified the rhythm as AF.
One patient whom I had recently cardioverted from AF was the only ACK false positive. The AliveCor tracing is poor quality and was called AF whereas AFA correctly identified NSR.
The sensitivity of both devices for detecting atrial fibrillation was 75%.
The specificity of AFA was 86% and that of ACK was 88%.
ACK was unreadable or unclassified 5/26 times or 19% of the time.
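For readers who want to check my math, here is a rough sketch of how these percentages fall out of the counts given above. The helper name is my own, and I am assuming one recording per patient per device, so that the 26 recordings minus the 4 patients in AF leave 22 patients not in AF. I have not reproduced the ACK specificity here because it depends on how unreadable and unclassified tracings are counted.

```python
def percent(numerator: int, denominator: int) -> int:
    """Round a proportion to the nearest whole percent."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator)


TOTAL_RECORDINGS = 26   # from "5/26 times"
AF_PATIENTS = 4         # from "2/4 patients with AF"
# Assumption: one recording per patient, so the rest were not in AF.
NON_AF_PATIENTS = TOTAL_RECORDINGS - AF_PATIENTS

# Each device caught the 2 shared cases plus 1 the other device missed.
afa_sensitivity = percent(2 + 1, AF_PATIENTS)
ack_sensitivity = percent(2 + 1, AF_PATIENTS)

# AFA called AF in 3 patients who were not in AF (false positives).
afa_specificity = percent(NON_AF_PATIENTS - 3, NON_AF_PATIENTS)

# Share of ACK tracings that were unreadable or unclassified.
ack_no_answer = percent(5, TOTAL_RECORDINGS)

print(afa_sensitivity, ack_sensitivity, afa_specificity, ack_no_answer)
```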
The sensitivity and specificity I’m reporting are lower than those reported in other studies, but I think they represent more real world experience with these types of devices.
In a head to head comparison of AFA and ACK mobile ECG devices I found:
-Recordings using AfibAlert are usually superior in quality to AliveCor tracings with a minimum of need for adjustment of hand position and instruction.
-This superiority of ease of use and quality means almost all AfibAlert tracings are interpreted whereas 19% of AliveCor tracings are either unclassified or unreadable.
-Sensitivity is similar. Both devices are highly likely to properly detect and identify atrial fibrillation when it occurs.
-AliveCor specificity is superior to AfibAlert. This means fewer cases that are not AF will be classified as AF by AliveCor compared to AfibAlert. This is due to a more conservative algorithm in AliveCor which rejects wide QRS complexes and frequent extra-systoles.
Both companies are actively tweaking their algorithms and software to improve real world accuracy and improve user experience but what I report reflects what a patient at home or a physician in office can reasonably expect from these devices right now.