Sleep BODY HANDBOOK
Sleep · §196
Sleep Tracker Accuracy
Your sleep tracker knows roughly when you slept and roughly for how long. It does not know how much deep sleep or REM you got — those numbers are educated guesses with error bars wider than most people's night-to-night swings. The data is useful as a long-run trend mirror and, on newer Apple and Samsung watches, as a screen for sleep apnea — a condition most people who have it don't know they have. But treated as a daily report card, it can also do the thing nobody buys it for: make sleep worse.
Know · As-needed · Evidence: Moderate · Chapter: Sleep

The watch on your wrist or the ring on your finger has no sensor on your brain. It reads movement, pulse, and a few autonomic side effects, then guesses at sleep architecture. It gets total sleep time about right. Sleep stages, especially deep sleep, it doesn't. The biggest real win in this category is the FDA-cleared sleep-apnea notification on the latest Apple Watch; the biggest real risk is reading the score as a verdict on a night that felt fine.

Polysomnography — the sleep-lab gold standard — sticks electrodes to your scalp to read your brain's electrical activity, electrodes near your eyes to catch the eye movements that mark REM, electrodes under your chin to check muscle tone, plus heart-rate and breathing channels. Stage names like deep sleep and REM are defined on what those traces show every 30 seconds. Consumer wearables have none of that.

A watch or ring sees your wrist or finger moving (or not), and reads your pulse through the skin with a green light. From those two signals — plus skin temperature and blood-oxygen on some devices — a proprietary algorithm guesses at what an EEG would have shown de Zambotti et al. 2024. The pulse signal does carry information about sleep stages, because your nervous system shifts the heart's beat-to-beat rhythm across light, deep, and REM sleep Altini & Kinnunen 2021. But that rhythm is a downstream echo, not the staging substrate itself. The device is reading the side effects of sleep on the body, not sleep itself — that's the whole reason the numbers are imperfect.

What it actually gets right

Sleep versus wake — the basic "were you asleep at all" question — modern wearables nail. The top rings and watches clear 95% sensitivity for sleep on healthy adults Chee et al. 2024. They lean toward calling quiet wakefulness "sleep," so total time runs slightly long on individual readings; pooled across two dozen validation studies, the average bias is the other way — about 17 minutes less than the lab measures Lee et al. 2024. Either way, you can trust the rough number.
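To make "sensitivity" concrete, here is a minimal Python sketch of how a device could be scored against the lab, one epoch at a time. The labels and the function name are made up for illustration, and real validation studies use full-night recordings and more careful statistics. Sensitivity is the share of lab-scored sleep epochs the device also called sleep; specificity is the share of lab-scored wake epochs it also called wake.

```python
def sleep_wake_agreement(lab, device):
    """Return (sensitivity, specificity) for two equal-length label lists.

    Each label is "sleep" or "wake", one per scoring epoch.
    The lab labels are treated as the truth (an assumption of this sketch).
    """
    if len(lab) != len(device):
        raise ValueError("lab and device must cover the same epochs")

    sleep_total = sum(1 for truth in lab if truth == "sleep")
    wake_total = len(lab) - sleep_total

    # Epochs where the device agrees with the lab, split by true state.
    sleep_hits = sum(1 for truth, guess in zip(lab, device)
                     if truth == "sleep" and guess == "sleep")
    wake_hits = sum(1 for truth, guess in zip(lab, device)
                    if truth == "wake" and guess == "wake")

    sensitivity = sleep_hits / sleep_total if sleep_total else float("nan")
    specificity = wake_hits / wake_total if wake_total else float("nan")
    return sensitivity, specificity


# Hypothetical toy night: 4 wake epochs, then 20 sleep epochs.
lab = ["wake"] * 4 + ["sleep"] * 20
# The device scores two of the quiet wake epochs as sleep.
device = ["wake"] * 2 + ["sleep"] * 22

sens, spec = sleep_wake_agreement(lab, device)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
# prints: sensitivity=1.00 specificity=0.50
```

In this toy night the device catches every sleep epoch but only half the wake epochs, so its sleep total runs two epochs long. That is the pattern described above: strong sleep detection, weaker wake detection.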

Stage breakdown is where the picture gets ugly.

Different devices fail in different directions. The original Oura ring underestimated deep sleep by about 20 minutes per night and overestimated REM by 17 minutes de Zambotti et al. 2019; the Fitbit Charge 4, tested in chronic-insomnia patients, underestimated deep sleep by 41 minutes Liang et al. 2022. The newest Oura algorithm, validated across more than 420,000 sleep epochs in healthy adults, is genuinely closer, with above 90% sensitivity for sleep stages Svensson et al. 2024. But that's the best published case, and it's a ring on healthy people sleeping normally.

For heart-rate variability measured overnight, the news is better. Validated against a medical chest-strap ECG across 536 nights, the latest Oura ring landed within 6% of the medical reference; Whoop within 8%; Garmin and Polar substantially worse Dial et al. 2025. The trend you see night by night on a good device is roughly what a clinical sensor would say.

What people read into the numbers that isn't there

The "deep sleep" line on the dashboard looks like a measurement. It's a guess with a 20-to-40-minute error bar per night. A drop from 1h45m to 55 minutes from Tuesday to Wednesday might mean nothing changed — the noise floor of the device swallows most of the variance you'll see week to week de Zambotti et al. 2019 Liang et al. 2022 Cho et al. 2023.
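The arithmetic behind that claim can be sketched in a few lines of Python. This is a rough illustration, not how any device reports uncertainty: it assumes the per-night error is a hard plus-or-minus bound (the function name and that simplification are mine), so two readings can each be off by the full amount in opposite directions.

```python
def change_within_noise(night_a_min, night_b_min, per_night_error_min):
    """True if the observed change could be produced by device error alone.

    Assumption: each reading may be off by up to per_night_error_min in
    either direction, so the difference of two readings may be off by
    up to twice that.
    """
    observed_change = night_b_min - night_a_min
    worst_case_error = 2 * per_night_error_min
    return abs(observed_change) <= worst_case_error


tuesday, wednesday = 105, 55  # 1h45m and 55 minutes of reported deep sleep

for error in (20, 40):
    verdict = change_within_noise(tuesday, wednesday, error)
    print(f"error ±{error} min: drop explainable by noise? {verdict}")
# prints:
# error ±20 min: drop explainable by noise? False
# error ±40 min: drop explainable by noise? True
```

At the optimistic end of the error range, a 50-minute drop just clears the noise; at the pessimistic end it does not. A single swing of that size is not evidence of anything on its own.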

Sleep scores — Oura Readiness, Whoop Recovery, Fitbit Sleep Score — are proprietary blends of duration, stage guesses, heart rate, HRV, and movement. The weights are not published. None has been validated in a peer-reviewed trial against next-day cognition, mood, or athletic performance Goldstein et al. 2021. Treat the score as a personal trend signal, not a verdict on the night.

And the move that brought "orthosomnia" into the medical vocabulary: trusting the device over your own felt experience. In the original case series, patients came to sleep clinics convinced they barely slept. When they were sent for an actual lab study, the recordings showed normal sleep, and several patients continued to believe the tracker Baron et al. 2017. The wearable's specificity for wake is the weak point: when you're lying still but awake it routinely calls it sleep, and the reverse happens too. Your felt experience is not noise.

What happens if you start chasing the numbers

At first it's interesting. You see your nights laid out for the first time and you make a few small changes — earlier bedtime, less wine on a weekday — and the score moves up. Real wins.

Then a bad number lands on a night that felt fine, and the question shifts from "how do I feel?" to "what did the device say?" A week in, you're spending an extra forty minutes in bed trying to hit a sleep-duration target. The number doesn't move; you lie there awake, which the device also doesn't see clearly. The bed starts to feel like somewhere you go to perform.

This is the failure mode behavioral sleep medicine named in 2017 — orthosomnia, the unhealthy pursuit of perfect sleep, first described in patients who trusted their wearables over their clinicians Baron et al. 2017. It looks like classic insomnia from the outside, because extra time in bed chasing better sleep is exactly what produces insomnia. The first general-population estimate puts prevalence at somewhere between 3% and 14% of tracker users, concentrated in people already trending anxious about sleep Jahrami et al. 2024.

Six months in, the social signal: friends ask why you keep talking about your sleep score. A year in: the score is what you check before you check whether you feel rested. The people who land badly here end up with a worse relationship with sleep than before they put the device on. The people who land well are barely glancing at it.

What you actually get out of one, when it works

In the first week: a calibration. You find out your average is forty minutes shorter than you thought, and you notice you go to bed an hour later than you tell yourself. Tracking is a mirror: most of the value is in being seen.

Over the first month, if the numbers nudge a real change — moving bedtime earlier, cutting the second evening drink, holding a consistent wake time — modest gains land. The one randomized trial on wearable use itself (Whoop, one week, healthy adults) showed improved subjective sleep quality with the feedback turned on Berryhill et al. 2020. Better-rested days look like sharper meetings and fewer mid-afternoon crashes — the standard payoff of consistent sleep, not a wearable-specific gift. Don't expect dramatic. Expect the small, real version of consistent.

Over months, the trend view matters more than any single night. Apple Watch with the FDA-cleared apnea-notification feature checks 30 nights of breathing patterns and tells you if you're showing consistent signs of moderate-to-severe sleep apnea — the algorithm was cleared on a 1,448-person trial Apple 2024. Roughly 30 million American adults have sleep apnea and most don't know it; for the fraction of wearers who get a notification and book the follow-up, the real payoff isn't a better sleep score. It's not having a heart attack at 55.

Years in, used loosely, a tracker becomes background information — checked on bad weeks, ignored on good ones, occasionally surfacing a pattern (jet-lag recovery time, training-load effect on HRV) that explains something you were already feeling. Useful, modest, durable. That's the right relationship.

How to use one without making things worse

Three rules cover most of it. Trust the long-run trend, ignore the single night. Trust the duration and timing numbers; ignore the deep-sleep and REM breakdown. Take any apnea notification seriously.
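The first rule is easy to see with numbers. The sketch below, in Python, uses invented total-sleep-time values and a 7-night window; both are assumptions chosen for illustration, not a recommendation from any device maker.

```python
from statistics import mean


def rolling_mean(values, window=7):
    """Average of each run of `window` consecutive nights."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of nights")
    return [mean(values[i:i + window])
            for i in range(len(values) - window + 1)]


# Hypothetical total sleep time in minutes: steady 7-hour nights, one rough one.
nights = [420] * 6 + [300] + [420] * 7

trend = rolling_mean(nights)

worst_single_night = min(nights)    # 300, a two-hour drop
worst_weekly_average = min(trend)   # about 403, a 17-minute dip

print(worst_single_night, round(worst_weekly_average))
# prints: 300 403
```

One bad night looks alarming on its own and barely dents the weekly average. If the average itself drifts down over several weeks, that is the signal worth acting on.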

Cost varies. Mid-range watches and rings sit around $150–$400 up front; subscription devices like Whoop ($239/year) and Oura's premium tier (about $72/year on top of the ring) add ongoing cost. The work itself is small — wear it, charge it. Battery life runs from about a day on Apple Watch to about a week on Oura and Whoop, so devices that need nightly charging will miss more nights than ones that don't.

Who actually benefits, who should skip

Healthy adults curious about their patterns, athletes adjusting training around recovery trends, people whose partner has noticed snoring and gasping, shift workers who want objective data on their disrupted schedules — these are the groups for whom a wearable's numbers add real information.

The people who should think twice: anyone already in behavioral treatment for insomnia, anyone who notices their mood shift when they see a bad score, anyone with a clinical history of health anxiety or perfectionism. The data feeds the loop the treatment is trying to break Baron et al. 2017 Jahrami et al. 2024. The American Academy of Sleep Medicine's position is the same in clinician language: consumer trackers cannot diagnose or treat sleep disorders, but they can be useful for opening a conversation with a clinician Khosla et al. 2018.

Three adjacent topics worth knowing about. Sleep debt — what reliably short sleep actually costs you, separate from any device counting the hours. Cognitive behavioral therapy for insomnia (CBT-I), which is the strongest evidence-backed treatment for chronic insomnia and which most wearable users have never heard of. And sleep apnea itself — if you got a screening notification, the notification is the start of a clinical workup, not the end of one.
