
Multimodal Annotation Case Study: Gamified data labeling boosts model accuracy from 70% to 93% for Eight Sleep

Ali Devaney, Marketing
July 26, 2024

The Centaur Labs crowd annotates complex spectrograms and chest vibration waveforms to enable Eight Sleep’s Snore Detection model.

People used to measure sleep by a single metric: the number of hours in bed each night. Sleep scientists have long known that quality of sleep matters as much as quantity, and a wide variety of personal sleep tracking devices now help consumers optimize their sleep. Eight Sleep, the world’s first sleep fitness company, is at the forefront of this sleep optimization movement, developing innovative hardware, software, and AI technologies to improve sleep quality.

We spoke with John Maidens, Machine Learning Lead at Eight Sleep, to discuss the importance of sleep for overall health, the complexity of annotating snore data, and the role Centaur Labs played in advancing the company’s snore detection algorithm. Read on to learn how our crowd of experts generated multimodal annotations leveraging spectrograms and waveforms to help Eight Sleep bring Snore Detection to market. 

About Eight Sleep 

Founded in 2014 in New York City, Eight Sleep launched the world’s first smart mattress with features like a smart alarm, sleep tracking, and temperature control. The company has a reputation for its innovative approach to sleep fitness and has received accolades from Fast Company and TIME for its groundbreaking innovations. Over the last decade, Eight Sleep evolved its product line from a smart mattress to an AI-powered intelligent sleep system designed to fit onto any mattress to maximize restorative sleep with seamless biometric data tracking.

The company’s latest innovation, the Pod 4, released in 2024, offers new features like touch control, improved biometric sensors, an adjustable base for custom elevation, and snore detection. Pod 4’s Autopilot feature runs complex algorithms on the data collected by the Pod to measure sleep fitness in real time and respond to sleep disturbances throughout the night.

The Challenge: Snore attribution and scaling the proof of concept 

Eight Sleep recognized the diagnostic value of snore detection and set out to make sleeping on the Pod 4 the best way to detect and mitigate snoring in real time. Not only does snoring indicate low-quality sleep for the snorer and disturb the sleep of partners, but it’s also a sign of sleep disorders like sleep apnea and a risk factor for cardiovascular diseases. 

With effective and accurate snore detection, Eight Sleep can help users get better sleep in the short term and potentially alert them to health issues that may require clinical intervention.

“We think that if we can first detect and then help people improve their snoring, it can have a huge additional impact on health.”

For users who sleep alone, an AI snore detection model is somewhat straightforward, but snorers who share a bed with a partner (or pet!) pose an interesting attribution challenge for the company. The team needed to know which side of the bed snoring was coming from for the snore detection algorithm to work properly. 

The Pod 4 is equipped with highly responsive biometric sensors on both sides of the bed to detect small movements and vibrations of the chest from users as they sleep. Machine learning engineers at Eight Sleep combined these multimodal sensor datasets and manually annotated a small sample to build a working proof of concept. They also tried hiring a dedicated annotator and working with an annotation service provider, but the former wasn’t efficient and the latter produced low-quality labels that wouldn’t allow them to scale with confidence.

“We were able to develop a proof of concept on our own, but it quickly became clear our annotation approach just wasn't scalable for a number of reasons, like recruiting the annotators.”

Our Solution: Gamified multimodal annotation by Centaur Labs

To scale and solve for attribution, Eight Sleep wanted to leverage a larger dataset that included more of the breadth of data generated by the Pod 4’s sensors. They needed a multimodal annotation solution equipped to set up complex labeling tasks and a pool of skilled experts with incentive structures aligned to quality. 

The Centaur Labs data labeling solution supports both of these needs by combining collective intelligence and gamification. We achieve speed with 50k+ engaged expert labelers, and our gamified labeling app incentivizes and motivates labelers to compete against themselves and others to enhance quality.

“One of the things I think is really cool about the Centaur Labs methodology is how good it is at motivating people to care about doing a good job of the task.”

Centaur Labs worked with Eight Sleep to create a multimodal labeling task leveraging three different data types: an audio representation of the snoring vibration, a spectrogram visual of those vibrations, and a waveform of respiratory patterns. Labelers viewed and analyzed this data from both sides of the bed, looking for signals to help them classify whether the snoring was coming from the left side, the right side, both sides, or neither side of the bed.

They compared respiratory waveforms from both sides of the bed with snore vibrations visualized on a spectrogram to see whether the vibrations aligned more closely with one of the two breathing cycles. They also compared loudness markers in the spectrogram to see whether the snores were louder on one side of the bed than the other.

Centaur Labs classified Eight Sleep’s 4,000 snore cases at an average of 1,000 cases per day. Our global network of skilled labelers contributed more than 53,000 high-quality reads, for an average of 8 qualified opinions per case, and the resulting labels reached 99% agreement with the ground truth reference cases.
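To make the idea of combining several reads per case concrete, here is a minimal sketch in Python. It assumes plain majority voting over the four classes from the labeling task, which is an illustrative stand-in and not a description of how Centaur Labs actually aggregates reads. The case IDs, the sample reads, and the function names are all hypothetical.

```python
from collections import Counter


def majority_label(reads):
    """Return the most common label among one case's reads.

    Ties go to the label seen first. Plain majority voting is an
    assumption made for this sketch, not the vendor's actual method.
    """
    return Counter(reads).most_common(1)[0][0]


def agreement_with_reference(reads_by_case, reference):
    """Fraction of reference cases whose majority label matches the reference."""
    scored = [case for case in reference if case in reads_by_case]
    if not scored:
        raise ValueError("no reference cases have reads")
    hits = sum(majority_label(reads_by_case[c]) == reference[c] for c in scored)
    return hits / len(scored)


# Hypothetical data: 8 reads per case, labels drawn from
# "left", "right", "both", "neither".
reads_by_case = {
    "case-1": ["left"] * 7 + ["right"],
    "case-2": ["both", "both", "left", "both", "both", "right", "both", "both"],
    "case-3": ["neither"] * 7 + ["left"],
}
# Hypothetical ground truth, available for only some of the cases.
reference = {"case-1": "left", "case-3": "neither"}

consensus = {case: majority_label(reads) for case, reads in reads_by_case.items()}
score = agreement_with_reference(reads_by_case, reference)
print(consensus)
print(f"agreement with reference: {score:.0%}")
```

Collecting several opinions per case means a single stray read, like the lone "right" in case-1, does not change the final label.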

The Result: 93% model accuracy and 10x annotations 

With this new data, Eight Sleep improved its model’s accuracy from 70% to 93% at detecting snores and assigning them to the correct side of the bed. The snore detection model and corresponding mitigations now perform well enough that they are clinically proven to reduce snoring by up to 45%, differentiating the new Pod 4 from other intelligent mattresses.

“It was getting the reliable high quality labels - and a lot more of them - that was able to unlock this improvement in model accuracy from 70 to 93%.”

Not only did they hit this model quality bar, but they did it under significant time constraints. It was essential that the snore detection model be production-ready for the Pod 4 hardware product launch. The Eight Sleep AI team was able to get the data from the prototype devices, annotate it quickly with Centaur Labs, and build the machine learning models in time for the product launch.

“The Centaur Labs platform helped us deliver this snoring model quickly. We went from conception to in front of customers in under a year, which is very quick for a biosignal AI.” 

Finally, the inter-labeler agreement metrics in the Centaur Labs quality control dashboard helped Eight Sleep discover edge cases their model had incorrectly classified as snores, such as digestive sounds, wheezes, and sighs, as well as rare events, such as snoring on both sides of the bed at once. By looking at cases with low inter-labeler agreement, they could systematically identify classes of data where their model was underperforming and create a plan for ongoing model improvements.
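As a rough illustration of that review step, the sketch below scores each case by the share of reads that match its most common label and surfaces the cases that fall below a cutoff. The agreement measure, the 0.75 threshold, the case IDs, and the sample reads are all assumptions made for this example. The source does not describe how the dashboard computes its metrics.

```python
from collections import Counter


def agreement(reads):
    """Share of reads that match the most common label for one case."""
    top_count = Counter(reads).most_common(1)[0][1]
    return top_count / len(reads)


def flag_low_agreement(reads_by_case, threshold=0.75):
    """Return (case, agreement) pairs below the threshold, lowest first.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    flagged = [
        (case, agreement(reads))
        for case, reads in reads_by_case.items()
        if reads and agreement(reads) < threshold
    ]
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical reads: case-a is clear-cut, case-b and case-c split the crowd.
reads_by_case = {
    "case-a": ["left"] * 8,
    "case-b": ["left"] * 4 + ["both"] * 4,
    "case-c": ["neither"] * 5 + ["right"] * 3,
}

review_queue = flag_low_agreement(reads_by_case)
for case, score in review_queue:
    print(f"{case}: {score:.3f}")
```

Sorting the queue from lowest agreement upward puts the most contested cases first, which is where manual review is likely to find the unusual sounds and rare events.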

“Working with Centaur Labs for annotations means we're able to focus on the part of annotation management that we should, which is reviewing edge cases, and doing focused manual quality control.”

Unlock annotations at scale with Centaur Labs

The Eight Sleep team has an ambitious vision to build even more precision into their models, identifying each individual event across all their core sensor metrics, from individual snores to heartbeats. This can unlock insights that enable researchers at Eight Sleep and across the healthcare ecosystem to accelerate the development of interventions, medicines, and devices that improve sleep quality. Eight Sleep knows Centaur Labs will be able to help with the granular annotations to support this vision as they scale.

They also envision building automatic systems to help their team manage algorithm problems in the field. With the flexibility and on-demand nature of the Centaur Labs platform, issues could be escalated and classified in real time by Centaur Labs, minimizing the manual triage work for the Eight Sleep AI team. 

The flexibility, scale, and quality of the Centaur Labs solution pave the way for a productive collaboration into the future to accelerate Eight Sleep’s vision. If your AI development requires scalable, high-quality data annotation, get in touch to learn more about how Centaur Labs can help.
