HP Labs Blog

HP Labs spatial audio research adds powerful presence to VR and other environments

By Simon Firth, HP Labs Correspondent

January 30, 2020

HP Labs spatial audio researchers Srikanth Kuthuru, Sam Chau, and Sunil Bharitkar

HP Labs researchers are using a novel approach anchored in artificial intelligence, signal processing, and psychoacoustics to create spatial audio for virtual reality (VR) and other media, adding a rich and immersive experience.

“At its simplest, spatial audio recreates the perception of localization using psychoacoustically-motivated signal processing techniques,” explains Sunil Bharitkar, HP Distinguished Technologist and audio research lead in HP’s Artificial Intelligence and Emerging Compute Lab. “But today, due to its perceptual transparency, technology designed at HP allows us to go much further and produce the sense that we are actually present within a virtual 3D space.”

That’s especially true when listeners use VR headphones, which allow for fine control of a listener’s auditory environment. But it’s also possible to make the audio emanating from PCs, laptops, and even smart speakers appear to be coming from very specific distances and directions.

Spatial audio is the result of more than just having better technology to play it on. Our understanding of how humans listen and speak has also vastly improved, says Bharitkar. “And because we have a better understanding of psychoacoustics, we can combine that with deep-learning and signal processing techniques in a novel way that allows us to design generalizable models that scale to arbitrary listeners/consumers and provide a fundamentally new experience,” he suggests.

This approach is being pioneered at HP Labs, where Bharitkar leads a team charged with exploring how HP can research, develop, and deploy spatial audio across the range of products the company offers.

“Creating compelling audio experiences that take us beyond the state of the art is what we're always looking to do.”

Sunil Bharitkar, HP Labs Distinguished Technologist

HP Spatial Audio heightens the realism in VR

Using machine learning to build a “universal listener”

HP Labs’ spatial audio research starts with a massive academic data set of sounds recorded at the entrance of the ears of people sitting inside a large but acoustically “dead” (i.e., echo-free) space. The microphones capture the key attributes of what people hear when sounds originate at different points around them in the real world.

“These data points are called head-related transfer functions (HRTFs), and you can think of them as objective representations of psycho-acoustical properties of listening,” Bharitkar observes. “We can then take this information and use it as a guide for generating spatially accurate virtual sounds in 3D space.”
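The core rendering idea can be sketched in a few lines: convolving a mono source with the left- and right-ear impulse responses measured for a given direction produces a binaural signal that seems to arrive from that direction. This is a minimal illustration; the impulse responses below are synthetic placeholders, not measured HRTF data, and the sketch is not HP's implementation.

```python
import numpy as np

# Synthetic stand-ins for measured head-related impulse responses (HRIRs)
# for one source direction; real ones come from an HRTF data set.
rng = np.random.default_rng(0)
hrir_left = rng.standard_normal(256) * np.exp(-np.arange(256) / 32)
hrir_right = np.roll(hrir_left, 8)  # crude interaural time difference

mono = rng.standard_normal(4800)  # 100 ms of source audio at 48 kHz

# Binaural rendering: filter the mono source with each ear's response.
left = np.convolve(mono, hrir_left)
right = np.convolve(mono, hrir_right)
binaural = np.stack([left, right])  # 2-channel signal for headphone playback
```

Played over headphones, the slight timing and spectral differences between the two channels are what the auditory system decodes as direction.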

Because people’s head, body, and ear shapes differ, the sounds that enter their ear canals differ for each person, so much so that a single person’s HRTFs can’t underpin a universal model of human psychoacoustics.

The HP team therefore took data from hundreds of different listeners and applied deep learning techniques to the entire set to create a “basis” model that scales for everyone with the same experience.
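The “basis” idea can be illustrated with principal component analysis, used here as a simpler stand-in for the team’s deep learning model: HRTF magnitudes from many listeners decompose into a mean listener plus a few dominant modes of variation, and any individual is then approximated by a handful of weights. All data in this sketch is synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in for a measured data set: HRTF magnitudes for one
# direction, one row per listener, one column per frequency bin.
hrtfs = rng.standard_normal((200, 128)) + 10.0

# Learn a low-dimensional basis: a mean listener plus the principal
# modes of listener-to-listener variation.
mean_hrtf = hrtfs.mean(axis=0)
_, _, components = np.linalg.svd(hrtfs - mean_hrtf, full_matrices=False)
basis = components[:8]  # keep the 8 strongest modes

# Any individual listener is then approximated by a few basis weights.
weights = (hrtfs[0] - mean_hrtf) @ basis.T
reconstruction = mean_hrtf + weights @ basis
```

The appeal of such a model is that it generalizes: a new listener needs only a few weights, rather than a full set of measured HRTFs, to get a reasonable personalized approximation.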

The researchers also used perceptually-related metrics to optimize the model. A distortion metric, for example, measures how well the deep learning model matches the HRTFs captured for sounds coming from a specific direction, and then indicates how the model would need to be improved to minimize artifacts across all listeners. A paper based on this research received an Outstanding Paper Award from the IEEE at the 9th IEEE International Conference on Consumer Electronics, held in Berlin in September 2019.
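A common distortion measure of this kind is log-spectral distortion, which scores in decibels how closely a modeled HRTF magnitude tracks the measured one. This is an illustrative sketch of the general idea, not necessarily the metric used in the paper.

```python
import numpy as np

def log_spectral_distortion(h_true, h_model):
    """RMS log-magnitude error (in dB) between measured and modeled HRTFs.

    0 dB means a perfect match; larger values mean audible deviation.
    """
    diff_db = 20 * np.log10(np.abs(h_true) / np.abs(h_model))
    return np.sqrt(np.mean(diff_db ** 2))
```

A score of zero means the model reproduces the measured response exactly; a fixed level offset of 10x shows up as 20 dB of distortion.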

Once they know what a piece of audio should sound like for it to appear to emanate from a specific location, the team still needs to be able to give that audio that specific quality. So, the final part of HP Labs’ approach deploys a series of signal processing steps to create natural-sounding immersive audio.

“You can have the best algorithm in the world,” notes Bharitkar. “But if you don't have proper perceptually-derived processing, you're not going to reproduce it in a way that has a powerful impact.”

HP Immersive Audio enhances the audio-video experience on personal computing (PC) devices

Spatial audio for VR

Spatial audio can have an especially dramatic impact in VR, where a strong sense of presence is key to creating a truly immersive experience. Not surprisingly, VR is a major area of interest for the spatial audio team.

“The main challenge here is creating directionality and naturalness of sounds,” says researcher and former HP Labs intern Srikanth Kuthuru. “For example, when people engage in VR games, they want to know the exact position of all the players and key objects in the space – really good spatial audio will give you that.”

In addition to gaming use cases, the HP Labs team is interested in the impact of spatial audio on social VR, where people are looking to find and talk with each other. This relates to alleviating the “cocktail-party” phenomenon and improving “spatial release from masking.”

“In that situation, you might want to zoom in on a conversation,” adds HP Labs technical lead for VR Tico Ballagas. “So we are exploring the use of multisensory data (from VR sensors) to track gaze and attention, spatially filtering the audio from a direction of interest.”
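One simple way to realize such attention-driven filtering, sketched here under the assumption that source directions are known (as they are in a VR scene), is to weight each source’s level by its angular distance from the gaze direction. The function below is a hypothetical illustration, not HP’s method.

```python
import numpy as np

def attention_gains(source_azimuths_deg, gaze_deg, width_deg=30.0):
    """Per-source gains that emphasize sounds near the gaze direction.

    Sources within roughly `width_deg` of the gaze keep most of their
    level; sources far from it are strongly attenuated.
    """
    az = np.asarray(source_azimuths_deg, dtype=float)
    # Signed angular difference, wrapped into [-180, 180) degrees.
    diff = np.abs((az - gaze_deg + 180.0) % 360.0 - 180.0)
    return np.exp(-0.5 * (diff / width_deg) ** 2)
```

For example, with the gaze at 0°, a talker straight ahead keeps full level while one at 90° is heavily attenuated, approximating the “zoom in on a conversation” effect.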

Spatial audio without headphones

Spatial audio doesn’t rely on headphones to work, however. The HP Labs research effort extends to, and appears to work just as well for, any device with stereo speakers, including laptops, PCs, and home smart speakers.

A recent spatial audio demonstration for smart speakers, recalls Bharitkar, “gave pretty much everyone in the room the experience of spatially-differentiated audio, as opposed to in the past where people had to sit in a sweet spot to hear it.”

Some of this technology, including an earlier iteration of HP Labs’ immersive signal processing technology, is already installed in HP all-in-one and laptop products. Future updates will likely see a newer set of deep learning-derived spatial processing technologies installed as well.

The research might also underpin a new software plugin for VR playback devices or become a content creation module for people building new VR services and applications.

“HP’s focus is always on providing our customers with unrivaled experiences in VR,” says Spike Huang, who leads the VR Business for HP’s Personal Systems group. “These insights and new approaches in the interdisciplinary area of signal processing, acoustics, auditory perception, and AI are essential to our being able to continue delivering on that promise.”

In addition to enabling spatially-differentiated audio reproduction on HP devices, the Labs team is exploring how to capture it too.

“We’re thinking, for example, about how a better sense of presence can improve voice conferencing,” says Kuthuru, who is currently focusing on this research area. “There you are incorporating spatial capture as well as appropriate spatial rendering so you need additional processing in place that will let you build an end-to-end ecosystem experience.”

For the HP team, the overriding motivation remains the same. “Creating compelling audio experiences that take us beyond the state of the art is what we're always looking to do,” Bharitkar says.