Issue 37

The Science Behind 3D Sound

Sound is crucial to any media experience. Researchers and creators use technologies developed from knowledge about how humans work out the direction of a sound source to make sound for gaming and virtual reality realistic and immersive. In this article, we explain how this works.

We are processing sound all the time. We can’t turn our ears off in the same way that we can close our eyes – our brains are constantly trying to work out what is around us, and how things are changing, all from the sound reaching our ears.

Using surround sound in a game gives you more information about where things are – for example, that there is something to investigate just to your left, or an enemy attacker just behind you. In any game, film or virtual reality (VR) experience, a rich sound environment will also draw you in further (known as ‘immersion’), but since we are well practised at processing sound, we can easily tell when something doesn’t sound quite right.

Our hearing system (ears, nerves and brain) compares the sound arriving at each ear to work out where the source of the sound is. This is why good game audio needs at least two loudspeakers, or a pair of headphones.


Monophonic (mono) sound played back over just one loudspeaker contains no directional information, meaning that if we wanted to move a sound around in space, we’d need to physically move the loudspeaker (not so convenient). Stereo sound, on the other hand, uses two channels. Changing the relative volume of the sound in the two separate channels means the sound can be ‘panned’ to a particular direction. For example, making the left channel much louder than the right channel makes the sound appear to be on the left.
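The panning idea can be sketched in a few lines of code. This is only an illustration: the function names and the 'constant-power' rule for choosing the two volumes are assumptions made for the example, not something the article specifies.

```python
import math

def pan_gains(pan):
    """Return (left_gain, right_gain) for a pan position.

    pan: -1.0 is fully left, 0.0 is centre, +1.0 is fully right.
    Uses a constant-power pan law (an assumption for this sketch).
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1 and 1")
    angle = (pan + 1.0) * math.pi / 4.0  # map [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)

def pan_mono_to_stereo(samples, pan):
    """Turn a list of mono samples into a list of (left, right) pairs."""
    left_gain, right_gain = pan_gains(pan)
    return [(s * left_gain, s * right_gain) for s in samples]

# A sound panned most of the way to the left: the left channel is much louder.
stereo = pan_mono_to_stereo([0.5, -0.25, 1.0], pan=-0.8)
```

With `pan=-1.0` all of the sound goes to the left channel, and with `pan=0.0` both channels get the same volume, so the sound appears to sit in the middle.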

This panning effect can be scaled up to include more loudspeakers – cinema sound typically uses five or seven loudspeakers placed around the auditorium/theatre. Cinema sound is great, but it usually places special effects around the listener, with important sounds coming from the front (matching what’s on the screen). This isn’t really how we hear sound in real life, so researchers and creators are developing other techniques. Some artists, like Karen Monid, create exciting audio experiences in which sounds move between lots of loudspeakers placed around the listener. The sphere of 50 loudspeakers in the AudioLab at the University of York allows us to put sounds all around a listener, including above and below. The sound produced is really immersive, but not many people have space for so many loudspeakers at home.

Image credit: University of York / Alex Holland

Playing stereo sounds over your headphones will give you some impression of left-right direction, but can’t place sounds above or below your head. More realistic 3D sound over headphones uses a number of clever processing techniques that mimic the way that our ears process real sound sources, using ‘binaural’ (two-ear) cues, one of which works much like amplitude panning.

There are two main binaural cues: ‘inter-aural level difference’ and ‘inter-aural time difference’. The sound is louder at the ear nearer the source and arrives there first, since it has to travel around your head to reach the other ear.
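To get a feel for how small the time difference is, here is a rough sketch. It assumes the head is a simple sphere and uses a textbook approximation known as Woodworth's formula. The head radius, the speed of sound and the function names are all assumptions chosen for the example, not values from the article.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius (not from the article)
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 °C

def itd_seconds(azimuth_deg):
    """Approximate inter-aural time difference for a distant sound source.

    azimuth_deg: 0 is straight ahead, 90 is directly to one side.
    Uses Woodworth's spherical-head approximation: (r / c) * (theta + sin(theta)).
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))

def itd_samples(azimuth_deg, sample_rate_hz=48000):
    """The same delay expressed as a whole number of audio samples."""
    return round(itd_seconds(azimuth_deg) * sample_rate_hz)
```

Under these assumptions, a sound directly to one side reaches the far ear roughly two thirds of a millisecond after the near ear, and a sound straight ahead arrives at both ears at the same time.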


Your head and ears, particularly the sticky-out bits called the pinnae, not only obstruct, but also shape and filter the sound as it travels around your head. We can measure this ‘shaping’ effect in the lab to capture a ‘head-related transfer function’ (HRTF), and apply this as a filter to the sounds we want to include in our video game. Filtering sound is similar to filtering chemicals, or coffee. What comes out the other side of the filter is different to what we put in. When sound is filtered, some parts of the sound are dampened and others are boosted.
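One common way to apply a filter like this in software is 'convolution' with an impulse response, which is the time-domain version of an HRTF. The sketch below shows the idea with one filter per ear. The two short impulse responses are made-up toy numbers, not measured data, and the function names are invented for the example.

```python
def convolve(signal, impulse_response):
    """Filter a signal by convolving it with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Apply one filter per ear to a mono recording; returns (left, right)."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy, made-up impulse responses for a source on the listener's left:
# the right ear hears the sound later (leading zeros) and quieter.
TOY_HRIR_LEFT = [1.0, 0.3]
TOY_HRIR_RIGHT = [0.0, 0.0, 0.5, 0.2]

left, right = render_binaural([1.0, 0.0, -1.0], TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
```

A real system would use impulse responses measured for many directions and pick (or blend) the pair that matches where the sound should appear to come from.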

Filtering sound with HRTFs turns recorded sound with no spatial information into sounds that are more like real-life sounds. Imagine a bird sitting in a tree – if we have a simple audio recording of a bird, we can use this HRTF processing technique to make the birdsong appear to be coming from high up above you. We can also add sounds together to make more complex sound environments, such as dense forests, underwater or busy street scenes. Researchers are using 3D sound for a huge range of things – in entertainment, product development, health and medical technologies. So, next time you put on a VR headset, play a computer game with headphones or watch a 360 video on YouTube, try to notice all the different sounds around you, and remember that they’ve been put there using clever audio processing techniques.

Here are some examples!

Simple binaural demos from the AudioLab

“Urban Melt in Park Palais Meran” by Natasha Barrett (3D composition mixed in binaural)

“Animal” by Karen Monid & Ross Ashton (quadraphonic installation mixed to stereo)

Binaural environment recordings (NPR)

Find out more about the University of York:
Research at the AudioLab
Music Technology undergraduate programmes (University of York Electronic Engineering)

WRITTEN BY

Dr Jude Brereton & Dr Kat Young
Department of Electronic Engineering, University of York

Learning resource

We have created learning notes to assist students and educators to further investigate the topics covered in this article. You can download the learning resource here »
