Although it is still in beta, MetaSounds, Unreal Engine 5’s brand-new audio feature, already has the game audio world excitedly testing it out.
I spent a few weeks diving in and learning as much as I could about what is new and what this means for the future of game audio.
Game audio has always been at the mercy of the prevailing technology. Over the past couple of decades, we have seen powerful middleware tools emerge (such as Wwise and FMOD). These tools bring complex audio manipulation techniques to multiple game engines in a unified way. They give audio programmers deeper functionality without forcing a developer to reinvent the audio system from scratch each time.
Epic introduced its Audio Mixer engine with Unreal Engine 4.24. This was powerful compared to earlier versions of Unreal, but still incredibly limited when compared to middleware engines.
With Unreal 5, we have MetaSounds, which sharpens the question on many developers’ minds: Do I need to pay to license audio middleware?
The answer (of course) is complicated.
First things first, MetaSounds is incredibly powerful. It offers a wide variety of complex audio techniques, many of which closely resemble workflows in node-based audio tools such as Max/MSP or SuperCollider. It is particularly powerful at creating and manipulating small bits of sound on the fly. MetaSounds struggles more when we get into deeply layered, pre-recorded audio files.
People have already built complex synthesizers natively in MetaSounds.
Some people are finding inventive ways to build musical sequences natively in MetaSounds: https://twitter.com/echolevel/status/1399830093464326145
Let’s dive under the hood a little bit to see what is new:
1. Sample-Accurate Audio Control: Unreal’s previous audio engine was not sample-accurate. This meant less precise control over the audio and more audio latency. With MetaSounds, we can natively and dynamically change the playback frequency (pitch) of an audio file while it is playing.
2. Individual Audio Rendering Engines: MetaSounds turns each audio system into its own individual rendering system. Each MetaSound can be fully self-contained, or it can communicate with other MetaSounds.
3. Workflow: A major part of the overhaul is nested MetaSounds. We can build systems of MetaSounds within MetaSounds. This gives us greater control over our elements and better organization of them.
4. Performance: Because each MetaSound is rendered on its own, using its own C++ object, it is not reliant on the main Audio Mixer the way previous systems were.
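To make the sample-accuracy point concrete, here is a minimal sketch in plain Python. It is not MetaSounds or Unreal code, and the sample rate, buffer size, and function names are all assumptions chosen for illustration. It compares when a pitch change takes effect in a hypothetical engine that only applies changes at buffer boundaries with one that applies them on the exact requested sample.

```python
import math

SAMPLE_RATE = 48_000  # assumed sample rate in Hz
BLOCK_SIZE = 1_024    # assumed buffer size in frames


def effective_frame(requested_frame: int, sample_accurate: bool) -> int:
    """Return the frame at which a parameter change actually takes effect."""
    if sample_accurate:
        return requested_frame
    # A block-quantized engine waits until the start of the next buffer.
    return math.ceil(requested_frame / BLOCK_SIZE) * BLOCK_SIZE


def timing_error_ms(requested_frame: int, sample_accurate: bool) -> float:
    """Delay in milliseconds between the requested and the effective frame."""
    delay = effective_frame(requested_frame, sample_accurate) - requested_frame
    return 1000.0 * delay / SAMPLE_RATE


def render_sine(num_frames: int, freq_before: float, freq_after: float,
                change_frame: int) -> list[float]:
    """Render a sine wave whose frequency switches at change_frame.

    The phase is accumulated per sample so the switch does not click.
    """
    phase = 0.0
    out = []
    for n in range(num_frames):
        freq = freq_after if n >= change_frame else freq_before
        out.append(math.sin(phase))
        phase += 2.0 * math.pi * freq / SAMPLE_RATE
    return out


if __name__ == "__main__":
    requested = 1_500  # the game asks for a pitch change on this frame
    for accurate in (True, False):
        frame = effective_frame(requested, accurate)
        late = timing_error_ms(requested, accurate)
        print(f"sample_accurate={accurate}: frame {frame}, {late:.2f} ms late")
```

With these assumed numbers, the block-quantized engine applies the change 548 frames late, which is roughly 11 ms. That is small for a single event, but it adds up quickly for tightly sequenced or rhythmic material.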
Let’s see what we might be missing when compared to other audio engines.
(Some of this may change. MetaSounds is still in beta, so it is possible that these functions will be available soon, or that the documentation will clarify functions that already exist.)
1. Audio File Support: As far as I can tell, MetaSounds can render in mono or stereo. This means that we are missing 5.1, 7.1, ambisonic, and other multi-channel configurations that Wwise and FMOD frequently handle.
2. Complex Musical Interactions: Both Wwise and FMOD are built to switch to almost DAW-like workflows, with multiple layers of audio files working together. Currently, MetaSounds’ wave player does not support DAW-like layers of tracks.
3. Cross-Engine Workflow: An obvious one, but one of the large benefits of Wwise and FMOD is that audio programmers can hook them into many game engines.
As a game-audio professional with a programming and synthesizer background, I felt right at home in MetaSounds immediately. Within a few minutes, I had made a simple synthesizer, and with more time I was able to play it using a MIDI controller. For sample-based, synthesizer-based, or other node-based sound design or music, I could see a tremendous number of audio programmers feeling right at home just using MetaSounds. For certain games, forgoing the licensing fees that Wwise or FMOD require would make a lot of sense. There are even certain types of audio that might be easier to do in MetaSounds. There are ways to create similar synthesized sounds in Wwise and FMOD, but it isn’t always as native and easy as what we get in MetaSounds.
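For readers who have not built one, the core of a simple MIDI-playable synthesizer is small. The sketch below is plain Python, not a MetaSounds graph, and the function names and sample rate are hypothetical. It uses the standard equal-temperament conversion from MIDI note number to frequency (A4 = note 69 = 440 Hz) and renders a sine tone scaled by the note's velocity.

```python
import math

SAMPLE_RATE = 48_000  # assumed sample rate in Hz


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to a frequency in Hz (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def render_note(note: int, velocity: int, duration_s: float) -> list[float]:
    """Render one sine-wave note; velocity (0-127) sets the gain linearly."""
    freq = midi_to_hz(note)
    gain = velocity / 127.0
    num_frames = round(duration_s * SAMPLE_RATE)
    return [
        gain * math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
        for n in range(num_frames)
    ]


if __name__ == "__main__":
    samples = render_note(note=60, velocity=100, duration_s=0.5)  # middle C
    print(f"{midi_to_hz(60):.2f} Hz, {len(samples)} frames")
```

A real patch would add an envelope, so notes do not start and stop with a click, and usually more than one oscillator. The note-to-frequency and velocity-to-gain steps are the part that a MIDI controller actually drives.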
I would still pay the license fee for Wwise or FMOD in the following cases:
1. Complex Pre-Recorded Music Systems: If I were creating a musical environment with many pre-recorded wave files interacting with each other, I would probably be fairly nervous at the prospect of creating it in MetaSounds. A DAW-like, track-based workflow is a comforting one to me as a composer when dealing with multi-layered music.
2. Dialogue-Heavy Games: Dialogue is already quite an undertaking in terms of file management, and engines such as Wwise excel at it.
Every day, we are forced to ask ourselves if we should shell out money for new and more expensive tools. My primary advice to anyone with this question is to push our current tools to their limit and, only once we have found that limit, consider buying something that can go further. With MetaSounds, the out-of-the-box limit in Unreal has moved a lot further. I suspect a lot of games in the future will be able to do all of the audio they want in MetaSounds, without ever reaching for Wwise or FMOD.