Songbird: Spatial Audio Encoding on the Web

Songbird is a real-time spatial audio encoding JavaScript library for WebAudio applications. It lets web developers dynamically encode streaming audio content into a scalable ambisonic signal, which is rendered internally using Omnitone to produce stereo output, for realistic and quality-scalable 3D audio.

Hear Songbird in action (currently the examples only work on laptops/desktops):

The implementation of Songbird is based on the Google spatial media specification. It expects mono input to its Source instances and outputs ambisonic (multichannel) ACN channel layout with SN3D normalization. Detailed documentation may be found here.
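As a quick illustration of the ACN layout mentioned above, the sketch below computes how many channels a full 3D ambisonic signal of a given order carries and where a given spherical-harmonic component sits in ACN ordering. The function names are hypothetical helpers written for this example and are not part of the Songbird API.

```javascript
// Number of channels in a full 3D ambisonic signal of the given order:
// (order + 1)^2, so order 1 has 4 channels and order 3 has 16.
function ambisonicChannelCount(order) {
  if (!Number.isInteger(order) || order < 0) {
    throw new RangeError('order must be a non-negative integer');
  }
  return (order + 1) * (order + 1);
}

// ACN channel number for the component with the given degree (l) and
// index (m), where -l <= m <= l: ACN = l^2 + l + m.
function acnIndex(degree, index) {
  if (!Number.isInteger(degree) || degree < 0) {
    throw new RangeError('degree must be a non-negative integer');
  }
  if (!Number.isInteger(index) || Math.abs(index) > degree) {
    throw new RangeError('index must be an integer in [-degree, degree]');
  }
  return degree * degree + degree + index;
}
```

For example, `ambisonicChannelCount(1)` returns 4, and the four first-order components occupy ACN channels 0 through 3.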

How it works

Songbird is a JavaScript API that supports real-time spatial audio encoding for the Web using Higher-Order Ambisonics (HOA). This is accomplished by attaching audio input to a Source, which has associated spatial object parameters. Source objects are attached to a Songbird instance, which models the listener as well as the room environment that the listener and sources are in. Binaurally-rendered ambisonic output is generated using Omnitone, and raw ambisonic output is exposed as well.

Songbird Diagram

Installation

Songbird is designed for web front-end projects, so installing it with npm is recommended. You can also clone this repository and include the library file directly.

npm install songbird-audio

Usage

The first step is to include the library file in an HTML document.

<!-- Use Songbird from installed node_modules/ -->
<script src="node_modules/songbird-audio/build/songbird.min.js"></script>

<!-- if you prefer to use CDN -->
<script src="https://cdn.rawgit.com/google/songbird/master/build/songbird.min.js"></script>

Spatial encoding is done by creating a Songbird scene using an associated AudioContext and then creating any number of associated Source objects using Songbird.createSource(). Any number of AudioNodes can be connected directly to Source objects. The Songbird scene models a physical listener while adding room reflections and reverberation. The Source instances model acoustic sound sources. The library is designed to be easily integrated into an existing WebAudio audio graph.

“Hello World” Example

Let’s see how we can create a scene and generate some audio. Let’s begin by constructing an AudioContext and Songbird scene and connecting it to the audio output. You can view a live demo of this example here.

var audioContext = new AudioContext();

// Create a (1st-order Ambisonic) Songbird scene.
var songbird = new Songbird(audioContext);

// Send songbird's binaural output to stereo out.
songbird.output.connect(audioContext.destination);

Next, let’s add a room. By default, the room size is 0m x 0m x 0m (i.e. there is no room and we are in free space). To define a room, we simply need to provide the dimensions in meters (the room’s center is the origin). We can also define the materials of each of the 6 surfaces (4 walls + ceiling + floor). A range of materials are predefined in Songbird, each with different reflection properties.

// Set room acoustics properties.
var dimensions = {
  width : 3.1,
  height : 2.5,
  depth : 3.4
};
var materials = {
  left : 'brick-bare',
  right : 'curtain-heavy',
  front : 'marble',
  back : 'glass-thin',
  down : 'grass',
  up : 'transparent'
};
songbird.setRoomProperties(dimensions, materials);
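When most surfaces share one material, writing out all six entries gets repetitive. The sketch below builds a complete materials object from a default plus a few overrides. `makeRoomMaterials` is a hypothetical helper, not part of Songbird; it only assumes the six surface names used in the example above.

```javascript
// The six surface names used by the materials object above.
var SURFACES = ['left', 'right', 'front', 'back', 'down', 'up'];

// Hypothetical helper: fill every surface with defaultMaterial, then apply
// per-surface overrides. Unknown surface names are rejected early.
function makeRoomMaterials(defaultMaterial, overrides) {
  overrides = overrides || {};
  Object.keys(overrides).forEach(function (key) {
    if (SURFACES.indexOf(key) === -1) {
      throw new Error('Unknown surface: ' + key);
    }
  });
  var result = {};
  SURFACES.forEach(function (surface) {
    var hasOverride = Object.prototype.hasOwnProperty.call(overrides, surface);
    result[surface] = hasOverride ? overrides[surface] : defaultMaterial;
  });
  return result;
}

// Brick walls, a grass floor and an open ceiling.
var outdoorCourtyard = makeRoomMaterials('brick-bare', {
  down: 'grass',
  up: 'transparent'
});
```

The result can then be passed as the second argument to `songbird.setRoomProperties(dimensions, outdoorCourtyard)`.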

Next, we create an audio element, load some audio and feed the audio element into the audio graph as an AudioNode. We then create a Source and connect the AudioNode to it. The default position for a Source is the origin.

// Create an audio element. Feed into audio graph.
var audioElement = document.createElement('audio');
audioElement.src = 'resources/SpeechSample.wav';

// Create an AudioNode from the audio element.
var audioElementSource = audioContext.createMediaElementSource(audioElement);

// Create a Source, connect desired audio input to it.
var source = songbird.createSource();
audioElementSource.connect(source.input);

Finally, we can position the source relative to the listener and then play back the audio with the familiar .play() method. This will binaurally render the scene we have just created.

// The source position is relative to the origin (center of the room).
source.setPosition(-0.707, -0.707, 0);

// Play back the audio.
audioElement.play();

Positioning Sources and the Listener

Source objects can be placed with Cartesian coordinates relative to the origin (center of the room). Songbird uses a right-handed coordinate system, similar to OpenGL and Three.js.

// Set Source and Listener positions.
source.setPosition(x, y, z);
songbird.setListenerPosition(x, y, z);
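Positions are often easier to think about as a direction and a distance. The sketch below converts azimuth, elevation and distance to Cartesian coordinates. It assumes an OpenGL-style axis convention (+x to the right, +y up, -z straight ahead) with azimuth measured clockwise from straight ahead when viewed from above. Both the convention and the `positionFromAngles` helper are assumptions made for this example, so check them against the documentation before relying on them.

```javascript
// Hypothetical helper: spherical (degrees, meters) to Cartesian [x, y, z].
// Assumed axes: +x right, +y up, -z forward (OpenGL-style, right-handed).
// Azimuth 0 is straight ahead; positive azimuth turns to the right.
function positionFromAngles(azimuthDeg, elevationDeg, distance) {
  var azimuth = azimuthDeg * Math.PI / 180;
  var elevation = elevationDeg * Math.PI / 180;
  var horizontal = distance * Math.cos(elevation);
  return [
    horizontal * Math.sin(azimuth),   // x: right
    distance * Math.sin(elevation),   // y: up
    -horizontal * Math.cos(azimuth)   // z: negative is forward
  ];
}

// Two meters away, 45 degrees to the right, at ear height.
var position = positionFromAngles(45, 0, 2);
```

The resulting array can be spread into a call such as `source.setPosition(position[0], position[1], position[2])`.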

The Source and Listener orientations can be set using forward and up vectors:

// Set Source and Listener orientation.
source.setOrientation(forward_x, forward_y, forward_z, up_x, up_y, up_z);
songbird.setListenerOrientation(forward_x, forward_y, forward_z, up_x, up_y, up_z);
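Forward and up vectors that come from user input or accumulated rotations may drift away from unit length or from being perpendicular. Whether Songbird normalizes them internally is not stated here, so the sketch below prepares a clean pair before passing them in. The helper names are hypothetical; the method is a single Gram-Schmidt step.

```javascript
// Scale a 3-vector to unit length.
function normalize(v) {
  var length = Math.hypot(v[0], v[1], v[2]);
  if (length === 0) {
    throw new Error('Cannot normalize a zero-length vector');
  }
  return [v[0] / length, v[1] / length, v[2] / length];
}

// Hypothetical helper: return unit-length forward and up vectors where up is
// perpendicular to forward. Throws if the two inputs are parallel.
function orthonormalBasis(forward, up) {
  var f = normalize(forward);
  // Remove the part of up that points along forward (Gram-Schmidt).
  var along = up[0] * f[0] + up[1] * f[1] + up[2] * f[2];
  var u = normalize([
    up[0] - along * f[0],
    up[1] - along * f[1],
    up[2] - along * f[2]
  ]);
  return { forward: f, up: u };
}

var basis = orthonormalBasis([0, 0, -2], [0, 1, -1]);
// basis.forward is [0, 0, -1] and basis.up is [0, 1, 0].
```

The six components can then be passed to `setOrientation()` or `setListenerOrientation()` in the order shown above.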

Alternatively, the Source’s and Listener’s position and orientation can be set using Three.js Matrix4 objects:

source.setFromMatrix(matrix4);
songbird.setListenerFromMatrix(matrix4);
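To make the relationship between a transform matrix and the position/forward/up form concrete, the sketch below reads a pose from a 16-element column-major array, which is how Three.js stores `Matrix4.elements`. It assumes the object looks down its local -z axis (as Three.js cameras do) and that the matrix may carry scale. This illustrates the geometry only; it is not taken from Songbird's implementation, and `poseFromMatrixElements` is a hypothetical name.

```javascript
// Scale a 3-vector to unit length (strips any scale baked into the matrix).
function toUnit(v) {
  var length = Math.hypot(v[0], v[1], v[2]);
  if (length === 0) {
    throw new Error('Matrix has a degenerate axis');
  }
  return [v[0] / length, v[1] / length, v[2] / length];
}

// Hypothetical helper: extract a pose from column-major 4x4 elements.
// Column 1 (indices 4..6) is the local +y axis, column 2 (indices 8..10) is
// the local +z axis, and column 3 (indices 12..14) is the translation.
function poseFromMatrixElements(e) {
  if (!e || e.length !== 16) {
    throw new Error('Expected 16 matrix elements');
  }
  return {
    position: [e[12], e[13], e[14]],
    forward: toUnit([-e[8], -e[9], -e[10]]), // assumed: local -z is forward
    up: toUnit([e[4], e[5], e[6]])
  };
}

// An unrotated object translated to (1, 2, 3).
var pose = poseFromMatrixElements([
  1, 0, 0, 0,
  0, 1, 0, 0,
  0, 0, 1, 0,
  1, 2, 3, 1
]);
```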

Room Properties

Room properties can be set to control the characteristics of spatial reflections and reverberation. We currently support the following named materials:

Creation Arguments

When constructing a Songbird scene, optional scene-related arguments may be provided to override default values.

var songbirdOptions = {
  ambisonicOrder: 1,
  listenerPosition: [1, 0, 0],
  listenerForward: [1, 0, 0],
  listenerUp: [0, 1, 0],
  dimensions: { width: 3, height: 4, depth: 5 },
  materials: { left: 'uniform', right: 'uniform', down: 'uniform',
    up: 'uniform', front: 'uniform', back: 'uniform' },
  speedOfSound: 340
};
var songbird = new Songbird(audioContext, songbirdOptions);

Likewise, when creating a new Source, source-related optional arguments may be provided:

var sourceOptions = {
  position: [0, 10, 10],
  forward: [0, 0, -1],
  up: [0, 1, 0],
  minDistance: 0.1,
  maxDistance: 1000,
  rolloff: 'logarithmic',
  gain: 0.1,
  alpha: 0,
  sharpness: 1,
  sourceWidth: 0
};
var source = songbird.createSource(sourceOptions);

See the documentation for more details on all optional arguments.

Differences to PannerNode

There are several advantages to using Songbird over WebAudio’s PannerNode.

Cost

PannerNode requires two convolutions per encoded source. But because we employ ambisonics, there is a fixed cost associated with rendering from Songbird, with nominal costs per source. Developers can adjust the desired ambisonic order (from 1 to 3) to control the majority of computational costs.
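The scaling argument can be sketched with a toy cost model. It takes the two convolutions per PannerNode source from the paragraph above and assumes, purely for illustration, that the ambisonic renderer costs one convolution per ambisonic channel regardless of source count. The real rendering cost depends on Omnitone's implementation, so treat the numbers as a way to see the trend, not as measurements.

```javascript
// Two convolutions per source, as stated above for PannerNode.
function pannerConvolutions(numSources) {
  return 2 * numSources;
}

// ASSUMPTION (illustrative only): one convolution per ambisonic channel,
// with (order + 1)^2 channels, independent of the number of sources.
function ambisonicRendererConvolutions(order) {
  return (order + 1) * (order + 1);
}

// Smallest source count at which the PannerNode total exceeds the assumed
// fixed ambisonic rendering cost.
function sourcesToBreakEven(order) {
  return Math.floor(ambisonicRendererConvolutions(order) / 2) + 1;
}
```

Under this assumption, a first-order scene is cheaper to render than PannerNodes from three sources upward, and a third-order scene from nine sources upward. The per-source encoding cost, which the paragraph above describes as nominal, is ignored here.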

Quality

In addition to controlling computational costs, adjusting the ambisonic order controls the quality of spatialization (higher order typically yields better direct source localization). Additionally, we offer direct ambisonic output (which bypasses the rendering), allowing developers total control over how to render their content.

Room Acoustics

Songbird comes with room modelling effects, which include both an early and a late reflection model based on the room properties. These effects are likewise ambisonically encoded and fully spatialized.
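To give a feel for the geometry a reflection model works with, the sketch below computes the listener's distance to each of the six surfaces of a room centered at the origin, and the time sound takes to travel to a surface and back. It assumes width runs along x, height along y and depth along z, with left at -x, down at -y and front at -z. These axis assignments and the helper names are assumptions for this example; this is not Songbird's reflection algorithm.

```javascript
// Hypothetical helper: distance (meters) from the listener to each surface
// of a box-shaped room centered at the origin.
// Assumed mapping: left/right on x, down/up on y, front/back on z.
function wallDistances(dimensions, listenerPosition) {
  var halfWidth = dimensions.width / 2;
  var halfHeight = dimensions.height / 2;
  var halfDepth = dimensions.depth / 2;
  var x = listenerPosition[0];
  var y = listenerPosition[1];
  var z = listenerPosition[2];
  return {
    left: halfWidth + x,   // wall at x = -width / 2
    right: halfWidth - x,  // wall at x = +width / 2
    down: halfHeight + y,
    up: halfHeight - y,
    front: halfDepth + z,  // assumed wall at z = -depth / 2
    back: halfDepth - z
  };
}

// Time (seconds) for sound to reach a surface and return to the listener.
function roundTripDelaySeconds(distance, speedOfSound) {
  return 2 * distance / speedOfSound;
}

var distances = wallDistances({ width: 3, height: 4, depth: 5 }, [1, 0, 0]);
var rightWallDelay = roundTripDelaySeconds(distances.right, 340);
```

With the listener one meter right of center in a 3 m wide room, the right wall is 0.5 m away, so its echo returns after roughly 3 ms at 340 m/s.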

Porting PannerNode projects to Songbird

For projects already employing PannerNode, it is fairly simple to switch to Songbird. Below is a basic PannerNode example:

// Create a "PannerNode."
var panner = audioContext.createPanner();

// Initialize properties
panner.panningModel = 'HRTF';
panner.distanceModel = 'inverse';
panner.refDistance = 0.1;
panner.maxDistance = 1000;

// Connect input to "PannerNode".
audioElementSource.connect(panner);

// Connect "PannerNode" to audio output.
panner.connect(audioContext.destination);

// Set "PannerNode" and Listener positions.
panner.setPosition(x, y, z);
audioContext.listener.setPosition(x, y, z);

And below is the same example converted to Songbird:

// Create a Songbird "Source" with properties.
var source = songbird.createSource({
  rolloff: 'logarithmic',
  minDistance: 0.1,
  maxDistance: 1000
});

// Connect input to "Source."
audioElementSource.connect(source.input);

// Connect Songbird’s output to audio output.
songbird.output.connect(audioContext.destination);

// Set "Source" and Listener positions.
source.setPosition(x, y, z);
songbird.setListenerPosition(x, y, z);
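When porting, it can help to know the attenuation curve you are replacing. The sketch below implements the `inverse` distance model as defined for PannerNode in the Web Audio API specification, so the original gain at a given distance can be compared by ear or by measurement with the Songbird result. The formula Songbird uses for its `'logarithmic'` rolloff is not given in this README, so no equivalence is claimed; `inverseDistanceGain` is a helper written for this example.

```javascript
// Gain of PannerNode's 'inverse' distance model (Web Audio API spec):
//   refDistance / (refDistance + rolloffFactor * (max(d, refDistance) - refDistance))
// rolloffFactor defaults to 1, matching the PannerNode default.
function inverseDistanceGain(distance, refDistance, rolloffFactor) {
  if (rolloffFactor === undefined) {
    rolloffFactor = 1;
  }
  var clamped = Math.max(distance, refDistance);
  return refDistance / (refDistance + rolloffFactor * (clamped - refDistance));
}

// Gain one meter away with the refDistance of 0.1 used above.
var gainAtOneMeter = inverseDistanceGain(1, 0.1);
```

With `refDistance` set to 0.1 as in the PannerNode example, the gain is 1 at or inside 0.1 m and falls to about 0.1 at 1 m.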

Building

Songbird uses webpack to build the minified library and to manage dependencies.

npm install         # install dependencies.
npm run build       # build a non-minified library.
npm run watch       # recompile whenever any source file changes.
npm run build-all   # build a minified library and copy static resources.
npm run doc         # generate documentation.

Testing

Songbird uses Travis CI and the Karma test runner for continuous integration. To run the test suite locally, clone the repository, install dependencies and launch the test runner:

npm test

Note that the unit tests require the promise-based version of OfflineAudioContext, so they might not run on non-spec-compliant browsers. Songbird’s Travis CI uses the latest stable version of Chrome.

Testing Songbird Locally

Local testing with the Karma test runner requires a Chrome or Chromium-based browser. On Linux distributions without Chrome, the following setup might be necessary for Karma to run properly:

# Tested with Ubuntu 16.04
sudo apt install chromium-browser
export CHROME_BIN=chromium-browser

Local testing has not been verified on Windows.

Acknowledgments

Special thanks to Alper Gungormusler, Hongchan Choi, Marcin Gorzel, and Julius Kammerl for their help on this project.

Support

If you have found an error in this library, please file an issue at: https://github.com/Google/songbird/issues.

Patches are encouraged, and may be submitted by forking this project and submitting a pull request through GitHub. See CONTRIBUTING for more detail.

License

Copyright © 2017 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.