Play HTML5 audio in the browser
Playing HTML5 audio natively in the browser can be a challenge. Ashley Gullen, creator of HTML5 game editor Construct 2, guides you through the maze
This article first appeared in the May issue (227) of .net magazine – the world's best-selling magazine for web designers and developers.
Traditionally, anything audio-based on the web has played sound through a Flash applet. Now HTML5 is set to take over, and with Adobe stopping development of Flash for mobile, many Flash developers are looking for ways to do in HTML5 what they used to do in Flash.
HTML5 is great, but audio is one of the areas that’s still a work in progress. So, how does one play audio in HTML5? There are several options. One is the <audio> tag, supported by all major browsers. You can call this from JavaScript as easily as new Audio('music.ogg').play(). But it’s designed for streaming music, and some browsers have issues with playback when it’s used heavily for things such as game sound effects. Here are a few more useful methods and properties:
var a = new Audio('sound.ogg'); – create new audio
a.play(); – start playing
a.pause(); – stop playing
a.currentTime = 0; – rewind to beginning
a.duration; – returns length
a.ended; – returns true if finished
a.loop = true; – set looping
a.volume = 0.5; – half volume
a.muted = true; – mute
a.addEventListener('ended', func); – call ‘func()’ when finishes playing
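As a rough illustration of how these members combine, the sketch below restarts a sound effect from the beginning each time it is triggered. The helper name restartSound and the file name in the usage comment are hypothetical; the function only assumes an object with the pause(), play(), currentTime and volume members listed above.

```javascript
// Hypothetical helper (not a browser API): restart a sound from the top.
// 'audio' is assumed to be an audio element, e.g. new Audio('sound.ogg').
function restartSound(audio, volume) {
  audio.pause();          // stop any playback in progress
  audio.currentTime = 0;  // rewind to the beginning
  if (typeof volume === 'number') {
    // volume must stay between 0 and 1, so clamp the requested value
    audio.volume = Math.min(1, Math.max(0, volume));
  }
  return audio.play();    // start playing again
}

// Usage in a browser (placeholder file name):
// var laser = new Audio('sound.ogg');
// restartSound(laser, 0.5);
```

Passing the element in, rather than creating it inside the helper, means one loaded sound can be reused for every trigger.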
Since it was designed for music, the <audio> tag will always stream the audio from the server, so it may start playing before being fully downloaded. For a full reference of the <audio> tag it’s best to go direct to the spec – look up HTMLMediaElement – or as always, the Mozilla Developer Network (MDN) has a good summary.
Realising the <audio> tag’s shortcomings, Google built the Web Audio API: a full audio engine with powerful routing, effects, sample-accurate playback and more.
It’s ideal for game audio, fairly straightforward and much more powerful – but takes more code to get going and is currently only a W3C proposal. It’s also only available in Chrome.
The following code shows how to use Ajax to request a sound and play it every two seconds with the Web Audio API. Note the webkit prefix will be dropped from AudioContext if it’s standardised:
// This will just be 'AudioContext' if standardised
if (typeof webkitAudioContext == "undefined")
  alert("Web Audio API not supported");

// The context handles audio playback.
var context = new webkitAudioContext();

// A sound buffer.
var dogBarkingBuffer = null;

// Load dogbarking.ogg to the dog barking buffer.
var request = new XMLHttpRequest();
request.open('GET', 'dogbarking.ogg', true);
request.responseType = 'arraybuffer';

// Decode asynchronously
request.onload = function() {
  context.decodeAudioData(request.response, function(buffer) {
    dogBarkingBuffer = buffer;
  });
};
request.send();

// Function to play one instance of the
// dog barking from the buffer.
function playDogBarking() {
  if (!dogBarkingBuffer)
    return; // not loaded yet

  // Buffer sources are throwaway objects
  var source = context.createBufferSource();
  source.buffer = dogBarkingBuffer;
  source.connect(context.destination);
  source.noteOn(0);
}

// Set a timer to play the sound every 2 seconds.
setInterval(playDogBarking, 2000);
Unlike the <audio> tag, this example can only play the sound once fully downloaded (when the Ajax request completes). But playback is instant and many sounds can be played without choking the browser, since the API was designed for this use. For more about the Web Audio API, check the HTML5 Rocks tutorial (the above code is based on this) or view the proposed spec for the full lowdown.
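Since the prefix is expected to go away if the API is standardised, a slightly more defensive variant of the playback code might look like the sketch below. It assumes the unprefixed constructor is called AudioContext and that buffer sources gain a start() method in place of noteOn(), as later drafts of the specification describe. The function names getAudioContextConstructor and playBuffer are hypothetical.

```javascript
// Return whichever constructor the browser exposes, or undefined if neither
// exists. 'globalObj' is normally window; it is passed in as a parameter so
// the function can be exercised outside a browser.
function getAudioContextConstructor(globalObj) {
  return globalObj.AudioContext || globalObj.webkitAudioContext;
}

// Play one instance of a decoded buffer through the given context.
function playBuffer(context, buffer) {
  if (!buffer) return null;                   // not loaded yet
  var source = context.createBufferSource();  // throwaway source object
  source.buffer = buffer;
  source.connect(context.destination);
  // Assumption: start() is the standardised name, noteOn() the older one.
  if (typeof source.start === 'function') {
    source.start(0);
  } else {
    source.noteOn(0);
  }
  return source;
}

// Usage in a browser:
// var Ctor = getAudioContextConstructor(window);
// if (Ctor) { var context = new Ctor(); /* load, decode, then playBuffer */ }
```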
Mozilla, unhappy with the Web Audio API, has proposed the MediaStream Processing API. This is a general API for audio and video and it’s still a draft. No browsers support it yet, but it has cool features such as audio capture and using a canvas as a live video stream. Firefox also supports the older Audio Data API, but it really just allows writing samples to an audio buffer. This is too low level to be of practical use when developing advanced apps such as games.
Some devs still use Flash to play audio with such shims as SoundManager 2. On mobile, native-app wrappers such as PhoneGap and AppMobi can provide their own APIs too.
These may work for you, but we’re interested in pure HTML5! So for now, the best approach is to use the <audio> tag, but you may also want to use the Web Audio API in Chrome for more reliable playback.
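One way to express that choice in code is a small feature check that prefers the Web Audio API where it exists and falls back to the <audio> tag otherwise. This is only a sketch: the function name chooseAudioBackend and the returned labels are made up for illustration, and it assumes the unprefixed name AudioContext may appear alongside webkitAudioContext.

```javascript
// Hypothetical helper: decide which playback path to use.
// 'globalObj' is normally window; it is a parameter so the check can be
// run against any object.
function chooseAudioBackend(globalObj) {
  if (globalObj.AudioContext || globalObj.webkitAudioContext) {
    return 'webaudio';   // Web Audio API available
  }
  if (typeof globalObj.Audio !== 'undefined') {
    return 'audiotag';   // fall back to the <audio> tag
  }
  return 'none';         // no native audio support detected
}

// Usage in a browser:
// var backend = chooseAudioBackend(window);
```

Checking for the feature rather than the browser name means the same code keeps working if other browsers add the API later.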
Codec circus
You may have noticed the .ogg extension in the example earlier, which was no mistake! This is the free, open Ogg Vorbis audio format, whereas Flash typically has used MP3s. Unfortunately, this brings us to another pain point in HTML5 audio: there is no single audio format that all browsers support.
There are actually five main candidates: uncompressed wave files, MPEG-1 layer 3 (MP3), MPEG-4 AAC (or just AAC for short, but sometimes also called MP4), Ogg Vorbis, and WebM. Uncompressed wave we can discount straight away since the files tend to be too large for web delivery – you’ll leave users waiting for ages for them to load! Ogg Vorbis is similar to MP3 in file sizes and quality, but unpatented and free for anyone to use. WebM is free and open like Ogg Vorbis, and actually just stores Vorbis in a different container, so let’s just consider it equivalent to Ogg Vorbis for now. Ogg Vorbis also has the advantage of being more established and well-known with wider support, since it’s now about 10 years old versus the two-year-old WebM.
MPEG-4 Advanced Audio Coding (or AAC for short) is a patented format that isn’t entirely free to use – you must pay fees and royalties if you distribute encoders or decoders. This isn’t a problem for most people developing for the web, but it can cause issues for tool makers. For example, if you want to make an HTML5 library or editor that helps you encode audio to AAC, you may need to pay. You can still privately encode AAC files and distribute them on the web for free, though – providing the patent owners don’t change their mind. Finally there’s the familiar MP3, but the licensing may make it unusable: the MP3 licensing website states there is a fee of $2500 for distributing a game using MP3 files! AAC is free to distribute in games and is a better quality format, so there’s no reason to consider MP3 and risk it – just skip on right over to AAC (see What’s wrong with MP3, overleaf for further details).
In short, the only real candidates are Ogg Vorbis and AAC. Chrome, Firefox and Opera support Ogg Vorbis. Chrome, Safari and Internet Explorer support AAC. So if you only use Ogg Vorbis you miss Safari and Internet Explorer, and if you only use AAC you miss Firefox and Opera. Whoops! Sadly the reality is you have to dual-encode all your audio to both Ogg Vorbis and AAC.
In your JavaScript, you can detect if the browser supports Ogg Vorbis with the following line:
var canPlayOgg = !!(new Audio().canPlayType('audio/ogg; codecs="vorbis"'));
If canPlayOgg is true, load the .ogg version of your files; otherwise, load the .m4a (the typical extension for AAC).
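Wrapped up as a function, that check might look like the following sketch. The name pickAudioFile is hypothetical, and it assumes both encodings share a base file name and differ only in extension.

```javascript
// Hypothetical helper: choose the .ogg or .m4a version of a sound.
// 'probe' is any object with canPlayType(), normally new Audio().
function pickAudioFile(baseName, probe) {
  // canPlayType() returns '', 'maybe' or 'probably'; '' means unsupported.
  var canPlayOgg = !!probe.canPlayType('audio/ogg; codecs="vorbis"');
  return baseName + (canPlayOgg ? '.ogg' : '.m4a');
}

// Usage in a browser:
// var file = pickAudioFile('dogbarking', new Audio());
// new Audio(file).play();
```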
With browser-based technology, codec support is the same for all APIs (for example Chrome plays the same formats with the <audio> tag and the Web Audio API). With non browser-native solutions, supported formats will differ!
Why not one format?
There’s a bit of a standoff with browser makers at the moment over the audio format support. Mozilla rightly points out that the web is successful owing to its free and open nature. The internet may never have taken off if there were patents and fees on using the many technologies involved: HTML, the DOM, JavaScript, PNG, HTTP, TCP/IP and the browser itself, to name a few. On that basis, it refuses to add a patented format to an otherwise free and open web.
There is no technical reason that Internet Explorer and Safari don’t support Ogg Vorbis. It’s a free and open format, and any browser maker can implement it whenever it chooses to. The reason for the stalemate is purely political. Microsoft and Apple build their businesses around proprietary technology. The issue stems from a bigger war over video formats, and the supported audio format will tend to be whatever comes with the video. Video is out of the scope of this article, but Apple and Microsoft probably stand to make money off the web by using the proprietary formats, hence their preference. Both sides like to suggest that plug-ins can be used to support additional formats, but obviously this defeats the plug-in free experience of HTML5, so it isn’t really an option.
So this is a mess for HTML5 developers, and something Flash developers didn’t have to deal with. It is a little inconvenient but dual-encoding to both formats solves the problem. The user will only need to download one set of files depending on their browser’s support, so it won’t waste your bandwidth.
The best format for HTML5 developers is certainly Ogg Vorbis, for the reasons Mozilla points out: you can use it freely and unrestricted without any concern over patents or licences. In fact, the HTML5 standard itself used to recommend Vorbis for audio and Theora for video in the Ogg container. But pressure from some of the companies involved forced the recommendation to be dropped and now the HTML5 standard does not mention any formats, which has allowed this situation to develop.
It’s a shame, because the free formats are undoubtedly best for the web. HTML5 developers can help the situation by pushing Microsoft and Apple to add support for Ogg Vorbis to their browsers. Make sure they know it’s your preference! With just one format covering the whole web it will be much more straightforward to get audio working in your HTML5 games and apps, and that will make life easier for everyone.
Mobile
Mobile devices’ browsers are some way behind their desktop counterparts. As well as having weaker hardware, some can only play one sound at a time – making for a limited audio experience. iOS adds a further restriction: audio can only be played in response to a user-initiated action such as a button press, apparently to prevent autoplaying media using too much bandwidth on metered cellular networks.
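A common way to live with that restriction is to start playback from inside the first input event the page receives. The sketch below shows the idea. The name playOnFirstGesture is hypothetical, and which events a given browser treats as user-initiated is an assumption you would need to verify on the device.

```javascript
// Hypothetical helper: call audio.play() from inside the first event of the
// given type, then stop listening. 'target' is any object with
// addEventListener/removeEventListener, such as document or a button.
function playOnFirstGesture(target, eventName, audio) {
  function onGesture() {
    // Remove the listener first so playback is only triggered once.
    target.removeEventListener(eventName, onGesture, false);
    audio.play();
  }
  target.addEventListener(eventName, onGesture, false);
}

// Usage in a browser (event name and file name are assumptions):
// playOnFirstGesture(document, 'touchend', new Audio('music.m4a'));
```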
Services such as PhoneGap and AppMobi offer native-app wrappers around HTML5 apps for mobile. Not only do these enable you to publish to various phones’ app stores, they can also provide ways around the device’s HTML5 audio limitations. Look into PhoneGap’s Media API or SoundPlug plug-in and AppMobi’s Player API.
Conclusion
In short, use the <audio> tag but switch to the Web Audio API in Chrome. Encode to both AAC and Ogg Vorbis and you’ll cover all browsers with as little patent/licensing worry as possible.
While it’s best to support as many browsers as possible, in some situations you may be comfortable with just one. For example, if you specifically target Google Chrome, development will be easier since you can use the Web Audio API and Ogg Vorbis audio exclusively. You’ll need to weigh the development benefits against having a smaller audience who can hear sound, though.
At this time there’s no clear direction for HTML5 audio: mobile support needs to improve and it’s not clear whether the Web Audio API will be standardised. Personally I hope it will be, because I’m interested in HTML5 games and the Web Audio API is great for them. Mozilla’s MediaStream Processing API could catch on, or at least become a better alternative to the <audio> tag in Firefox. Hopefully Microsoft and Apple will eventually support some form of Vorbis in their browsers, but it’s far from clear – and let’s hope neither MP3 nor AAC ends up taking over instead. So with HTML5 you’re going to have to keep up to date: the second browser wars are in full swing and things are changing quickly, so be on the lookout for changes and new technologies emerging.