The Sequence element and the Web Audio API
The Sequence element requires JavaScript because that is the only way to access the Web Audio API.
Multimedia files can take up a lot of space and site bandwidth, and can take a while to load, especially on slow devices. My spouse had recorded some meditations, but they took up a lot of space, even though the actual talking amounted to only a fraction of the running time and the background music could easily have been loops. After a short search I came across the Audio API, and it seemed able to do what I wanted.
I wanted the Sequence element to handle images as slides, along with captions, as that opened up the opportunity for it to replace many videos that were just screenshots with voiceovers, enabling them to be served from a site rather than from YouTube. The problem was that the Audio API only handles audio, so some ingenuity was needed to schedule images alongside it.
Audio is played by specifying start and end times measured from the start of the sequence, with an event triggered when playback completes. The Audio API provides an oscillator, which I didn't need for anything else, so I use one for each non-audio element, specifying a short duration and not connecting it to any output. I trap its ended event, which loads and displays the image or text. Timing is specified to the second, whereas the Audio API is far more precise, but for slideshow purposes, seconds are fine.
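The oscillator-as-timer arrangement can be sketched with the Web Audio API as follows; `scheduleVisual` and `showElement` are hypothetical names for illustration, not the actual Smallsite Design code:

```javascript
// A silent, disconnected oscillator used purely as a scheduling timer for a
// non-audio element (image, caption, or pointer move). The oscillator is
// never connected to any output, so it makes no sound; its ended event fires
// when it stops and triggers the display change.
function scheduleVisual(ctx, startSeconds, showElement) {
  const osc = ctx.createOscillator();
  osc.onended = () => showElement();               // fires when the oscillator stops
  osc.start(ctx.currentTime + startSeconds);
  osc.stop(ctx.currentTime + startSeconds + 0.01); // a short duration is all that is needed
  return osc;
}
```

In a browser, `ctx` would be an `AudioContext`; because nothing is connected to the output, the only observable effect is the `ended` callback.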
Later on I realised that a pointer would prove useful. In a video of screenshots, the mouse cursor moves around, but with still images, a pointer can move around using far less bandwidth than 24 frames per second. It uses the same oscillator arrangement as images and captions. With the pointer, the Sequence element is a reasonable replacement for many instructional videos, with the bonus that catering for another locale is as simple as adding its captions and audio to the existing file sets, rather than making a whole new video.
There are group elements that can contain elements sequenced relative to the group's start, so changing a group's start time shifts all its elements together. Originally I allowed a group to be looped, but realised that could too easily result in images and text clashing with each other, while making debugging difficult. Instead, only an audio element that is a direct child of a part element can be looped, which makes background music loops easy and reduces the storage size substantially.
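A looped background-music element can be sketched like this; the function name and the direct connection to the destination are assumptions for illustration, with the `loop` flag doing the work of repeating a short music file for the whole part:

```javascript
// A looping buffer source for background music: a short loop file repeats
// until the scheduled stop time, so only a few seconds of audio need be stored.
function playBackgroundLoop(ctx, audioBuffer, startSeconds, stopSeconds) {
  const src = ctx.createBufferSource();
  src.buffer = audioBuffer;
  src.loop = true;                         // repeat the short loop seamlessly
  src.connect(ctx.destination);            // in practice, routed via a channel gain
  src.start(ctx.currentTime + startSeconds);
  src.stop(ctx.currentTime + stopSeconds); // ends with the part
  return src;
}
```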
With the Audio API, the total duration cannot be calculated until all the audio is loaded, but doing that up front could severely delay page loading if there are many such elements on a page. The files for a part are therefore not loaded until the part is played, with the start delayed by the greater of two seconds or the time to load the files, during which the part title is displayed. Once the image files are loaded, they are in the browser cache, while the complete audio files are held in buffers.
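A sketch of that lazy-loading rule, assuming `fetch` and `decodeAudioData` are used for loading; the minimum delay is a parameter here so it can be varied, defaulting to the two seconds described:

```javascript
// Load a part's audio files only when the part is played. The returned
// promise resolves after both the decoding work and a minimum delay (during
// which the part title would be displayed) have completed.
async function loadPart(ctx, urls, minDelayMs = 2000) {
  const minDelay = new Promise(resolve => setTimeout(resolve, minDelayMs));
  const buffers = Promise.all(urls.map(async url => {
    const response = await fetch(url);
    return ctx.decodeAudioData(await response.arrayBuffer());
  }));
  const [, decoded] = await Promise.all([minDelay, buffers]);
  return decoded; // one AudioBuffer per file, held in memory for the part
}
```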
Originally, parts played to completion without any ability to jump around. For extra granularity, I allow up to 12 parts, with an Auto-next facility to automatically play the next part. While that allowed parts to be shorter, it hardly allowed for jumping back 10 seconds to catch something that was missed. Having all the content in local memory enabled navigation buttons that recalculate the times and immediately play from the new start time, adjusting the timecode display as required and making sure the correct elements are displayed where they should be at that time.
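The recalculation behind those navigation buttons can be expressed as a pure function; the field names here are hypothetical, but the idea is to derive each element's delay relative to the new start time and flag the elements that should already be showing:

```javascript
// Given elements with absolute start/end times (seconds from the part start),
// compute their schedule relative to a new playback position. A negative
// delay means the element began before the jump point; active marks elements
// that should be displayed (or resumed part-way through) immediately.
function rescheduleFrom(elements, newStart) {
  return elements.map(el => ({
    ...el,
    delay: el.start - newStart,
    active: el.start <= newStart && newStart < el.end,
  }));
}
```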
The Audio API is recommended for audio clips of less than 45 seconds. The maximum timecode that can be set for any element in a part is 59:59, but any audio started will play to completion, making the maximum possible duration of a part about 1:00:45, and of a 12-part sequence over 12 hours.
The Audio API has gain nodes and functions to ramp levels up and down, so audio elements have been provided with selectable fade-in and fade-out times, along with pan settings. There are up to four channels with gain controls, plus a master gain. Compressors on each channel and on the master output keep a limit on levels. Audio elements don't have their own gain control, but are routed to a channel and controlled by its gain. The facilities provided are less than what the Audio API is capable of, but I had to keep the element conceptually simple.
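A minimal sketch of one channel strip and a fade-in, assuming the wiring described (per-channel gain into a compressor, then on to the master chain); the node layout is my reading of the description, not the actual implementation:

```javascript
// One channel strip: a gain node feeding a compressor that keeps levels in
// check, routed on to the shared master gain node.
function buildChannel(ctx, masterGain) {
  const gain = ctx.createGain();
  const compressor = ctx.createDynamicsCompressor();
  gain.connect(compressor).connect(masterGain); // connect() returns its target, so chaining works
  return gain;
}

// Fade an element in by ramping its channel's gain from silence to full level.
function fadeIn(ctx, gainNode, seconds) {
  const now = ctx.currentTime;
  gainNode.gain.setValueAtTime(0, now);
  gainNode.gain.linearRampToValueAtTime(1, now + seconds);
}
```

Fade-out is the mirror image: a ramp from the current level down to zero, ending at the element's stop time.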
The audio clips for typical instructional or meditation uses can be edited to severely reduce background noise, which requires far less skill than the audio engineering needed for normal mixing. A disembodied voice does not need to sound like it is in a room, because there are no visuals of the person in their location for the sound to match. Once the clips are cleaned up and imported into Smallsite Design, a person with some technical skill can set up a Sequence element, or get someone to set it up, then tweak the timing and levels themselves.