JUKEBOX.SI
Not to be confused with JUKEBOXW.SI
JUKEBOX.SI is an Interleaf File containing 60 audio tracks of music and radio dialogue. With a few exceptions[citation needed], all of LEGO Island's music is stored in JUKEBOX.SI.
Details
As an Interleaf file, JUKEBOX.SI is a container for a large number of asset files. Most of these files are background and radio music (including the radio voices) in Microsoft WAV audio; however, there are also 4 FLIC video files, one for each of the building rooms (dune buggy, jetski, helicopter, and race car). These FLC files are the small instructional videos played on the screen and are interleaved together as "movies" of equal length (thus looping together).
All audio in JUKEBOX.SI (like most of LEGO Island's audio assets) is mono uncompressed PCM. While the majority of tracks are sampled at 11025 Hz/16-bit, a handful are sampled at 22050 Hz/8-bit. The two can be told apart by each track's WAV header (the "fmt " section if you're familiar with WAV headers), which is left intact in the first chunk of the track's stream.
When replacing a track, the WAV "fmt " header can be transplanted directly over the existing data in that first chunk. All WAV formats are compatible. The PCM data can be transplanted too, but it must be interleaved into chunks.
Technical Information
Music appears to begin with an MxDa and is split into chunks of MxCh. The MxDa header contains information about the PCM audio in the MxCh chunks. The first MxCh appears to be information about the remainder of the chunks in the MxDa structure.
All multi-byte integers are little-endian, as is normal for RIFF-based files.
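As a small illustration of the byte order, the sketch below decodes a 4-byte size field with Python's standard struct module. The four bytes are a made-up value, not taken from JUKEBOX.SI.

```python
import struct

# Four raw bytes as they might follow a chunk identifier (made-up value).
raw = bytes([0x10, 0x27, 0x00, 0x00])

# "<I" means little-endian unsigned 32-bit integer: least significant byte first.
size = struct.unpack("<I", raw)[0]

# int.from_bytes gives the same result without struct.
same = int.from_bytes(raw, "little")

print(size, same)  # 10000 10000
```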
Extracting Audio
- Audio streams can be located in JUKEBOX.SI by searching for " WAV" (note the leading space).
- A few bytes before the " WAV" will be the original filename of the WAV file prior to being imported into the SI file if you wish to retrieve that too.
- A few bytes later will be "LIST", which appears to specify an array (or "list") of chunks that make up one audio track. The next 4 bytes will be a 32-bit integer for the total size of this "LIST", in other words the number of upcoming bytes of the SI file that belong to this particular audio track.
- The first MxCh after the "LIST" will contain WAV-compatible header data, most of which can be transplanted directly into a WAV file (see below for details).
- Every MxCh after this one will contain PCM audio data (formatted according to the header data in the first MxCh). Each MxCh has a 22-byte header that will need to be stripped out when extracting. After the 4-byte "MxCh" identifier, the header contains a 4-byte integer giving the number of bytes that the chunk takes up (not counting the 8 bytes of the "MxCh" identifier and the chunk size integer itself). All data after this 22-byte header is PCM audio that will be exactly "chunk size - 14" bytes in size (14 is the size of the 22-byte header minus its first 8 bytes).
- Each MxCh's data can be dumped until you reach the end of the "LIST" size extracted above. At that point the end of the track has been reached and the process must be repeated to extract the next track.
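The steps above can be sketched in Python roughly as follows. This is a minimal sketch rather than a tested extractor: the function names are hypothetical, it assumes any padding between chunks can be skipped by searching for the next "MxCh" identifier, and the demonstration runs on a small synthetic buffer rather than on the real file.

```python
import struct

MXCH_HEADER_SIZE = 22  # 4-byte "MxCh" + 4-byte chunk size + 14 further header bytes


def read_u32(data, offset):
    """Read a little-endian unsigned 32-bit integer."""
    return struct.unpack_from("<I", data, offset)[0]


def extract_track(data, start=0):
    """Return (header_payload, pcm_bytes, end_offset) for the next track, or None."""
    wav_pos = data.find(b" WAV", start)
    if wav_pos == -1:
        return None
    list_pos = data.find(b"LIST", wav_pos)
    if list_pos == -1:
        return None
    list_size = read_u32(data, list_pos + 4)
    list_end = list_pos + 8 + list_size  # the size counts the bytes after the size field

    payloads = []
    pos = data.find(b"MxCh", list_pos + 8, list_end)
    while pos != -1:
        chunk_size = read_u32(data, pos + 4)
        chunk_end = pos + 8 + chunk_size
        # Strip the 22-byte header; what remains is "chunk size - 14" bytes.
        payloads.append(data[pos + MXCH_HEADER_SIZE:chunk_end])
        # Assumption: searching from the chunk end skips any padding bytes.
        pos = data.find(b"MxCh", chunk_end, list_end)
    if not payloads:
        return None
    # First MxCh holds the WAV-compatible header, the rest hold PCM data.
    return payloads[0], b"".join(payloads[1:]), list_end


def make_chunk(payload):
    """Build a synthetic MxCh; 14 zero bytes stand in for the rest of the header."""
    body = bytes(14) + payload
    return b"MxCh" + struct.pack("<I", len(body)) + body


# Synthetic track: mono, 11025 Hz, 16-bit, with four bytes of fake PCM.
fmt = struct.pack("<HHIIHH", 1, 1, 11025, 22050, 2, 16)
body = b"MxDa" + make_chunk(fmt) + make_chunk(b"\x01\x02") + make_chunk(b"\x03\x04")
blob = b"song.wav\x00 WAV\x00" + b"LIST" + struct.pack("<I", len(body)) + body

header, pcm, end = extract_track(blob)
print(len(header), pcm, end == len(blob))
```

To walk a whole file, the returned end offset can be passed back in as start until the function returns None.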
Header
NOTE: This information is incomplete and requires more research and information.
As mentioned above, the first MxCh in a "LIST" contains solely header data. Most of this data is completely compatible with the specification for WAV.
Field | Offset | Description |
---|---|---|
MxDa | 0 | Identifier |
MxCh | 4 | Chunk Header |
Chunk Size | 8 | 4-byte Integer |
Sub-Chunk Size | 22 | 4-byte Integer - The remaining size of this chunk after this value |
Audio Format | 26 | 2-byte Integer - 1 = PCM, others indicate some form of compression |
Number of Channels | 28 | 2-byte Integer - 1 = Mono, 2 = Stereo |
Sample Rate | 30 | 4-byte Integer |
Byte Rate | 34 | 4-byte Integer - equal to Sample Rate * Number of Channels * Bits per Sample / 8 |
Bytes per Sample | 38 | 2-byte Integer - equal to Number of Channels * Bits per Sample / 8 |
Bits per Sample per Channel | 40 | 2-byte Integer - 8 = 8-bit, 16 = 16-bit, etc. |
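A minimal sketch of reading these fields in Python follows. It assumes the six format fields are packed back to back starting at offset 26 (relative to the "MxDa" identifier), using the field sizes given in the table. The names WaveFormat and parse_format are hypothetical, and the sample block is synthetic.

```python
import struct
from collections import namedtuple

# Hypothetical container for the six WAV-compatible fields.
WaveFormat = namedtuple(
    "WaveFormat",
    "audio_format channels sample_rate byte_rate bytes_per_sample bits_per_sample",
)

FORMAT_OFFSET = 26  # offset of "Audio Format" relative to "MxDa"


def parse_format(block):
    """Parse the format fields from a block that starts at the "MxDa" identifier."""
    if block[0:4] != b"MxDa" or block[4:8] != b"MxCh":
        raise ValueError("block does not start with MxDa followed by MxCh")
    # "<HHIIHH": little-endian 2, 2, 4, 4, 2, 2 bytes = 16 bytes in total.
    return WaveFormat(*struct.unpack_from("<HHIIHH", block, FORMAT_OFFSET))


# Synthetic block: mono, 22050 Hz, 8-bit PCM. The 10 zero bytes are placeholders
# for header bytes at offsets 12-21 whose meaning is not covered here.
fmt = struct.pack("<HHIIHH", 1, 1, 22050, 22050, 1, 8)
block = (
    b"MxDa"
    + b"MxCh"
    + struct.pack("<I", 14 + len(fmt))  # Chunk Size at offset 8
    + bytes(10)
    + struct.pack("<I", len(fmt))       # Sub-Chunk Size at offset 22
    + fmt                               # format fields from offset 26
)

info = parse_format(block)
print(info.sample_rate, info.bits_per_sample)  # 22050 8

# Sanity check from the table: Byte Rate = Sample Rate * Channels * Bits / 8.
expected_rate = info.sample_rate * info.channels * info.bits_per_sample // 8
```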
Transplanting the Header
Comparing against the WAV file format header specification shows that the 16 bytes from "Audio Format" to "Bits per Sample per Channel" are identical to the corresponding WAV fields. This makes up most of the WAV header data (apart from the file and chunk sizes, which cannot be determined from here) and can be directly transplanted to make extraction easier and to ensure the sample rate and sample size are correct in the extraction.
Note that the MxCh header contains a few more bytes after "Bits per Sample per Channel" and therefore its "Sub-Chunk Size" is larger than the average WAV file's. These extra bytes should be ignored and not transplanted, though if they are, the "Sub-Chunk Size" should be transplanted too (or at least increased to accommodate them).
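As an illustration of the transplant, the sketch below wraps the 16 format bytes and some PCM data in a standard RIFF/WAVE layout, filling in the file and chunk sizes that the SI file does not provide. The name build_wav is hypothetical and the input bytes are synthetic. The extra MxCh bytes are dropped, so the "fmt " sub-chunk size is written as 16.

```python
import io
import struct
import wave


def build_wav(fmt16, pcm):
    """Wrap 16 transplanted format bytes and raw PCM in a minimal WAV file."""
    if len(fmt16) != 16:
        raise ValueError("expected exactly 16 format bytes")
    # RIFF chunks are padded to an even length; the pad byte is not counted
    # in the data chunk's own size field.
    pad = b"\x00" * (len(pcm) % 2)
    fmt_chunk = b"fmt " + struct.pack("<I", 16) + fmt16
    data_chunk = b"data" + struct.pack("<I", len(pcm)) + pcm + pad
    riff_body = b"WAVE" + fmt_chunk + data_chunk
    return b"RIFF" + struct.pack("<I", len(riff_body)) + riff_body


# Synthetic input: mono, 11025 Hz, 16-bit, four silent frames.
fmt16 = struct.pack("<HHIIHH", 1, 1, 11025, 22050, 2, 16)
pcm = bytes(8)
wav_bytes = build_wav(fmt16, pcm)

# Read the result back with the standard library to check it is well formed.
with wave.open(io.BytesIO(wav_bytes), "rb") as w:
    params = (w.getnchannels(), w.getsampwidth(), w.getframerate(), w.getnframes())

print(len(wav_bytes), params)  # 52 (1, 2, 11025, 4)
```

Writing wav_bytes to a file with a .wav extension should give something ordinary audio tools can open.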