Sound Pipeline Tools

Introduction

The "tool" portion of the AX Sound Pipeline converts standard, artist-generated WAV or AIFF files into the SP data format (SPD). Each source file contains a single sound effect to be used by the game. Multiple sound effects are packed into a single SP data file to simplify loading and manipulation of the sound data at runtime.

The tool also generates reference information associated with each SPD file. By using the SP runtime library (see "Game engine programming with the AX Sound Pipeline"), games can reference the sound effects packed within the SPD file.

The import process is controlled by a script that the tool parses for filenames, sample attributes, and conversion directives. The import process supports the following conversion directives:

Architecture

The Sound Pipeline tool consists of a single executable (sndconv.exe) and two Windows libraries (SOUNDFILE.DLL and DSPTOOL.DLL).

Basic architecture and data flow for sndconv

The SOUNDFILE DLL encapsulates functions for reading and writing WAV and AIFF files. Developers can extend this library to support other formats.

The DSPTOOL DLL handles compression of PCM samples into the Nintendo Cafe DSP-ADPCM format.

NOTE:
The compression algorithm is proprietary and therefore source code is not provided for this module.

The sndconv program encapsulates the remainder of the tool functions:

The sections below describe the SOUNDFILE, DSPTOOL, and sndconv modules in greater detail.

The SOUNDFILE DLL

SOUNDFILE is a WIN32 dynamic link library (DLL). This library abstracts the task of reading and writing sound files into a high-level API.

This library currently supports:

Developers are free to extend this library to support other file formats.

Source Code

The source code to build the DLL is not provided with the SDK. Four prebuilt DLLs are supplied; their names, configurations, and locations are listed in the table below:

File              Configuration  Location
soundfile.dll     Release        $CAFE_ROOT/system/bin/win32/
soundfileD.dll    Debug          $CAFE_ROOT/system/bin/win32/
soundfile64.dll   Release        $CAFE_ROOT/system/bin/win64/
soundfileD64.dll  Debug          $CAFE_ROOT/system/bin/win64/

The header file soundfile.h is also provided in the two directories above. Use it to build an application that calls the SOUNDFILE DLL.

NOTE:
To invoke SOUNDFILE.DLL from an application, you must include the soundfile.h header file in the application.

Data Abstraction

The SOUNDFILE library defines the SOUNDINFO structure as an internal, intermediate descriptor for sound data as it traverses the Sound Pipeline.

NOTE:
The SOUNDINFO structure does not encapsulate the sound data itself. The actual sample data are stored in another buffer, provided by the calling application.

The structure is defined below (given in soundfile.h):

typedef struct
{
	
    int     channels;           // Number of channels
    int     bitsPerSample;      // Number of bits per sample
    int     sampleRate;         // Sample rate in Hz
    int     samples;            // Number of samples
    int     loopStart;          // 1 based sample index for loop start
    int     loopEnd;            // 1 based sample count for loop samples
    int     bufferLength;       // buffer length in bytes

} SOUNDINFO;
Parameter Description
channels Specifies the number of interleaved sound channels present in the sound data. Mono data have 1 channel, while stereo data have 2.
bitsPerSample Specifies the size of each individual sample. Currently, only 8-bit and 16-bit sample sizes are supported.
sampleRate The base sampling frequency of the sound data, in Hz.
samples Specifies the number of samples, per channel.
loopStart Specifies the sample at which a loop, if any, begins.
NOTE:
Samples are counted from 1. If no loop exists in a file, then this value is set to the first sample (1).
loopEnd Specifies the sample at which a loop, if any, ends.
NOTE:
Samples are counted from 1. If no loop exists in a sound file, then this value is set to the last sample in the file.
bufferLength Specifies the length of the buffer required to hold the sample data, in bytes.

Apart from the SOUNDINFO structure, the header file provides prototypes for the exported functions and typedefs for the function pointers that an application can use to call them.

Functions

The SOUNDFILE DLL exports the following functions:

Using SOUNDFILE

The following code sample illustrates how to load the SOUNDFILE DLL from an application:

#include <windows.h>
#include <stdio.h>
#include "soundfile.h"

int main(void)
{
    HINSTANCE          hDLL;
    soundFileFnType1   getSoundInfo;                  // soundFileFnType1 is defined in soundfile.h
    soundFileFnType2   getSoundSamples;               // soundFileFnType2 is defined in soundfile.h
    soundFileFnType3   writeWaveFile, writeAiffFile;  // soundFileFnType3 is defined in soundfile.h

    // load the DLL
    hDLL = LoadLibrary("soundfile.dll");
    if (!hDLL)
    {
        printf("Cannot load soundfile.dll\n");
        return 1;
    }

    // resolve the exported functions
    getSoundInfo    = (soundFileFnType1)GetProcAddress(hDLL, "getSoundInfo");
    getSoundSamples = (soundFileFnType2)GetProcAddress(hDLL, "getSoundSamples");
    writeWaveFile   = (soundFileFnType3)GetProcAddress(hDLL, "writeWaveFile");
    writeAiffFile   = (soundFileFnType3)GetProcAddress(hDLL, "writeAiffFile");

    if (!getSoundInfo || !getSoundSamples || !writeWaveFile || !writeAiffFile)
    {
        printf("GetProcAddress error\n");
        FreeLibrary(hDLL);   // release the DLL before bailing out
        return 1;
    }

    // do stuff here

    // free the library once you are done
    FreeLibrary(hDLL);
    return 0;
}

In general, the SOUNDFILE DLL provides an API and abstraction layer for reading and writing sound data to and from arbitrary data formats.

Basic data flow of SOUNDFILE DLL

Some notes about the SOUNDFILE data abstraction layer:

The DSPTOOL DLL

DSPTOOL is a WIN32 dynamic link library (DLL). It provides an API for encoding and decoding 16-bit PCM samples to and from the DSP-ADPCM compression format.

The DSP-ADPCM sample format provides (approximately) 3.5:1 compression and is proprietary to the Nintendo Cafe audio DSP. The audio DSP contains special hardware to decompress DSP-ADPCM samples "for free."

Source Code

The source code to build the DLL is not provided with the SDK. Four prebuilt DLLs are supplied; their names, configurations, and locations are listed in the table below:

File            Configuration  Location
dsptool.dll     Release        $CAFE_ROOT/system/bin/win32/
dsptoolD.dll    Debug          $CAFE_ROOT/system/bin/win32/
dsptool64.dll   Release        $CAFE_ROOT/system/bin/win64/
dsptoolD64.dll  Debug          $CAFE_ROOT/system/bin/win64/

The header file dsptool.h is also provided in the two directories above. Use it to build an application that calls dsptool.dll.

Data Abstraction

The DSPTOOL library defines the ADPCMINFO structure to describe sound data encoded in the DSP-ADPCM format. The structure is given below (in dsptool.h):

typedef struct
{

// initial state
        s16 coef[16];
        u16 gain;
        u16 pred_scale;
        s16 yn1;
        s16 yn2;

// loop context
        u16 loop_pred_scale;
        s16 loop_yn1;
        s16 loop_yn2;

} ADPCMINFO; 

Notes

Apart from the above structure, dsptool.h also declares the prototypes for the functions in dsptool.dll and the function-pointer typedefs (dspToolFnTypeX) that an application can use to call them.

Functions

The DSPTOOL library exports the following functions:

Using DSPTOOL

The following code sample illustrates how to load the DSPTOOL DLL from an application:

#include <windows.h>
#include "dsptool.h"

static HINSTANCE hDll;

// dspToolFnType1 through dspToolFnType5 are defined in dsptool.h
dspToolFnType1 getBytesForAdpcmBuffer;
dspToolFnType1 getBytesForAdpcmSamples;
dspToolFnType1 getBytesForPcmBuffer;
dspToolFnType1 getBytesForPcmSamples;
dspToolFnType1 getSampleForAdpcmNibble;
dspToolFnType1 getNibblesForNSamples;
dspToolFnType2 getBytesForAdpcmInfo;
dspToolFnType3 encode;
dspToolFnType4 decode;
dspToolFnType5 getLoopContext;

/*--------------------------------------------------------------------------*/
void clean_up(void)
{
    if (hDll)
        FreeLibrary(hDll);
}

/*--------------------------------------------------------------------------*/
int getDll(void)
{
    hDll = LoadLibrary("dsptool.dll");

    if (hDll)
    {
        if (!(getBytesForAdpcmBuffer =
       	      (dspToolFnType1)GetProcAddress(
       		      hDll,
       		      "getBytesForAdpcmBuffer"
       		      ))) return 1;
   
        if (!(getBytesForAdpcmSamples =
       	 (dspToolFnType1)GetProcAddress(
       		      hDll,
       		      "getBytesForAdpcmSamples"
       		      ))) return 1;

        if (!(getBytesForPcmBuffer =
       	 (dspToolFnType1)GetProcAddress(
       		      hDll,
       		      "getBytesForPcmBuffer"
       		      ))) return 1;

        if (!(getBytesForPcmSamples =
       	 (dspToolFnType1)GetProcAddress(
       		      hDll,
       		      "getBytesForPcmSamples"
       		      ))) return 1;
   
        if (!(getNibblesForNSamples =
       	 (dspToolFnType1)GetProcAddress(
       		      hDll,
       		      "getNibbleAddress"
       		      ))) return 1;

        if (!(getSampleForAdpcmNibble =
       	 (dspToolFnType1)GetProcAddress(
       		      hDll,
       		      "getSampleForAdpcmNibble"
       		      ))) return 1;

        if (!(getBytesForAdpcmInfo =
       	 (dspToolFnType2)GetProcAddress(
       		      hDll,
       		      "getBytesForAdpcmInfo"
       		      ))) return 1;

        if (!(encode =
       	 (dspToolFnType3)GetProcAddress(
       		      hDll,
       		      "encode"
       		      ))) return 1;

        if (!(decode =
       	 (dspToolFnType4)GetProcAddress(
       		      hDll,
       		      "decode"
       		      ))) return 1;

        if (!(getLoopContext =
       	 (dspToolFnType5)GetProcAddress(
       		      hDll,
       		      "getLoopContext"
       		      ))) return 1;

        return(0);
    }

    return(1);
}

/*--------------------------------------------------------------------------*/
int main(void)
{
        // getDll() returns nonzero if the DLL or any export cannot be loaded
        if (getDll())
        {
                clean_up();
                return 1;
        }

        // do stuff here

        clean_up();
        return 0;
}

The DSPTOOL library provides services for encoding and decoding sound data, as well as functions for calculating loop contexts, determining ADPCM addresses for loop points, and counting the number of bytes in ADPCM encoded data.

Basic data flow of the DSPTOOL DLL

Notes on Using the DSPTOOL Library and ADPCM Samples in General

The sndconv Program

The sndconv.exe tool is a WIN32 command-line application. It uses the SOUNDFILE and DSPTOOL libraries to import and compress sound data from standard WAV and AIFF files.

The tool provides a scripting interface so that users can specify a multitude of sound files to import. The program packs these sound files together into a form that an application can manipulate during runtime.

The tool generates three output files:

Source Code

The source code to build the executable is not provided with the SDK. Four prebuilt executables are supplied; their names, configurations, and locations are listed in the table below:

File            Configuration  Location
sndconv.exe     Release        $CAFE_ROOT/system/bin/win32/
sndconvD.exe    Debug          $CAFE_ROOT/system/bin/win32/
sndconv64.exe   Release        $CAFE_ROOT/system/bin/win64/
sndconv64D.exe  Debug          $CAFE_ROOT/system/bin/win64/

Program Flow and Implementation Notes

In general, the tool operates like this:

  1. Upon execution, the tool parses any command line arguments and searches for the specified script file.
  2. The tool maintains a pair of data structures (SNDCONVDATA and ADPCMINFO) that describe the sound file it is currently processing. The tool clears the data structures upon encountering a BEGIN command.
  3. The tool updates the various parameters of SNDCONVDATA as it reads attributes within the BEGIN-END clause.
  4. Upon encountering an END command, the tool executes the specified conversion operations (if any) and writes the converted sound data to the SPD file. The contents of SNDCONVDATA are added to a table, which will be used to generate the SPT file when script processing is complete.
  5. If a sound file is converted to ADPCM, then the file’s associated ADPCMINFO data are stored in another table. This table will be appended to the SPT file when script processing is complete.
  6. The tool continues to process the script, repeating steps 2-5 until the end of the script is reached.
  7. Upon reaching the end of the script, the accumulated entries for SNDCONVDATA and ADPCMINFO are written to the SPT file.
  8. The tool generates the C header file based on the accumulated SNDCONVDATA information.

Other Notes

Data Abstraction and File Formats

Each sound effect processed by the sndconv script is described by an entry in the SPT table. Each entry is a data structure of type SNDCONVDATA, defined below:

typedef struct
{
    u32 type;

#define SP_TYPE_ADPCM_ONESHOT   0
#define SP_TYPE_ADPCM_LOOPED    1
#define SP_TYPE_PCM16_ONESHOT   2
#define SP_TYPE_PCM16_LOOPED    3
#define SP_TYPE_PCM8_ONESHOT    4
#define SP_TYPE_PCM8_LOOPED     5

    u32 sampleRate;
    u32 loopAddr;
    u32 loopEndAddr;
    u32 endAddr;
    u32 currentAddr;
    u32 adpcm;

} SNDCONVDATA;
Parameter Description
type Specifies the sample encoding (ADPCM, 16-bit PCM, or 8-bit PCM) and whether the sound effect is looped, using one of the SP_TYPE_* values above.
sampleRate Specifies the base sampling frequency, in Hz.
loopAddr Specifies the address at which the loop, if any, begins.
loopEndAddr Specifies the address at which the loop, if any, ends.
endAddr Specifies the address of the last sample in the sound effect.
currentAddr Specifies the address of the first sample in the sound effect.
adpcm An index into the ADPCMINFO table that is appended to the SPT file. If the sound effect is an ADPCM-encoded sample, the index points to the relevant entry in the ADPCMINFO table.
NOTE:
The addresses in this data structure are offsets into the SPD file. The addressing mode of each address will vary depending on the sample type. For more information, see "Sample addressing".
typedef struct
{
    // initial state
    u16 coef[16];
    u16 gain;
    u16 pred_scale;
    u16 yn1;
    u16 yn2;

    // loop context
    u16 loop_pred_scale;
    u16 loop_yn1;
    u16 loop_yn2;
} ADPCMINFO; 

Each ADPCMINFO entry describes the decoding parameters for a given ADPCM-encoded sound effect. The table of ADPCMINFO data is appended to the end of the SPT file.

The format of the SPT file is illustrated below.

Format and internal references of SPT files

The SPT file is prefaced by an integer (32-bit, unsigned, big-endian), which indicates the number of SNDCONVDATA entries that are present in the file.

Immediately following the SNDCONVDATA table is another table, which captures the ADPCMINFO data for all sound effects that have been ADPCM encoded.

NOTE:
The number of ADPCMINFO entries is unspecified; however, the remainder of the file contains only the ADPCMINFO table.

In the illustration above, sound effect entries 1 and 3 are ADPCM encoded, and therefore point to the first and second ADPCMINFO entries, respectively.

Each entry in the SNDCONVDATA table corresponds to a sound effect packed into the associated SPD file. A macro definition for each sound effect is placed in the C header file generated by sndconv. The mapping between the sndconv output files is shown below:

Mapping between header file, SPT file, and SPD file

Some notes on the data format of the output files:

Sample Addressing, Alignment, and Loop Point Specification

Perhaps the most vexing aspects of managing sound effects are:

  1. Sample addressing

    The DSP contains special hardware to automatically read and, if necessary, decode samples from main memory. The addressing mode of this hardware as it reads from main memory depends on the type of sample being read.

    Sample Encoding Scheme   Addressing Mode
    8-bit PCM                Byte
    16-bit PCM               Word (16-bit)
    ADPCM                    Nibble (4-bit)

    Associated with each sound effect are several addresses that describe its beginning, end, and loop points:

    Address Description
    loopAddr Address of first sample in loop. If the sound effect is not looped, this is zero.
    loopEndAddr Address of last sample in loop. If the sound effect is not looped, this is zero.
    currentAddr Address of first sample of sound effect.
    endAddr Address of last sample of sound effect.

    Each address will access either a byte, word, or nibble, depending on the encoding scheme of the samples.

    When generating the SPT table entry for a sound effect, sndconv will reconcile the sample type and addressing mode and write the appropriate values into the address parameters.

    NOTE:
    These addresses are actually offsets into the SPD file. This is because the sndconv tool cannot predict where in memory the sound effects data will be placed. The Sound Pipeline runtime library (SP) must therefore update each address value by adding the absolute base address at which the SPD data are loaded. The runtime library will reconcile the absolute base address with the addressing mode of each sound effect.
  2. Sample alignment in memory

    Another hazard to consider is the alignment of sound effects in main memory. The following requirements exist:

    Sample Encoding Scheme   Required Main Memory Alignment
    8-bit PCM                Sound effect must start on an 8-bit boundary.
    16-bit PCM               Sound effect must start on a 16-bit boundary.
    ADPCM                    Sound effect must start on a 64-bit boundary.
    NOTE:
    Data must be transferred into main memory at 32-byte boundaries. Based on this restriction, sndconv automatically pads sound effects to preserve the required alignments.
  3. Loop point specification

    Yet another hazard to consider is the specification of loop points.

    Loop markers/points are most commonly specified in samples. That is, if a sound effect has a loop-start marker at n, then the nth sample (counting from zero) is the first sample played within the loop. The loop-end marker is indicated similarly: if the end marker is at m, then the mth sample (counting from zero) is the last sample played within the loop. This is the convention used by many commercial sound editing applications.

    Note, however, that AIFF files encode loops differently. Loop start markers are stored as the first sample in a loop, counting from zero. But the end marker is the sample AFTER the last sample played in the loop.

    The DSP also accesses loop points differently: its decoding hardware expects loop points to be specified as addresses.

    The AX Sound Pipeline reconciles these differences when importing sound effects. Note the calculations carefully when modifying SP or creating your own tool path.

    Incorrect loop points may cause discontinuity artifacts in the sound output (such as ‘clicks’ or ‘pops’). If the loop marker of an ADPCM sound effect points to a frame header, then the DSP may behave unpredictably (looping sound effects may never end, or sound effects may never start).

Revision History

2013/05/08 Automated cleanup pass.
2011/02/21 Initial version.

