Synthesizer Overview

The AX synthesizer application (SYN) accepts MIDI input and applies it to a wavetable to produce music. This application relies on the AX and MIX APIs.

Functional relationship of SYN to MIX and AX

For detailed API descriptions, refer to "Synthesizer" under "AX Applications" in the Audio section of the Cafe Reference Manual (HTML).


A wavetable is a lookup table used by the synthesizer to determine playback parameters for MIDI events. Specifically, it contains tables for instrument programs, regions, articulators, and samples. The following sections cover generic wavetable concepts; how they apply to this specific synthesizer is covered in "Synthesizer application notes".

Wavetable concepts

Instrument programs

Instrument programs specify which wavetable objects correspond to MIDI program numbers. Typically, each of the 16 MIDI channels is programmed by MIDI events to access a particular instrument. Conceptually, there are two types of instrument programs: melodic and percussive.

Instrument programs

Melodic instruments

Melodic instruments contain n key regions to handle incoming MIDI key-on events for a specific key. Typically, the regions are used to match PCM samples to the intended key pitch for a particular instrument. Each region contains descriptors for the lowest and highest key it covers, and it is up to the synthesizer to locate the correct region for the key. Each region also contains pointers to an articulation and a sample.

Melodic key regions
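
The region search described above can be sketched in C as follows. This is only an illustration of the concept: ExampleKeyRegion, ExampleFindRegion, and all member names are hypothetical and are not taken from wt.h, and the integer members merely stand in for the articulation and sample pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical conceptual region; names are illustrative, not from wt.h. */
typedef struct ExampleKeyRegion
{
    unsigned char lowKey;       /* lowest MIDI key covered by this region  */
    unsigned char highKey;      /* highest MIDI key covered by this region */
    int           articulation; /* stand-in for the articulation pointer   */
    int           sample;       /* stand-in for the sample pointer         */
} ExampleKeyRegion;

/* Return the first region whose [lowKey, highKey] range contains key,
   or NULL when no region covers that key. */
static const ExampleKeyRegion *ExampleFindRegion(const ExampleKeyRegion *regions,
                                                 size_t count,
                                                 unsigned char key)
{
    size_t i;

    assert(key < 128);
    for (i = 0; i < count; i++)
    {
        if (key >= regions[i].lowKey && key <= regions[i].highKey)
            return &regions[i];
    }
    return NULL;
}
```

A linear scan is enough for a sketch because an instrument normally has only a handful of regions.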

Percussive instruments

Unlike melodic instruments, percussive instruments contain a region for each key, and typically play a unique sample for each key. Each region also contains pointers to an articulation and a sample. In addition to the data found in melodic instrument regions, percussive instrument regions specify a "key group." A key group identifies a set of keys of which only one may sound at any given time. For example, only one sample among the closed, open, and pedal hi-hat keys should be heard at any given time; these keys belong to the same key group.

Percussive key regions
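
A minimal sketch of key group handling, assuming a small fixed voice pool, is shown below. ExampleVoice, EXAMPLE_MAX_VOICES, and ExampleStartPercussion are hypothetical names used for illustration only; the actual voice management inside SYN is not described in this document.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MAX_VOICES 8  /* hypothetical pool size */

typedef struct ExampleVoice
{
    int           active;   /* nonzero while the voice is sounding */
    unsigned char key;      /* MIDI key that started the voice     */
    unsigned char keyGroup; /* 0 = not a member of any key group   */
} ExampleVoice;

/* Start a percussive note. Any sounding voice in the same nonzero key
   group is silenced first, then the first free voice is claimed.
   Returns the voice index, or -1 when the pool is full. */
static int ExampleStartPercussion(ExampleVoice *voices,
                                  unsigned char key,
                                  unsigned char keyGroup)
{
    int i;
    int slot = -1;

    assert(voices != NULL && key < 128);
    for (i = 0; i < EXAMPLE_MAX_VOICES; i++)
    {
        if (voices[i].active && keyGroup != 0 && voices[i].keyGroup == keyGroup)
            voices[i].active = 0; /* cut the previous member of the group */
        if (!voices[i].active && slot < 0)
            slot = i;
    }
    if (slot >= 0)
    {
        voices[slot].active   = 1;
        voices[slot].key      = key;
        voices[slot].keyGroup = keyGroup;
    }
    return slot;
}
```

With this sketch, starting a closed hi-hat while an open hi-hat from the same group is sounding reuses the open hi-hat's voice, whereas a key with key group 0 never cuts another voice.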


A region specifies the method of playback for a given key of a given instrument. Specifically, these methods are:

Region for key number


Articulators define the playback characteristics for a note. Typically, an articulator describes the following items:

Articulation for region


Sample data is used to store information about the particular sound sample. Types of sample data include:

Samples for region

File format

This section covers the specific file format for the AX synthesizer application. Each instance of the synthesizer uses a WT file and a PCM file. These files are created by a program that accepts DLS 1.0 files.

File creation

Wavetable data is created for the synthesizer using the program DLS1WT.exe. DLS1WT.exe accepts DLS 1.0 files as input. These files may be created by third-party DLS editing and preview tools. DLS1WT.exe outputs a WT file that contains wavetable data and a PCM file that contains samples.

WT file

The WT file contains optimized wavetable lookup data to be used by the synthesizer application at runtime. The format of this file is as described below. C definitions for these data objects can be found in wt.h.

File header

typedef struct WTFILEHEADER
{
    u32 offsetPercussiveInst;
    u32 offsetMelodicInst;
    u32 offsetRegions;
    u32 offsetArticulations;
    u32 offsetSamples;
    u32 offsetLoopContext;

    // data ...

} WTFILEHEADER;

The file header contains file offsets to data structures, detailed in the table below:

Offset Description
u32 offsetPercussiveInst Byte offset to an array of 128 WTINST data structures. These instruments are intended for percussive programs on MIDI channel 10.
u32 offsetMelodicInst Byte offset to an array of 128 WTINST structures. These instruments are intended for melodic programs on all MIDI channels except channel 10.
u32 offsetRegions Byte offset to first WTREGION structure in an array of unknown size. These are regions indexed by the keyRegion members found in WTINST.
u32 offsetArticulations Byte offset to first WTART structure in an array of unknown size. These are articulators indexed by WTREGION.
u32 offsetSamples Byte offset to first WTSAMPLE structure in an array of unknown size. These are sample data indexed by WTREGION.
u32 offsetLoopContext Intended to specify byte offset to loop context data for ADPCM-compressed samples. (Not used at this time.)
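
As a rough sketch, the offsets in the header can be turned into pointers once the WT file has been loaded into memory. The code below assumes that the offsets are relative to the start of the file and that the loaded image already uses the CPU's byte order; neither assumption is stated in this document. ExampleResolveOffset is a hypothetical helper, and the u32 typedef is a stand-in for the SDK type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32; /* stand-in for the SDK type */

/* Same member layout as the file header shown above. */
typedef struct WTFILEHEADER
{
    u32 offsetPercussiveInst;
    u32 offsetMelodicInst;
    u32 offsetRegions;
    u32 offsetArticulations;
    u32 offsetSamples;
    u32 offsetLoopContext;
} WTFILEHEADER;

/* Hypothetical helper: return a pointer 'offset' bytes past the start of
   the loaded WT image, or NULL when the offset lies outside the image. */
static const void *ExampleResolveOffset(const void *wtImage,
                                        size_t imageSize,
                                        u32 offset)
{
    assert(wtImage != NULL);
    if (offset >= imageSize)
        return NULL;
    return (const unsigned char *)wtImage + offset;
}
```

For example, ExampleResolveOffset(image, size, header->offsetRegions) would yield the address of the first WTREGION under these assumptions. The bounds check guards against a truncated or corrupt file.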

Melodic and percussive instruments

typedef struct WTINST
{
    u16 keyRegion[128];
} WTINST;

The WT file contains an array of 128 WTINST structures for melodic instruments and another for percussive instruments; each entry stores the key region indices for the instrument of the corresponding MIDI program number. u16 keyRegion[128] holds the region index for each of the 128 keys of the given instrument. This file format supports up to 65,534 region descriptors. 0xFFFF specifies that there is no region for that key.
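
The lookup this implies can be sketched as follows. ExampleRegionIndex and EXAMPLE_NO_REGION are hypothetical names, and the u16 typedef is a stand-in for the SDK type; only the keyRegion member and the 0xFFFF sentinel come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t u16; /* stand-in for the SDK type */

typedef struct WTINST
{
    u16 keyRegion[128];
} WTINST;

#define EXAMPLE_NO_REGION 0xFFFFu /* sentinel described in the text */

/* Return the region index for (program, key), or -1 when the instrument
   has no region for that key. 'instruments' points to one of the two
   128-entry WTINST arrays (melodic or percussive). */
static long ExampleRegionIndex(const WTINST *instruments,
                               unsigned program,
                               unsigned key)
{
    u16 index;

    assert(instruments != NULL && program < 128 && key < 128);
    index = instruments[program].keyRegion[key];
    return (index == EXAMPLE_NO_REGION) ? -1L : (long)index;
}
```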


typedef struct WTREGION
{
    u8  unityNote;
    u8  keyGroup;
    s16 fineTune;
    s32 attn;
    u32 loopStart;
    u32 loopLength;
    u32 articulationIndex;  // articulation index to reference
    u32 sampleIndex;        // sample index to reference
} WTREGION;

The WT file contains an array of WTREGION structures of unknown size. Each region is referenced by one or more keys of an instrument. The table below describes the members of a region:

Region Description
u8 unityNote The key number that yields no change in pitch from original sample frequency.
u8 keyGroup Key group for single-instance playback of this key. Range is 1 to 15; 0 means none.
s16 fineTune Fine tuning for this note in cents, 1 = 1 cent.
s32 attn Base attenuation for this note, 0x00010000 = 0.1 dB.
u32 loopStart Sample count for loop start.
u32 loopLength Total number of samples for loop, including loop points.
u32 articulationIndex Index to articulation for this region in WT file.
u32 sampleIndex Index to sample data for this region in WT file.
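
The fixed-point units in this table can be converted as in the sketch below. The attenuation conversion follows directly from "0x00010000 = 0.1 dB". The pitch calculation is an assumption: it uses the conventional 100 cents per semitone relative to unityNote and adds fineTune, which this document does not spell out. The function names are hypothetical, and the typedefs are stand-ins for the SDK types.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;  /* stand-ins for the SDK types */
typedef int16_t s16;
typedef int32_t s32;

/* Convert a region attenuation to decibels (0x00010000 = 0.1 dB). */
static double ExampleAttnToDecibels(s32 attn)
{
    return ((double)attn / 65536.0) * 0.1;
}

/* Pitch offset, in cents, to apply to the sample when 'key' is played.
   Assumes 100 cents per semitone; fineTune is already in cents. */
static long ExamplePitchOffsetCents(u8 key, u8 unityNote, s16 fineTune)
{
    assert(key < 128 && unityNote < 128);
    return ((long)key - (long)unityNote) * 100L + (long)fineTune;
}
```

Under these assumptions, playing the unity note with a fineTune of 0 gives an offset of 0 cents, that is, the sample plays at its original frequency.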


typedef struct WTART
{
    // LFO
    s32 lfoFreq;
    s32 lfoDelay;
    s32 lfoAtten;
    s32 lfoPitch;
    s32 lfoMod2Atten;
    s32 lfoMod2Pitch;

    // EG1
    s32 eg1Attack;
    s32 eg1Decay;
    s32 eg1Sustain;
    s32 eg1Release;
    s32 eg1Vel2Attack;
    s32 eg1Key2Decay;

    // EG2
    s32 eg2Attack;
    s32 eg2Decay;
    s32 eg2Sustain;
    s32 eg2Release;
    s32 eg2Vel2Attack;
    s32 eg2Key2Decay;
    s32 eg2Pitch;

    // pan
    s32 pan;
} WTART;

The WT file contains an array of WTART structures of unknown size. Each articulator is referenced by one or more region(s) of an instrument. Each articulator specifies the following:

Articulator Description
s32 lfoFreq LFO frequency, expressed in delta per 5-millisecond audio frame for a 64-step sine wave. 0x00010000 = 1.
s32 lfoDelay LFO start delay, expressed as the number of 5-millisecond audio frames prior to starting the LFO.
s32 lfoAtten LFO attenuation, at max volume (1.0). 0x00010000 = 0.1 dB.
s32 lfoPitch LFO pitch, in cents for LFO at max pitch (1.0). 0x00010000 = 1 cent.
s32 lfoMod2Atten LFO attenuation to apply to modulation wheel. AttenuationdB = lfoMod2Atten * (modulation wheel / 128). 0x00010000 = 0.1 dB.
s32 lfoMod2Pitch LFO pitch to apply to modulation wheel. PitchCents = lfoMod2Pitch * (modulation wheel / 128). 0x00010000 = 1 cent.
s32 eg1Attack Scale for ADSR volume envelope attack time. TimeSec = pow(2, scale / (1200 * 0x00010000)).
s32 eg1Decay Scale for ADSR volume envelope decay time.
s32 eg1Sustain Sustain volume for ADSR volume envelope, in decibels. 0x00010000 = 0.1 dB.
s32 eg1Release Decibel change per 5-millisecond audio frame for ADSR volume envelope release stage. 0x00010000 = 0.1 dB.
s32 eg1Vel2Attack Scale to add to ADSR volume envelope attack time. Scale = eg1Vel2Attack * (key velocity / 128).
s32 eg1Key2Decay Scale to add to ADSR volume envelope decay time. Scale = eg1Key2Decay * (key number / 128).
s32 eg2Attack Same as eg1Attack for ADSR pitch envelope.
s32 eg2Decay Same as eg1Decay for ADSR pitch envelope.
s32 eg2Sustain Pitch to sustain for ADSR pitch envelope. 0x00010000 = 1 cent.
s32 eg2Release Cent change per 5-millisecond audio frame for ADSR pitch envelope. 0x00010000 = 1 cent.
s32 eg2Vel2Attack Same as eg1Vel2Attack for ADSR pitch envelope.
s32 eg2Key2Decay Same as eg1Key2Decay for ADSR pitch envelope.
s32 eg2Pitch Relative pitch at max for ADSR pitch envelope. 0x00010000 = 1 cent.
s32 pan Absolute panning for percussive instruments. 0 = left, 64 = center, 127 = right.
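
The modulation wheel formulas in the table can be sketched with 16.16 fixed-point conversion as shown below. The function names are hypothetical, and the sketch assumes that the wheel value is the 7-bit MIDI controller value (0 to 127); it is not the synthesizer's actual code.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t s32; /* stand-in for the SDK type */

/* Convert a 16.16 fixed-point value to floating point (0x00010000 = 1). */
static double ExampleFixedToDouble(s32 value)
{
    return (double)value / 65536.0;
}

/* LFO attenuation depth contributed by the modulation wheel, in dB.
   lfoMod2Atten uses 0x00010000 = 0.1 dB. */
static double ExampleModWheelAttenDb(s32 lfoMod2Atten, unsigned wheel)
{
    assert(wheel < 128);
    return ExampleFixedToDouble(lfoMod2Atten) * 0.1 * ((double)wheel / 128.0);
}

/* LFO pitch depth contributed by the modulation wheel, in cents.
   lfoMod2Pitch uses 0x00010000 = 1 cent. */
static double ExampleModWheelPitchCents(s32 lfoMod2Pitch, unsigned wheel)
{
    assert(wheel < 128);
    return ExampleFixedToDouble(lfoMod2Pitch) * ((double)wheel / 128.0);
}
```

Because the divisor is 128 and the wheel never exceeds 127, the full depth stored in the articulator is approached but never quite reached.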


typedef struct WTSAMPLE
{
    u16 format;     // ADPCM, PCM16, PCM8
    u16 sampleRate; // Hz
    u32 offset;     // offset in samples from beginning of PCM file
    u32 length;     // length of sample in samples
    u16 adpcmIndex; // ADPCM index to reference if in ADPCM mode
} WTSAMPLE;

The WT file contains an array of WTSAMPLE structures of unknown size. Each set of sample data is referenced by one or more instrument region(s). The table below describes the data members:

Sample Description
u16 format Format for samples. Supported formats are ADPCM (WT_FORMAT_ADPCM), 16-bit PCM (WT_FORMAT_PCM16), and 8-bit PCM (WT_FORMAT_PCM8).
u16 sampleRate Base sample rate for specified sample, in hertz.
u32 offset Offset to sample, in samples, from beginning of PCM file.
u32 length Length of sample, in samples.
u16 adpcmIndex Index of ADPCM decoding information array.
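
Because offset and length are counted in samples, a byte position in the PCM file depends on the sample format. The sketch below covers only the two PCM formats, where one sample is 2 bytes (16-bit) or 1 byte (8-bit). ADPCM is deliberately left out because its packing is not described here. The ExampleFormat enumeration is hypothetical and does not reproduce the values of the WT_FORMAT_* constants in wt.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32; /* stand-in for the SDK type */

/* Hypothetical format tags; the real WT_FORMAT_* values are in wt.h. */
typedef enum ExampleFormat
{
    EXAMPLE_FORMAT_ADPCM,
    EXAMPLE_FORMAT_PCM16,
    EXAMPLE_FORMAT_PCM8
} ExampleFormat;

/* Convert a sample-based offset and length to bytes.
   Returns 1 on success, 0 for formats this sketch does not handle. */
static int ExampleSampleByteRange(ExampleFormat format,
                                  u32 offset, u32 length,
                                  u32 *byteOffset, u32 *byteLength)
{
    u32 bytesPerSample;

    assert(byteOffset != NULL && byteLength != NULL);
    switch (format)
    {
    case EXAMPLE_FORMAT_PCM16: bytesPerSample = 2; break;
    case EXAMPLE_FORMAT_PCM8:  bytesPerSample = 1; break;
    default:                   return 0; /* ADPCM: not covered here */
    }
    *byteOffset = offset * bytesPerSample;
    *byteLength = length * bytesPerSample;
    return 1;
}
```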

PCM file

The PCM file contains all the samples referenced by the WT file. All sample offsets, formats, loop points, lengths, and sample rates are stored in the WT file, which the user application loads into memory. The synthesizer instance is then initialized with the memory offset, in bytes.

Synthesizer application notes

MIDI bank support

The synthesizer application ignores bank MSB and LSB for program changes; both are always assumed to be 0. Instruments on nonzero MSB and LSB banks are not supported by either DLS1WT.exe or SYN.

MIDI event support

The synthesizer application (SYN) supports MIDI note off, note on, program change, pitch wheel, and some controller events. SYN does not support polyphonic after-touch, channel after-touch, or system real-time events.

MIDI controller support

SYN stores all MIDI controller settings received through SYNMidiInput. The following controller settings are supported:

Controller Setting Purpose
0x01 Modulation wheel
0x06 & 0x26 Data entry (for pitch wheel range setting)
0x07 Channel volume
0x0A Pan
0x0B Expression
0x40 Hold pedal
0x5B Effects 1 depth (reverb)
0x5C Effects 2 depth (chorus)
0x64 & 0x65 Registered parameter number (for pitch wheel range)
0x78 All sound off
0x7B – 0x7F For all notes off only
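
The pitch wheel range setting relies on the standard MIDI registered parameter mechanism: controllers 0x64 (LSB) and 0x65 (MSB) select a registered parameter, and registered parameter 0 is the pitch bend sensitivity, which data entry 0x06 sets in semitones and 0x26 sets in cents. The sketch below shows that sequence in isolation. ExampleChannelState and ExampleHandleController are hypothetical names, and the code is not taken from the SYN implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-channel state for this sketch. */
typedef struct ExampleChannelState
{
    unsigned char rpnMsb;        /* last value of controller 0x65 */
    unsigned char rpnLsb;        /* last value of controller 0x64 */
    unsigned char bendSemitones; /* pitch wheel range, semitones  */
    unsigned char bendCents;     /* pitch wheel range, cents      */
} ExampleChannelState;

/* Apply one controller event. Data entry changes the pitch wheel range
   only while registered parameter 0 (MSB 0, LSB 0) is selected. */
static void ExampleHandleController(ExampleChannelState *ch,
                                    unsigned char controller,
                                    unsigned char value)
{
    int pitchRangeSelected;

    assert(ch != NULL && controller < 128 && value < 128);
    pitchRangeSelected = (ch->rpnMsb == 0 && ch->rpnLsb == 0);

    switch (controller)
    {
    case 0x65: ch->rpnMsb = value; break;
    case 0x64: ch->rpnLsb = value; break;
    case 0x06: if (pitchRangeSelected) ch->bendSemitones = value; break;
    case 0x26: if (pitchRangeSelected) ch->bendCents = value;     break;
    default:   break; /* other controllers are outside this sketch */
    }
}
```

A sender would typically transmit 0x65 = 0, 0x64 = 0, then 0x06 = 12 to request a range of 12 semitones. Initializing rpnMsb and rpnLsb to 0x7F (the null parameter) keeps stray data entry events from changing the range.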

Calling SYNMidiInput

User applications must call SYNMidiInput from within their AX User Frame Callback, which is also where they should call SYNRunAudioFrame. MIDI input events are buffered by the synthesizer and execute during SYNRunAudioFrame, ensuring that the ADSR envelope runs in correct sequence after a voice starts for a note.

Shutting down a synthesizer

Prior to calling SYNQuitSynth, the user application should call SYNGetActiveNotes to ensure that no notes are still active. In general, voices need additional time to complete the ADSR release phase after the MIDI note off event; during this time the note is still active.

Revision History

2013/05/08 Automated cleanup pass.
2011/02/21 Initial version.