Goals of SDIF

SDIF was designed with multiple goals in mind. These goals have sometimes
been at odds with each other, which has forced us to be clear about what
matters most in the SDIF effort.

Interchange Format


  • Multi-platform
  • Standard data types: IEEE floats, two's complement integers, ASCII,
    ...
  • Open, publicly available specification
  • Unambiguous semantics
  • Freely distributable C and C++ library and API (forthcoming)
  • Suite of freely distributable utilities

Comprehensive Support for Extant Sound Descriptions

The SDIF standard specifies ways to encode most of the sound descriptions
commonly in use today: time domain, STFT, spectral peaks, sinusoidal tracks,
fundamental frequency estimates, breakpoint functions, etc.

For standard sound descriptions, SDIF provides standard ways to encode
them. We want to avoid the situation where two separate programs that both
produce the same kind of data generate incompatible SDIF files.

This goal is balanced with the goal of flexibility, allowing SDIF files
to contain more experimental (or even proprietary) sound descriptions.

Reduce duplication of effort for everybody to support everybody else's
extant data formats

A large number of analysis/resynthesis packages are available today,
but each uses its own incompatible format for the data it produces.
For one group's synthesizer to read files from another group's analysis
package, someone must write code to parse and interpret those files, and
this effort must be repeated for each supported format.

We envision a situation where each group creates translation programs
to go from their own formats into SDIF and vice versa.

Encourage and facilitate the development of new tools for manipulating
sound in spectral and other domains

Promote the use of interesting sound descriptions in general.

Simplicity

One of the design principles of SDIF is "keep simple things simple".
SDIF should be straightforward and easy to understand, encouraging lots
of programmers to write useful and interesting programs that deal with SDIF
files. We want the specification of SDIF to be reasonably small and code
that manipulates SDIF to be easy to write and understand.

SDIF builds on existing standards and conventions, such as IEEE floats
and "frames" that are like IFF chunks.

Flexibility

SDIF is open-ended, allowing new sound descriptions to be encoded in
custom frame types. The frame format is general enough to represent most
kinds of sound descriptions, so we do not foresee problems representing
new kinds of data in SDIF.

The collection of standard frame and matrix
types is extensible, so as new sound descriptions become more established
and common, frame types to represent them can be added to the SDIF
standard.

Aggregating Different Kinds of Data/Archive Format

SDIF allows for multiple descriptions of the same sound or of related
sounds to exist together in a single file or stream. This is meant to alleviate
the situation where running an analysis package (and re-running it with
different parameters) generates an explosion of files tied together only
by naming conventions.

However, to facilitate streaming (and for other reasons), all these data
must be interleaved in time in the resulting SDIF file or stream, so normal
file archive operations like adding or extracting an individual set of data
are not especially efficient.
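
The trade-off above can be sketched in Python. This is only an illustration
under assumed names: each frame is modeled as a hypothetical
(time, stream_id, data) tuple, and the function names are not part of any
SDIF library. Merging time-sorted descriptions into one stream is cheap, but
pulling a single description back out means scanning every frame.

```python
import heapq


def interleave(*streams):
    """Merge time-sorted streams of (time, stream_id, data) frames
    into a single sequence ordered by time."""
    return heapq.merge(*streams, key=lambda frame: frame[0])


def extract(frames, stream_id):
    """Recover one description; note that this visits every frame."""
    return [frame for frame in frames if frame[1] == stream_id]


# Two hypothetical descriptions of the same sound.
tracks = [(0.00, 1, "partials A"), (0.02, 1, "partials B")]
pitch = [(0.00, 2, "f0 x"), (0.01, 2, "f0 y"), (0.03, 2, "f0 z")]

merged = list(interleave(tracks, pitch))
only_pitch = extract(merged, 2)
```

A reader that consumes `merged` from start to finish always has the data for
the current time from every description, which is what streaming needs.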

Programs Can Ignore What They Don't Understand

For example, if a program that understands sinusoidal tracks processes
an SDIF file that contains sinusoidal track information plus some other
information that it doesn't understand, it should still be able to extract
all of the information about sinusoidal tracks to do its processing. Also,
it should be able to leave the rest of the information in the file completely
unchanged.
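
A minimal sketch of this behavior in Python follows. The frame layout used
here (a 4-byte type tag followed by a 4-byte big-endian payload size, in the
spirit of IFF chunks) is a simplifying assumption for illustration and is not
the layout defined by the SDIF specification. The tags `TRCK` and `XUNK` and
all function names are hypothetical.

```python
import io
import struct

# Assumed header layout: 4-byte type tag, 4-byte big-endian payload size.
HEADER = struct.Struct(">4sI")


def write_frame(stream, tag, payload):
    """Write one frame: the header followed by the raw payload bytes."""
    stream.write(HEADER.pack(tag, len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield (tag, payload) for each frame until the stream ends."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return
        if len(header) < HEADER.size:
            raise ValueError("truncated frame header")
        tag, size = HEADER.unpack(header)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated frame payload")
        yield tag, payload


def process(src, dst, known_tag, handler):
    """Copy src to dst. Frames with known_tag go through handler;
    all other frames are copied byte for byte without being parsed."""
    for tag, payload in read_frames(src):
        if tag == known_tag:
            payload = handler(payload)
        write_frame(dst, tag, payload)


# Example: one frame the program understands and one it does not.
source = io.BytesIO()
write_frame(source, b"TRCK", b"\x01\x02")
write_frame(source, b"XUNK", b"opaque data")
source.seek(0)

output = io.BytesIO()
process(source, output, b"TRCK", lambda payload: payload[::-1])
```

Because the size field says how long each payload is, the program never needs
to know what is inside a frame it does not recognize.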

Streaming

Internet applications demand sound descriptions as "streams."
There is now a large commercial business streaming compressed time-domain
audio across the Internet; it should be possible to send other sound descriptions
as streams using the SDIF format.

As much as possible, the SDIF protocol is "stateless", meaning
that each section of the file stream can be interpreted by itself, without
needing too much context from earlier sections. This reduces memory requirements
of devices that receive (and render) SDIF streams and lowers the connection
overhead.

Speed of Access of Files


  • 64-bit alignment (facilitates reading, especially with memory-mapped
    files, on machines with 64-bit architectures)
  • Should be easy for a program to skip through the frames of an SDIF
    file.
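
As an illustration of the alignment goal, the Python sketch below pads a
payload so that whatever follows it starts on an 8-byte (64-bit) boundary.
The helper names and the use of zero bytes as padding are assumptions made
for this example; the SDIF specification defines the actual rules.

```python
ALIGNMENT = 8  # bytes, i.e. 64 bits


def padding_needed(size, alignment=ALIGNMENT):
    """Number of bytes needed to round size up to a multiple of alignment."""
    return -size % alignment


def pad(payload, alignment=ALIGNMENT):
    """Return payload followed by zero bytes up to the alignment boundary."""
    return payload + b"\x00" * padding_needed(len(payload), alignment)


padded = pad(b"abcde")  # 5 bytes of data plus 3 bytes of padding
```

With every frame padded this way, a reader that knows a frame's size can jump
directly to the next frame, and that frame's header is always aligned.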

Space Efficiency

We want SDIF files to be as small as reasonably possible. However, the
goal of space efficiency must be balanced with the goals of simplicity and
of being an interchange format, so we avoid tricky encodings, compression
schemes, and optimizations.

CNMAT and IRCAM intend to use SDIF not just as an interchange format, but
as the main working file format for the CAST, Additive, Chant, and other
projects.

Ease of Writing an Editor

Often during the design of SDIF we considered potential features from
the point of view of trying to write a graphical editor for SDIF data. We
want it to be easy to write an editor that is fast and efficient without
having to be unduly complicated.