Database of Challenging Musical Sounds for
Evaluation and Refinement of Pitch Estimators
Adrian Freed, Tristan Jehan
Introduction
Speech researchers have made the most thorough study of the
performance of pitch estimation algorithms. A key to their work
is the evaluation of algorithm performance against standardized
databases of speech that have been "hand" analyzed. Such a
database does not exist for musical signals. As a result, pitch
estimation papers in the computer music community describe
algorithms evaluated using short sound examples often chosen to
show new work in the best light. It is thus impossible to predict
performance of published algorithms in real musical situations,
and difficult for researchers to identify fruitful areas for new
work. We describe a publicly available database of musical sound
files intended to redress these difficulties.
Musical Sound Database
Sounds in this database can be grouped into two important and
hitherto poorly represented categories:
- Complete musical phrases are used to evaluate the impact of
estimation errors in common and realistic musical
contexts.
- Challenging examples are used to identify particular points
of weakness from which an algorithm may suffer. Included are
sounds with: pitch synchronous and additive noise, room
ambiance, cross-talk from adjacent strings, ambiguous octaves,
inharmonicity, missing fundamentals, glissandi, vibrato and
trills.
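To make two of the listed challenges concrete, the following is a minimal sketch of a synthetic tone with a missing fundamental and optional vibrato. It is an illustration only and is not taken from the database, whose sounds are recordings. The function name `missing_fundamental_tone`, its parameters and the choice of harmonics 2 to 5 are assumptions made for this example.

```python
import math

def missing_fundamental_tone(f0, sample_rate, duration,
                             harmonics=(2, 3, 4, 5),
                             vibrato_rate=0.0, vibrato_depth_cents=0.0):
    """Synthesize harmonics of f0 while leaving out the fundamental itself.

    Hypothetical helper: the perceived pitch is normally still f0, but there
    is no spectral energy at f0, which can mislead simple estimators.
    """
    n = round(sample_rate * duration)
    samples = []
    phase = 0.0  # running phase of the (absent) fundamental, in radians
    for i in range(n):
        t = i / sample_rate
        # Sinusoidal vibrato expressed as a deviation in cents around f0.
        cents = vibrato_depth_cents * math.sin(2 * math.pi * vibrato_rate * t)
        f_inst = f0 * 2 ** (cents / 1200)
        phase += 2 * math.pi * f_inst / sample_rate
        # Sum only the requested harmonics; harmonic 1 is deliberately absent.
        value = sum(math.sin(h * phase) for h in harmonics) / len(harmonics)
        samples.append(value)
    return samples

# A 100 Hz tone without vibrato, and one with 6 Hz vibrato of +/- 50 cents.
steady = missing_fundamental_tone(100.0, 8000, 0.1)
wobbly = missing_fundamental_tone(100.0, 8000, 0.1,
                                  vibrato_rate=6.0, vibrato_depth_cents=50.0)
```

Because the reference pitch of such a signal is known by construction, it can serve as a sanity check before an estimator is run on recorded material.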
Access
The database will be available in early 1998 at http://www.cnmat.berkeley.edu/Research/Pitch.
You will be able to submit your own files to this database by
filling in a form at the site. This form represents a contract
that establishes you as the owner of the rights to the submitted
files and grants permission for their analysis and
redistribution.
AIFF is the chosen format for sound files and SDIF for analyses of
these files. The SDIF pitch frame type allows for a weighted set
of pitches, which facilitates virtual pitches for inharmonic sounds
and management of multiple pitch estimates.
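The idea of a weighted set of pitches per frame can be sketched as a small in-memory structure. This is not the SDIF binary layout or any SDIF library API; the class `PitchFrame`, its fields and its methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PitchFrame:
    """Hypothetical pitch frame: a time stamp plus weighted pitch candidates."""
    time: float                            # seconds from the start of the file
    candidates: List[Tuple[float, float]]  # (frequency in Hz, weight >= 0)

    def best(self) -> float:
        """Return the frequency of the highest-weighted candidate."""
        return max(self.candidates, key=lambda c: c[1])[0]

    def normalized(self) -> List[Tuple[float, float]]:
        """Return the candidates with weights rescaled to sum to 1."""
        total = sum(w for _, w in self.candidates)
        if total <= 0:
            raise ValueError("weights must sum to a positive value")
        return [(f, w / total) for f, w in self.candidates]

# An ambiguous-octave frame: 110 Hz is most likely, 220 Hz and 55 Hz less so.
frame = PitchFrame(time=0.5, candidates=[(110.0, 6.0), (220.0, 3.0), (55.0, 1.0)])
```

Keeping every candidate with its weight, rather than a single value, lets an evaluation give partial credit when an estimator picks a plausible octave alternative.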
Database Overview
Wind
- Shakuhachi
- Organ Flue Pipes
- Suling
- Didgeridoo and Stick
- Clarinet
- Bass Clarinet
Singing
- Indian
- Bel Canto
- Western Popular
- Tibetan
String
For these string sounds a wide range of playing techniques
were used, including: open strings, low and high stopped, low and
high frequency vibrato, narrow and wide trill, timbre change, sul
ponticello, glissandi, tremolo near and away from bridge,
pizzicato, pizzicato stopped, slow bow change, harmonics, damped
harmonics, hammer-ons and pull-offs, picked, left and right hand
damping, slaps, bottleneck slide and pops.
Brass
Percussion
Analysis
In parallel with the archival activity assembling this
database, we are exploring automatic segmentation and parameter
estimation tools to develop analyses of the sounds against which
algorithms may be judged. Early results using a wavelet
technique are very promising. The wavelet method identifies
each pitch period and provides a "voiced/unvoiced" estimate.
Combining this with energy-based techniques results in good
estimations for pitched regions of a phrase. The estimator is
robust to impulsive and continuous noise.
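One simple way to combine a voiced/unvoiced decision with an energy measure is sketched below. The wavelet method itself is not specified here, so its per-frame voiced flags are treated as a given input. The RMS measure, the threshold value and the function names are assumptions for illustration, not the authors' implementation.

```python
import math

def frame_rms(samples, frame_size, hop):
    """Return the RMS energy of successive frames of the signal."""
    values = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return values

def pitched_frames(voiced_flags, rms_values, rms_threshold):
    """Mark a frame as pitched only if it is flagged voiced AND loud enough."""
    return [bool(v) and e >= rms_threshold
            for v, e in zip(voiced_flags, rms_values)]

# 100 samples of silence followed by 100 samples of a sine (amplitude 0.5).
signal = [0.0] * 100 + [0.5 * math.sin(2 * math.pi * k / 10) for k in range(100)]
rms = frame_rms(signal, frame_size=50, hop=50)

# Stand-in voiced flags; the second frame is a spurious "voiced" in silence.
voiced = [False, True, True, True]
pitched = pitched_frames(voiced, rms, rms_threshold=0.05)
```

In this toy case the energy gate rejects the spurious voiced decision in the silent frame, while both sine frames are kept.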
Future Work
- A set of artificially synthesized test signals
- Psychoacoustic experimental harness to develop perceptually
solid pitch estimates
- Objective measures of pitch estimate accuracy
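As a sketch of what an objective accuracy measure could look like, the code below computes the signed error in cents between an estimate and a reference pitch, and the fraction of frames whose error exceeds a tolerance. These are commonly used measures, not ones defined by this work; the function names and the 50-cent default tolerance are assumptions.

```python
import math

def cents_error(estimate_hz, reference_hz):
    """Signed pitch error in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(estimate_hz / reference_hz)

def gross_error_rate(estimates, references, tolerance_cents=50.0):
    """Fraction of frames whose absolute error exceeds the tolerance."""
    pairs = list(zip(estimates, references))
    if not pairs:
        raise ValueError("no frames to compare")
    errors = sum(1 for e, r in pairs
                 if abs(cents_error(e, r)) > tolerance_cents)
    return errors / len(pairs)

# Three frames against a 440 Hz reference: exact, octave error, slightly sharp.
estimates = [440.0, 880.0, 442.0]
references = [440.0, 440.0, 440.0]
rate = gross_error_rate(estimates, references)
```

Working in cents rather than hertz makes the error independent of register, so an octave error counts the same on a bass clarinet as on a shakuhachi.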
Acknowledgement
This work assembles materials developed over many years of
work with support from:
- California State Dept. of Commerce
- Zeta Music Inc.
- Gibson Guitar Inc.
- Silicon Graphics Inc.
- Apple Computer Inc.