An Adaptive Conductor Follower
Michael Lee, Guy Garnett, David Wessel
- An Adaptive Conductor Follower
- Conducting
- Conductor Follower and Musical Instrument Control
- Neural Networks for Conductor Following
- Neural Networks for Tempo Tracking
- Figure 1. System Architecture
- Figure 2. Musician Architecture
- Training
- Conclusions
Introduction
Conducting an ensemble of musicians poses difficult
recognition and parameter estimation problems. Musicians must
extract information such as tempo and volume from the conductor's
gestures. The difficulty of this task increases if the musicians
are expected to adapt to markedly different conducting styles in
addition to dealing with variations within a given style. An
artificial musician integrated into a human ensemble requires the
same recognition, estimation, and adaptive capabilities as its
human counterparts.
We have developed an adaptive artificial musician that
addresses these problems using MAX (Puckette 1986), MAXNet (Lee
1991), a Buchla Lightning, and a Mattel Power Glove. Our musician
is able to understand gestures and control tempo, volume, and
other performance parameters. In this report we focus on adaptive
strategies for interpreting the gestures of a conductor, that is,
conductor following. We discuss the control parameters of our
musician and the adaptive methods used to control these
parameters. We also discuss our training method and environment,
and evaluate the effectiveness of our adaptive follower.
Conducting
The conductor is responsible for controlling the shape or
interpretation of a piece. He has two separate contexts for
explaining his interpretation to an ensemble: rehearsal time and
performance time. During rehearsal, the ensemble alters its
performance of a piece until it matches the conductor's
interpretation; the primary goal is working out fine details such
as phrasing, balance, and tempo deformations. There is much
stopping and repetition of short segments to facilitate
learning.
A performance is just another rehearsal in most respects
except one: the conductor and ensemble endeavor to keep going
even if mistakes are made. The interpretation is refined further
during performance, and the ensemble also compensates for a
different environment (such as a full hall instead of an empty
rehearsal studio) or for changes in ensemble personnel.
Conductor Follower and Musical Instrument Control
We can model the task of an individual musician as three
subtasks: monitor the conductor's physical gestures and recognize
the implied control data (conductor following), combine these
control data with a score representation into a performance
representation (performance interpretation), and translate the
performance interpretation into musical instrument control
gestures (instrument control). This decomposition simplifies the
overall task and decouples performance interpretation and
instrument control from conductor following.
Neural Networks for Conductor Following
Our architecture for simultaneously producing classification
and parameter estimate information is modular and can be divided
into three pieces: a classification module, a group of parameter
estimator modules, and glue to combine the classifier and
estimator outputs (see figure 1). Both the classifier and
estimator modules consist of a preprocessor section and a
feed-forward neural network. The preprocessors are designed to
inject any a priori knowledge or structure about the problem into
the system. All feed-forward networks are trained by
back-propagation (Rumelhart 1986) using a training set that
reflects the prior probability distribution of classes. For the
classifier, augmenting the cost function with the constraint that
all the output values sum to one results in a net which returns
the Bayesian posterior probabilities of class membership for a
given feature vector (Bourlard 1990).
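As a minimal sketch of outputs that sum to one, the block below normalizes raw output-layer activations with a softmax. This is a swapped-in technique: the text describes adding a sum-to-one constraint to the cost function, whereas softmax enforces the constraint at the output layer instead. The class names and activation values are hypothetical.

```python
import math

def softmax(scores):
    """Map raw output activations to positive values that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def most_probable_class(posteriors, class_names):
    """Return the class name with the largest estimated posterior."""
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    return class_names[best]

# Hypothetical gesture classes and raw network outputs, for illustration only.
classes = ["downbeat", "cutoff", "crescendo"]
posteriors = softmax([2.0, 0.5, -1.0])
winner = most_probable_class(posteriors, classes)
```

The normalized values can be read as posterior estimates only to the extent that the network was trained on data reflecting the class priors, as the paragraph above requires.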
The difference between the classifier and estimator
modules is that estimators act as multidimensional function
approximators as opposed to strict classifiers. Note that the
feature vectors of the estimator modules may partially or
completely overlap one another or the feature vector of the
classifier.
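The text does not specify how the glue combines the classifier and estimator outputs. One plausible arrangement, sketched here purely as an assumption, is to select the most probable class and report the parameter estimates of the estimator associated with that class. The function name `combine_outputs`, the `min_confidence` threshold, and all class and parameter names are hypothetical.

```python
def combine_outputs(posteriors, estimates, min_confidence=0.5):
    """Pick the most probable gesture class and attach its parameter estimates.

    posteriors: dict mapping class name -> posterior probability.
    estimates:  dict mapping class name -> dict of estimated parameters.
    Returns (class_name, parameters), or (None, {}) if no class is confident enough.
    """
    if not posteriors:
        return None, {}
    best = max(posteriors, key=posteriors.get)
    if posteriors[best] < min_confidence:
        return None, {}  # assumed policy: ignore ambiguous gestures
    return best, estimates.get(best, {})

# Hypothetical classifier and estimator outputs, for illustration only.
posteriors = {"beat": 0.80, "cutoff": 0.15, "hold": 0.05}
estimates = {"beat": {"tempo_bpm": 112.0}, "cutoff": {}, "hold": {"duration_s": 1.5}}
gesture, params = combine_outputs(posteriors, estimates)
```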
Neural Networks for Tempo Tracking
In our first tempo tracking model, we restricted the baton
movements to one dimension. We considered a simple conducting
style where the bottom of the beat indicated the actual downbeat.
Tempo was computed using the time between downbeats. This method
has two serious disadvantages. First, the tempo measurement
arrives too late, because the beat it describes has already
finished. Second, there is no way to vary the tempo within the
beat. To handle subtle tempo fluctuations within a beat, we need
a prediction mechanism and we need more resolution within a
beat.
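The first model can be sketched as follows, assuming downbeat timestamps in seconds and one beat per downbeat. The function name is hypothetical. Each tempo value only becomes available at the later of its two downbeats, which is the latency problem described above.

```python
def tempo_from_downbeats(downbeat_times):
    """Return the tempo in beats per minute for each pair of consecutive downbeats."""
    tempos = []
    for earlier, later in zip(downbeat_times, downbeat_times[1:]):
        interval = later - earlier  # beat period in seconds
        if interval <= 0:
            raise ValueError("downbeat times must be strictly increasing")
        tempos.append(60.0 / interval)
    return tempos

# Hypothetical timestamps: two beats at 0.5 s, then one slower beat at 0.6 s.
tempos = tempo_from_downbeats([0.0, 0.5, 1.0, 1.6])
```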
We can get more resolution by calculating tempo from the time
between half beats. We have taken some experimental data and
found that the top of the beat curve does not necessarily occur
exactly halfway through the beat and varies widely from conductor
to conductor and tempo to tempo. The half-beat time measure should
therefore be calibrated for each conductor. This solution gives
tempo control every half beat but still computes the tempo after the fact.
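To illustrate the per-conductor adjustment, the sketch below assumes the upstroke occupies a fixed fraction of the beat for a given conductor. That constant-fraction model is a simplifying assumption made here; the measurements described above suggest the fraction also varies with tempo. The function and parameter names are hypothetical.

```python
def tempo_from_half_beat(duration, direction, up_fraction=0.5):
    """Estimate tempo in BPM from one half-beat duration in seconds.

    direction:   "up" (bottom to top of the beat) or "down" (top to bottom).
    up_fraction: assumed share of the beat period taken by the upstroke.
    """
    if not 0.0 < up_fraction < 1.0:
        raise ValueError("up_fraction must lie strictly between 0 and 1")
    if duration <= 0:
        raise ValueError("duration must be positive")
    if direction == "up":
        share = up_fraction
    elif direction == "down":
        share = 1.0 - up_fraction
    else:
        raise ValueError("direction must be 'up' or 'down'")
    beat_period = duration / share  # scale the half beat up to a full beat
    return 60.0 / beat_period

# A hypothetical conductor whose upstroke takes 40% of the beat.
up_tempo = tempo_from_half_beat(0.2, "up", up_fraction=0.4)
down_tempo = tempo_from_half_beat(0.3, "down", up_fraction=0.4)
```

With a 0.5 s beat split 40/60, both half beats yield the same tempo, whereas an uncorrected 50/50 split would report two different tempos within one steady beat.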
Prediction can be added to the system by using previous
velocity and acceleration measures to predict the time until the
next half beat. If the resolution is fine enough, we can get an
instantaneous measure of the tempo.
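The predictor itself is not specified in the text. As one hedged illustration, the sketch below assumes one-dimensional baton positions sampled at a fixed interval and constant acceleration over the prediction horizon, so the next half beat (the next turning point) is where the extrapolated velocity reaches zero. The function names and sample values are hypothetical.

```python
def estimate_motion(positions, dt):
    """Estimate velocity and acceleration at the newest of three uniform samples."""
    if len(positions) < 3:
        raise ValueError("need at least three position samples")
    p0, p1, p2 = positions[-3:]
    # Second-order one-sided difference: exact when the motion is quadratic in time.
    velocity = (3.0 * p2 - 4.0 * p1 + p0) / (2.0 * dt)
    acceleration = (p2 - 2.0 * p1 + p0) / (dt * dt)
    return velocity, acceleration

def time_to_turning_point(velocity, acceleration):
    """Time until velocity reaches zero under constant acceleration, else None."""
    if acceleration == 0.0 or velocity * acceleration >= 0.0:
        return None  # the baton is not decelerating toward a turning point
    return -velocity / acceleration

# Hypothetical samples of y(t) = 1 - (t - 1)^2 at t = 0.0, 0.1, 0.2 (peak at t = 1.0).
v, a = estimate_motion([0.0, 0.19, 0.36], dt=0.1)
remaining = time_to_turning_point(v, a)
```

Adding the predicted remaining time to the time already elapsed in the current half beat gives a half-beat duration estimate before the half beat completes.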

Figure 1. System Architecture

Figure 2. Musician Architecture

Training
There are two adaptive elements in our artificial musician:
the tempo tracker and the gesture classifier. To train the
tempo-tracker compensator, we measured the up and down half-beat
times for various tempos. This measurement was taken by asking
the user to conduct along with a metronome. Time measurements
were stored and split into training and test sets. Separate
feed-forward neural networks were then trained to approximate the
tempo in each direction. Training data for the classifier was
collected by asking the user to make the hand gestures
corresponding to each gesture class.
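The data preparation step for the tempo tracker might look like the sketch below, which shuffles (half-beat duration, metronome tempo) pairs and splits them into training and test sets. The sample values are synthetic and made up for illustration, the 25% test share and the helper names are assumptions, and the network training itself is left out.

```python
import random

def split_measurements(samples, test_fraction=0.25, seed=0):
    """Shuffle (duration, tempo) pairs and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must lie strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]

def mean_absolute_error(model, samples):
    """Average absolute tempo error of a duration -> tempo model on the samples."""
    return sum(abs(model(duration) - tempo) for duration, tempo in samples) / len(samples)

# Synthetic upstroke data: metronome tempos from 60 to 180 BPM, each paired with a
# made-up half-beat duration equal to 40% of the beat period.
up_samples = [(0.4 * 60.0 / bpm, float(bpm)) for bpm in range(60, 181, 10)]
train, test = split_measurements(up_samples)
```

Downstroke measurements would be split the same way for the second network, and `mean_absolute_error` can then score each trained model on its held-out test set.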
Conclusions
We have used neural networks to address classification and
parameter estimation in a real-time conducting application. The
networks were able to adapt to the user's gestures, resulting in a
simple, flexible conductor follower that responds to a variety of
different users. Because of their learning ability, neural
networks can help obtain subtle, complex, dynamic control
information from a wide variety of conductors.
References
Bourlard, H., Morgan, N., Wellekens, C.J., "Statistical
Inference in Multilayer Perceptrons and Hidden Markov Models with
Applications in Continuous Speech Recognition,"
Neurocomputing, Fogelman, F., Herault, J., eds., NATO ASI
Series, Vol. F68, 1990.
Lee, M., Freed, A., Wessel, D., "Real-Time Neural Network
Processing of Gestural and Acoustic Signals," Proc. of the Int.
Computer Music Conf., Montreal, 1991.
Lee, M., Freed, A., Wessel, D., "Neural Networks for
Simultaneous Classification and Parameter Estimation in Musical
Instrument Control," Proc. of SPIE Conf. on Adaptive and Learning
Systems, Orlando, 1992.
Puckette, M., "Interprocess Communication and Timing in
Real-time Computer Music Performance," Proc. of the Int. Computer
Music Conf., The Hague, 1986.
Rumelhart, D.E., McClelland, J.L., Parallel Distributed
Processing: Explorations in the Microstructure of Cognition,
Vols. 1 and 2, MIT Press, Cambridge, 1986.