Volumetric Modeling of Acoustic Fields in CNMAT's Sound
Spatialization Theatre
Sami Khoury, Adrian Freed, David Wessel
Center for New Music and Audio Technologies (CNMAT)
Abstract
A new tool for real-time visualization of acoustic sound
fields has been developed for a new sound spatialization theatre.
The theatre is described and several applications of the acoustic
and volumetric modeling software are presented.
Keywords J.5 Arts and Humanities: Performing Arts
Additional Keywords Acoustic Modeling, Real-time
Visualization
- Sound Spatialization Theatre
The Center for New Music and Audio
Technologies, CNMAT, is an interdisciplinary research center
at the University of California at Berkeley. Our sound
spatialization theatre is built into the main performance and
lecture space at CNMAT's facility. A unique feature of the
theatre is a flexible suspension system built primarily for
loudspeakers. Each speaker hangs from a rotating beam. The
pivot point for each speaker runs in a track that slides
along rails bolted to the ceiling. With height adjustment of
each suspension cable, this system safely allows speakers to
be moved anywhere in the room and oriented along two of the
three possible axes.

Rotational symmetry of the concentric drivers
in Meyer HM-1 speakers obviates the need for adjustments
around the third or "roll" axis. Real-time, low-latency audio
signal processing for the speaker array is performed on a
multiprocessor Silicon Graphics Octane workstation. This
machine was chosen because of its built-in multi-channel
audio, reliable real-time performance, and the availability
of sound synthesis and processing software .
Most current applications of spatial audio are
based on a model where source material is spatially encoded
for an ideal room with a predetermined speaker geometry . The
result is often unsatisfactory because of the difficulty in
adapting real rooms to the ideal. We are working on a more
general model where the source material may be from
instruments and performers in the room, and therefore
real-time spatial processing is required for all sources.
- The Problem
Optimizing the speaker array positioning and
sound processing for each performance in the theatre is
challenging. The traditional empirical approach is far too
time-consuming to support situations in which there are
weekly (and sometimes daily) performances with varied
configurations. The problem with the trial and error approach
is the difficulty of evaluating the effects of new speaker
positions and software parameter changes for all listening
positions. It is easy to optimize the listening experience
for the lucky person in the "sweet spot" at the expense of
the rest of the audience. The challenge is to find a
compromise where as many listeners as possible experience the
intent of the sound designer and as few listeners as possible
endure disastrous seats.
- A Solution
To aid sound designers and composers in
achieving a good compromise for the diverse applications of
the theatre, we have developed software for visualizing
source signals, a model of the acoustic sound field in the
room, and interpretations of the field according to
perceptual models. Important examples of prior work in this
area include [6] and [7]. Unique features of the work described here
include the emphasis on interactive, real-time visualization,
the use of a highly configurable performance space, and the
focus on adapting the processing and space to achieve diverse
artistic goals.
The visualization software is part of a
complete system managing audio, gestural flow and visual
display. The heart of the system is a database describing the
room. It contains information on geometric features such as
the shape of the room and the positions and orientations of
sources, microphones, audience seating, live performers, and
their musical instruments. Acoustic properties of each object
in the room include frequency-dependent radiation patterns
and the location of its acoustic "center".
This database is used by the spatial sound processing
software to process source signals to create an audience
percept of virtual sources from arbitrary regions in space.
The desired percept may also involve creating the illusion
that listeners are in a room of a different size than the
actual theatre . The location of these sources is controlled
in real-time through gestures or arbitrary control messages
arriving from the network .
The visualization software has access to the
room database and real-time parameter estimates from the
spatialization software. Since it has no access to the actual
sound pressure levels in the room, it must estimate them
from an acoustic model of the room. The image source
method was used because of its amenability to real-time
computation.
- Application Examples
- Pressure Levels
Volumetric visualization of the time-varying
sound pressure level in CNMAT's sound
spatialization theatre is illustrated in Figure 1. The
sound sources in this case are organ pipes. Pressure is
shown using a color map on horizontal cut planes through
the space. These movable planes are typically set to the
average ear height of the audience and performers.
Multiple simultaneous cut surfaces may be necessary, for
example, for balcony seating in large theatres.
It is interesting to contrast this
volumetric visualization with traditional audio metering
where scalar signal levels are displayed for various
nodes in the signal processing chain. Such metering is
useful for managing signal levels in the electrical
elements of the audio system to avoid distortion and
speaker overload. However, it is hard even for experienced
sound engineers to use scalar metering to predict actual
sound pressure levels at many locations in a venue.
- Summing Localization
The summing localization model, known in
its general form as vector panning, is a commonly adopted
strategy for sound localization with speaker arrays. With
this technique, a virtual source can be placed between a
pair (or triple in the 3-D case) of speakers by dosing
the level of the source signal appropriately to each
speaker of the pair or triple. A vector field display is
useful for indicating the perceived direction of a virtual
sound source.
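As an illustration of the level dosing described above, the following is a minimal sketch of pairwise vector panning in two dimensions. It is not the theatre's actual implementation: the function name `pan_gains_2d`, the use of azimuth angles in degrees measured from the listener, and the constant-power normalization are assumptions made for the example.

```python
import math


def pan_gains_2d(source_deg, spk1_deg, spk2_deg):
    """Return power-normalized gains (g1, g2) for one speaker pair.

    All angles are azimuths in degrees as seen from the listener
    (an assumed convention for this sketch).
    """
    # Unit vectors pointing toward the two speakers and the virtual source.
    l1 = (math.cos(math.radians(spk1_deg)), math.sin(math.radians(spk1_deg)))
    l2 = (math.cos(math.radians(spk2_deg)), math.sin(math.radians(spk2_deg)))
    p = (math.cos(math.radians(source_deg)), math.sin(math.radians(source_deg)))

    # Solve p = g1 * l1 + g2 * l2 for the gains using Cramer's rule.
    det = l1[0] * l2[1] - l1[1] * l2[0]
    if abs(det) < 1e-12:
        raise ValueError("speaker directions must not be collinear")
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det

    # Scale so that g1^2 + g2^2 = 1 (constant total power).
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

A source midway between the speakers receives equal gains, and a source aimed directly at one speaker is fed to that speaker only. A negative gain indicates that the requested direction lies outside the arc spanned by the pair, in which case a different pair would normally be chosen.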

Our initial experience with this method was
good. Speakers were placed at equal distances from the
center of the room and panning worked smoothly around the
room. When the speakers were moved to more practicable
locations, above the audience and closer to the walls and
corners of the room, vector panning failed to provide
good virtual source imaging. This may be explained by the
precedence effect, which can work against summing
localization. As the difference in the time of arrival of
wavefronts from the two speakers approaches one
millisecond, the source of the earliest wavefront is
perceived as the actual source, regardless of the
amplitude dosing performed by vector panning. By
visualizing an isosurface along which the wavefront time
difference is constant, we can illustrate the geometric
implications of this perceptual phenomenon (Figure 3).
This isosurface representation is also used to view other
important time delay effects in spatial hearing such as
the varied values of the echo threshold, backward
masking, and multiple event thresholds. We are
experimenting with techniques to minimize the
precedence effect, including source decorrelation
and the introduction of appropriate delays into the feeds
of the speakers in the room so that they all have the
same effective distance from a chosen "center" of the
listening space.
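The two timing quantities discussed above can be sketched as follows. This is an illustrative example only: the function names are hypothetical, positions are assumed to be (x, y, z) tuples in meters, and the speed of sound is taken as 343 m/s. At that speed a one millisecond arrival difference corresponds to a path difference of about 0.34 m.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second (assumed room conditions)


def arrival_time_difference(listener, spk_a, spk_b):
    """Seconds by which the wavefront from spk_a lags that from spk_b.

    Points where this value is constant form the isosurface
    described in the text.
    """
    path_a = math.dist(listener, spk_a)
    path_b = math.dist(listener, spk_b)
    return (path_a - path_b) / SPEED_OF_SOUND


def alignment_delays(center, speakers):
    """Delay in seconds to add to each speaker feed so that all
    wavefronts reach `center` at the same time as the farthest speaker's."""
    distances = [math.dist(center, s) for s in speakers]
    farthest = max(distances)
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]
```

Evaluating `arrival_time_difference` over a set of listening points and contouring the result at a chosen threshold gives the kind of surface shown in the isosurface display. The delays from `alignment_delays` equalize arrival times only at the chosen center, so listeners elsewhere still experience some time difference.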
- Interference
Acoustic models must account for the relative phase of
coherent sound sources. Figure 3 shows the
sound pressure level of a sine tone at a particular frequency
in the room. At low frequencies, destructive and constructive
interference create markedly different sound levels around
the room.
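The interference described above can be illustrated by summing complex phasors. The sketch below assumes ideal free-field point sources whose amplitude falls off as 1/r and that all emit the same sine tone in phase. The function name and the (position, amplitude) source format are hypothetical.

```python
import cmath
import math


def coherent_pressure(listener, sources, frequency_hz, speed_of_sound=343.0):
    """Magnitude of the summed pressure phasor at `listener`.

    `sources` is a list of (position, amplitude) pairs, where each
    position is an (x, y, z) tuple in meters. The listener must not
    coincide with a source position.
    """
    wavenumber = 2.0 * math.pi * frequency_hz / speed_of_sound
    total = 0j
    for position, amplitude in sources:
        r = math.dist(listener, position)
        # Spherical spreading (1/r) and propagation phase (-k * r).
        total += (amplitude / r) * cmath.exp(-1j * wavenumber * r)
    return abs(total)
```

When two path lengths differ by half a wavelength the contributions subtract, and when they differ by a whole wavelength they add. This is what produces the alternating loud and quiet regions at low frequencies.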
When a loudspeaker is placed close to a hard
wall, reflected waves interfere with the direct source,
distorting the frequency and phase response of the
loudspeaker. These effects are modeled by introducing the
reflections as further sources, as illustrated by the smaller
speakers outside the room in Figure 4.
- Implementation
- Visualization and User Interaction
The high-level programming tool that binds
the spatial visualization system together is Tcl/Tk, a
scripting language and graphical user interface
toolkit.
- Data Visualization
The Visualization Toolkit (VTK), a C++
class library for visualizing data, provides a set of
bindings to the Tcl language that allow access to all
the classes in the system. Through this scriptable
interface, it is possible to create visualizations
that change interactively and dynamically in
response to user input.
A VTK data object is maintained
internally for each sample region in the listening
space. This data object is synchronized with the
sample points in the region.
- User Interaction
Interactive tasks such as moving and
orientating sound sources take advantage of
user-interface event bindings in Tk. Real-time operation
based on monitoring signals being supplied to the sound
sources is achieved by a Tcl thread that repeatedly
requests current energy estimates, computes the acoustic
model and visualization of that model, and renders the
scene.
- Geometric Acoustic Modeling
Three kinds of objects are modeled: active
sound sources for loudspeakers and musical instruments,
passive reflective objects for walls and diffusers, and
finally listening points.
- Listening space
Listening space geometry is represented
using a set of polygonal faces corresponding to
walls, ceilings, floor, etc. Each face is decorated
with information describing its acoustic properties,
such as frequency-dependent reflection coefficients.
Small room models can be easily described
numerically. More complicated models may be imported
from a specialized 3D modeling package. Maya is
interesting in this respect because it supports
storage of arbitrary data (e.g., acoustic properties) in
nodes of its scene graph.
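One possible way to hold a decorated face in memory is sketched below. The `Face` class, its field names, and the nearest-band lookup are assumptions made for illustration and are not taken from the actual room database.

```python
from dataclasses import dataclass, field


@dataclass
class Face:
    """A polygonal face of the listening space (hypothetical layout)."""

    name: str
    vertices: list  # (x, y, z) tuples in meters, in winding order
    # Maps a band center frequency in Hz to a reflection coefficient (0..1).
    reflection: dict = field(default_factory=dict)

    def reflection_at(self, frequency_hz):
        """Coefficient of the nearest tabulated band (no interpolation)."""
        if not self.reflection:
            raise ValueError("no reflection data stored for " + self.name)
        band = min(self.reflection, key=lambda f: abs(f - frequency_hz))
        return self.reflection[band]
```

For example, a wall described with coefficients at 125 Hz, 1 kHz, and 4 kHz returns the 1 kHz value when queried at 900 Hz. A fuller model would interpolate between bands.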
- Listening points
Listening points are represented as
two-dimensionally sampled bounded surfaces in
three-dimensional space. This representation allows
for fine sampling at important locations without the
computational load that would be required for
complete volumetric models. Commonly used surfaces
include cut planes corresponding to the ear level of
seated listeners and performers, and meshes of planes
for tiered seating.
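A horizontal cut plane of listening points can be generated as in the sketch below. The function name, the axis-aligned rectangular extent, and the uniform grid spacing are assumptions for the example.

```python
def sample_cut_plane(x_range, y_range, height, nx, ny):
    """Return nx * ny listening points on a horizontal plane.

    x_range and y_range are (min, max) extents in meters, and
    height is the z coordinate of the plane (for example, seated
    ear level). nx and ny must each be at least 2.
    """
    if nx < 2 or ny < 2:
        raise ValueError("need at least two samples along each axis")
    (x0, x1), (y0, y1) = x_range, y_range
    points = []
    for j in range(ny):
        y = y0 + (y1 - y0) * j / (ny - 1)
        for i in range(nx):
            x = x0 + (x1 - x0) * i / (nx - 1)
            points.append((x, y, height))
    return points
```

Tiered seating could be approximated by calling the function once per tier with a different height and extent. Finer sampling in important regions would use a larger `nx` and `ny` over a smaller extent.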
- Sound Sources
Sources are represented as polygonal
models. Each model is decorated with information
describing its acoustic properties such as its acoustic
center location and frequency-dependent directivity.
- Image Source Modeling
In this acoustic modeling technique,
information is computed for each listening point in turn.
Conceptually, lines are projected from the listening
point to the acoustic center of each source and to the
acoustic center of reflections of each source from the
passive reflecting objects in the space. The lengths of
these lines are used to estimate energy reaching the
listening point. The angles of these lines are used
to compute the effect of source directivity and energy
loss as a function of frequency and angle of
incidence.
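The geometric idea can be sketched for the simplest case. The example below assumes a rectangular room with one corner at the origin and walls on the coordinate planes, an omnidirectional source, first-order reflections only, and a single frequency-independent energy reflection coefficient. Directivity and angle-dependent loss, which the full model includes, are omitted, and all names are hypothetical.

```python
import math


def first_order_images(source, room_dims):
    """Mirror `source` across each of the six walls of a rectangular room.

    room_dims is (length_x, length_y, length_z) in meters. The walls
    lie at 0 and at room_dims[axis] along each axis.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room_dims[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def energy_at(listener, source, room_dims, reflection=0.8):
    """Relative energy at `listener`: direct path plus first-order images.

    Each path contributes 1 / r^2, and reflected paths are scaled by
    the assumed energy reflection coefficient.
    """
    energy = 1.0 / math.dist(listener, source) ** 2
    for image in first_order_images(source, room_dims):
        energy += reflection / math.dist(listener, image) ** 2
    return energy
```

The distance from the listening point to each image equals the length of the corresponding reflected path, which is why a straight line to the image can stand in for the reflection.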
- Model Computation
Once the geometric implications of the relative
positions of sources, listening points, and reflecting
objects are calculated, the actual acoustic modeling
calculation can be performed. The simplest computation uses
sine wave probe tones, directly calculating and summing
vectors for the phase and amplitude of wavefronts arriving
at each listening point. For real-time modeling, an
optimization is required. We use energy estimates of adjacent
frequency bands, averaged at the visual display rate, to avoid
the expense of a sequence of convolutions at the full audio
sample rate. This method allows plausible approximations
of energy, although pathological locations where
cancellations occur would not be displayed
accurately.
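The averaging step can be sketched as follows. The example assumes the signal has already been split into frequency bands upstream and shows only how one band's samples might be reduced to a single mean-square energy value per display frame. The function name and the block-based averaging are assumptions, not the system's actual estimator.

```python
def frame_energies(samples, sample_rate, display_rate):
    """Mean-square energy of `samples` for each display frame.

    Each frame covers sample_rate // display_rate samples. A shorter
    final frame is averaged over the samples it actually contains.
    """
    block = max(1, int(sample_rate // display_rate))
    energies = []
    for start in range(0, len(samples), block):
        frame = samples[start:start + block]
        energies.append(sum(s * s for s in frame) / len(frame))
    return energies
```

At a 48 kHz sample rate and a 30 Hz display rate, each energy value summarizes 1600 samples, so the acoustic model is evaluated 30 times per second per band rather than at the audio rate. Because only energy is kept, phase information is discarded, which is why cancellations cannot be shown.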
- Conclusion and Future Work
The visualization system described here is a
valuable tool for spatial sound researchers, sound engineers,
and composers using CNMAT's sound spatialization theatre.
Further work is in progress on the adaptation of better
acoustic simulation methods for more accurate display of the
quality of the reverberant field. The room database will be
automatically extracted from a model built with 3D modeling
software . Volume visualization strategies are being explored
to display sounds in spectral and impulse response form.
- Sponsors
We gratefully acknowledge support from:
- Alias/Wavefront
- Edmond Campion
- Edmund O'Neill foundation
- Gibson Guitar
- LCS
- Meyer Sound
- Silicon Graphics
- Acknowledgement
Richard Andrews, Tom Johnson, Tibor Knowles and
Matt Wright developed the speaker mounting and audio patching
system for the theatre. René Caussé, Jean-Marc
Jot and John Meyer provided essential insights and data on
room and loudspeaker acoustics. Amar Chaudhary provided VTK
expertise and developed the organ pipe geometric models.
- References
[1] A. Freed, "Codevelopment of user interface, control and
digital signal processing with the HTM environment," presented at
5th International Conference on Signal Processing Applications
and Technology, Dallas, TX, USA, 1994.
[2] A. Freed, X. Rodet, and P. Depalle, "Synthesis and control
of hundreds of sinusoidal partials on a desktop computer without
custom hardware," presented at Fourth International Conference on
Signal Processing Applications and Technology ICSPAT '93, Santa
Clara, CA, USA, 1993.
[3] A. Freed, "Real-Time Inverse Transform Additive Synthesis
for Additive and Pitch Synchronous Noise and Sound
Spatialization," presented at AES 104th Convention, San
Francisco, CA, 1998.
[4] M. Bosi and S. E. Forshay, "High quality audio coding for
HDTV: an overview of AC-3," presented at International Workshop
on HDTV '94, Turin, Italy, 1994.
[5] M. Feibus, "Microsoft's DirectSound," Windows
Sources, pp. 203(3), 1996.
[6] A. Stettner and D. P. Greenberg, "Computer graphics
visualization for acoustic simulation," presented at Conference
Proceedings, Boston, MA, USA, 1989.
[7] M. Monks, B. M. Oh, and J. Dorsey, "Acoustic Simulation
and Visualization using a New Unified Beam Tracing and Image
Source Approach," presented at Convention of the Audio
Engineering Society (1996), 1996.
[8] H. Lehnert and J. Blauert, "Virtual auditory environment,"
presented at Fifth International Conference on Advanced Robotics.
Robots in Unstructured Environments (Cat. No.91TH0376-4), Pisa,
Italy, 1991.
[9] C. Hand, "A survey of 3D interaction techniques,"
Computer Graphics Forum, vol. 16, pp. 269-81, 1997.
[10] M. Wright and A. Freed, "Open Sound Control: A New
Protocol for Communicating with Sound Synthesizers," presented at
International Computer Music Conference, Thessaloniki, Greece,
1997.
[11] L. Heewon and L. Byung-Ho, "An efficient algorithm for
the image model technique," Applied Acoustics, vol. 24,
pp. 87-115, 1988.
[12] J. Blauert, Spatial hearing: the psychophysics of
human sound localization. Cambridge: MIT Press, 1997.
[13] V. Pulkki, "Virtual sound source positioning using vector
base amplitude panning," Journal of the Audio Engineering
Society, vol. 45, pp. 456-66, 1997.
[14] J. M. Chowning, "The simulation of moving sound sources,"
presented at Audio Engineering Society 39th Convention, New York,
NY, USA, 1970.
[15] G. S. Kendall, "The decorrelation of audio signals and
its impact on spatial imagery," Computer Music Journal,
vol. 19, pp. 71-87, 1995.
[16] J. K. Ousterhout, Tcl and the Tk toolkit. Reading,
Mass.: Addison-Wesley, 1994.
[17] W. J. Schroeder, K. M. Martin, and W. E. Lorensen, "The
design and implementation of an object-oriented toolkit for 3D
graphics and visualization," 1996.
[18] Alias/WaveFront, "Maya 1.0." Toronto, Canada:
Alias/WaveFront, 1998.
