Volumetric Modeling of Acoustic Fields in CNMAT's
Sound Spatialization Theatre
Sami Khoury,
Adrian Freed,
David Wessel
CNMAT, UC Berkeley, 1750 Arch Street, Berkeley, CA 94709
(510) 643 9990 x 308 adrian [at] cnmat [dot] berkeley [dot] edu
Abstract
A new tool for real-time visualization of acoustic
sound fields has been developed for a new sound spatialization
theatre. Unique features of the theatre and the acoustic and
volumetric modeling software are described.
Sound Spatialization Theatre
The Center for New Music and Audio Technologies,
CNMAT, is an interdisciplinary research center at the University
of California at Berkeley. Our sound spatialization theatre is
built into the main performance and lecture space at CNMAT's
facility. A unique feature of the theatre is a flexible
suspension system built primarily for loudspeakers. Each speaker
hangs from a rotating beam. The pivot point for each speaker runs
in a track that slides along rails bolted to the ceiling. With
height adjustment of each suspension cable, this system safely
allows speakers to be moved anywhere in the room and oriented
along two of the three possible axes (Figure 1). Rotational
symmetry of the concentric drivers in Meyer HM-1 speakers
obviates the need for adjustments around the third or "roll"
axis. Real-time, low-latency audio signal processing for the
speaker array is performed on a multiprocessor Silicon Graphics
Octane workstation. This machine was chosen because of its
built-in multi-channel audio, reliable real-time performance, and
the availability of sound synthesis and processing software.
Most current applications of spatial audio are
based on a model where source material is spatially encoded for
an ideal room with a predetermined speaker geometry. The result
is often unsatisfactory because of the difficulty in adapting
real rooms to the ideal. Our research is based on a more general
model where the source material may be from instruments and
performers in the room, and therefore real-time spatial
processing is required for all sources.
The Problem
Optimizing the speaker array positioning and sound
processing for each performance in the theatre is challenging.
The traditional empirical approach is far too time-consuming to
support situations in which there are weekly (and sometimes
daily) performances with varied configurations. The problem with
the trial and error approach is the difficulty of evaluating the
effects of new speaker positions and software parameter changes
for all listening positions. It is easy to optimize the listening
experience for the lucky person in the "sweet spot" at the
expense of the rest of the audience. The challenge is to find a
compromise where as many listeners as possible experience the
intent of the sound designer and as few listeners as possible
endure disastrous seats.
A Solution
To aid sound designers and composers in achieving a
good compromise for the diverse applications of the theatre, we
have developed software for visualizing speaker signals, a model
of the acoustic sound field in the room, and interpretations of
the field according to perceptual models. Important examples of
prior work in this area include and . Unique features of the work
described here include the emphasis on interactive, real-time
visualization, the use of a highly configurable performance
space, and the focus on adapting the processing and space to
achieve diverse artistic goals.
The visualization software is part of a complete
system managing audio, gestural flow and visual display (Figure
2). The heart of the system is a database describing the room. It
contains geometric information such as the shape of
the room and the positions and orientations of the speakers,
microphones, audience seating, live performers and their
musical instruments. The acoustic properties stored for each
object in the room include its frequency-dependent radiation
pattern and the location of its acoustic "center".
This database is used by the spatial sound
processing software to process source signals to create an
audience percept of virtual sources from arbitrary regions in
space. The desired percept may also involve creating the illusion
that listeners are in a room of a different size than the actual
theatre. The location of these sources is controlled in
real-time through gestures or arbitrary control messages arriving
from the network.
The visualization software has access to the room
database and real-time parameter estimates from the
spatialization software. Since it has no access to the real sound
pressure levels in the room, it must estimate them from an
acoustic model of the room. The image source method was used
because of its amenability to real-time computation.
Application Examples
Pressure Levels
Volumetric visualization of the
time varying sound pressure level in CNMAT's sound spatialization
theatre is illustrated in Figure 3. The reader is advised to
explore the color images available at http://www.cnmat.berkeley.edu/~khoury
for a better indication of the program's
potential than can be communicated with the monochrome
reproduction of this preprint. Pressure is shown using a color
map on a horizontal cut plane through the space. This movable
plane is typically set to the average height of the audience's
ears in the room. This surface may be changed to show, for
example, effects of tiered seating or to evaluate the experience
of a performer who may be standing on a raised stage. Several
simultaneous cut surfaces may be necessary, for example, for
balcony seating in large theatres.
It is interesting to contrast this volumetric
visualization with traditional audio metering where scalar signal
levels are displayed for various nodes in the signal processing
chain. Such metering is useful for managing signal levels in the
electrical elements of the audio system to avoid distortion and
speaker overload. However, it is hard even for experienced sound
engineers to use scalar metering to predict actual sound pressure
levels in many locations in a venue.
Summing Localization
Figure 4 is a vector field display of the perceived
direction of a virtual sound source according to a commonly
adopted strategy for sound localization with speaker arrays, the
summing localization model, known in its general form as vector
panning. The idea of the technique is that a virtual source can
be placed between a pair (or triple in the 3-D case) of speakers
by dosing the level of the source signal appropriately to each
speaker of the pair or triple.
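The pairwise case can be sketched as follows. The virtual source direction is written as a linear combination of the two speaker directions, and the resulting gains are scaled for constant power. The function name `pair_gains`, the restriction to the horizontal plane, and the constant-power normalization are assumptions made for illustration; the gain law actually used in the theatre is not specified here.

```python
import math

def pair_gains(source_deg, spk1_deg, spk2_deg):
    """Gains (g1, g2) placing a virtual source between two speakers.

    Angles are azimuths in degrees as seen from the listening position.
    Solves p = g1*l1 + g2*l2 for unit vectors l1, l2, p, then scales
    the gains so that g1**2 + g2**2 == 1 (constant power).
    """
    def unit(deg):
        a = math.radians(deg)
        return math.cos(a), math.sin(a)

    px, py = unit(source_deg)
    ax, ay = unit(spk1_deg)
    bx, by = unit(spk2_deg)
    det = ax * by - ay * bx          # zero when the speakers are collinear
    if abs(det) < 1e-12:
        raise ValueError("speaker directions must not be collinear")
    g1 = (px * by - py * bx) / det   # Cramer's rule
    g2 = (ax * py - ay * px) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

A source midway between the speakers receives equal gains, and a source in the direction of one speaker is fed to that speaker alone. A negative gain indicates that the requested direction lies outside the arc spanned by the pair, in which case a different pair should be chosen.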
Our initial experience with this method was good.
Speakers were placed at equal distances from the center of the
room and panning worked smoothly around the room. When the
speakers were moved to more practicable locations, above the
audience and closer to the walls and corners of the room, vector
panning failed to provide good virtual source imaging. This may
be explained by the precedence effect, which can work against
summing localization. As the difference in the time of arrival
of wavefronts from the two speakers approaches one millisecond,
the source of the earliest wavefront is perceived as the actual
source, regardless of the amplitude dosing performed by
vector-panning. Visualization of an isosurface along which
wavefront time difference is a constant illustrates the geometric
implications of this perceptual phenomenon (Figure 5). This
isosurface representation is also used to view other important
time delay effects in spatial hearing such as the varied values
of the echo threshold, backward masking, and multiple event
thresholds. The impact of the precedence effect can be
controlled by introducing an appropriate delay into the feeds of
the speakers in the room so that they all have the same effective
distance from a chosen "center" of the listening space. The
desired center about which audio spatialization is performed is
often different from the geometric center of the listening space.
This discrepancy arises because the performance area generally
occupies the front third of a theater, with the audience seated
in the rearward two-thirds.
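The delay compensation described above can be sketched as follows. Each speaker feed is delayed so that its wavefront reaches the chosen center at the same time as that of the farthest speaker, and a second helper reports the remaining arrival-time difference at any other listening point. The function names and the speed of sound of 343 m/s are assumptions for illustration only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def alignment_delays(speakers, center):
    """Delay in seconds to add to each speaker feed so that every
    wavefront reaches `center` together with the farthest speaker's."""
    dists = [math.dist(s, center) for s in speakers]
    farthest = max(dists)
    return [(farthest - d) / SPEED_OF_SOUND for d in dists]

def arrival_difference(spk_a, spk_b, listener, delay_a=0.0, delay_b=0.0):
    """Arrival time of speaker A's wavefront minus speaker B's, in seconds."""
    t_a = math.dist(spk_a, listener) / SPEED_OF_SOUND + delay_a
    t_b = math.dist(spk_b, listener) / SPEED_OF_SOUND + delay_b
    return t_a - t_b
```

Evaluating `arrival_difference` over a grid of listening points and comparing its magnitude with a threshold near one millisecond indicates where the precedence effect is likely to dominate amplitude panning.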
Interference
Acoustic models have to take into account the relative phase
of coherent sound sources. Figure 3 shows the sound pressure
level of a sine tone at a particular frequency in the room. At
low frequencies, destructive and constructive interference create
markedly different sound levels at different room locations.
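A minimal sketch of such a coherent sum is given below. It assumes equal-strength, in-phase point sources radiating a single sine tone with spherical spreading and a speed of sound of 343 m/s; the function names are hypothetical and directivity and wall reflections are left out.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def coherent_pressure(sources, listener, freq_hz):
    """Magnitude of the summed complex pressure at `listener` for
    equal-strength, in-phase point sources radiating a sine tone."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND   # wavenumber
    total = 0j
    for src in sources:
        r = math.dist(src, listener)
        total += cmath.exp(-1j * k * r) / r        # phase delay and 1/r spreading
    return abs(total)

def level_db(pressure, reference=1.0):
    """Level in dB relative to `reference`, floored to avoid log(0)."""
    return 20.0 * math.log10(max(pressure, 1e-12) / reference)
```

Two sources at equal distance from the listener add in phase, whereas a path difference of half a wavelength leaves only the difference of the two amplitudes.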
When a loudspeaker is placed close to a hard wall,
reflected waves interfere with the direct source, distorting the
frequency and phase response of the loudspeaker. These effects
are modeled by introducing the reflections as further sources, as
illustrated by the smaller speakers outside the room in Figure
6.
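The position of such a reflected source is the mirror image of the speaker across the plane of the wall. A minimal sketch, with a hypothetical function name and the wall described by a point on it and a normal vector:

```python
def mirror_across_plane(point, plane_point, normal):
    """Reflect `point` across the plane through `plane_point` with
    (not necessarily unit-length) normal vector `normal`."""
    n2 = sum(c * c for c in normal)
    if n2 == 0:
        raise ValueError("normal must be non-zero")
    # Coefficient of the projection of (point - plane_point) onto the normal.
    d = sum((p - q) * c for p, q, c in zip(point, plane_point, normal)) / n2
    return tuple(p - 2.0 * d * c for p, c in zip(point, normal))
```

For example, a speaker half a metre in front of a wall at x = 5 is mirrored to a point half a metre behind it.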
Implementation
Visualization and User Interaction
The high level programming tool that binds the
spatial visualization system together is Tcl/Tk, a scripting
language and graphical user interface toolkit (Figure 7).
Data Visualization
The Visualization Toolkit (VTK), a C++ class
library for visualizing data, provides a set of bindings to the
Tcl language that allow access to all the classes in the system.
Through this scriptable interface, it is possible to create
visualizations which change interactively and dynamically in
response to user input data. A VTK data object is maintained
internally for each sample region in the listening space. This
data object is synchronized with the sample points in the
region.
User Interaction
Interactive tasks such as moving and orientating
sound sources take advantage of user-interface event bindings in
Tk. Real-time operation based on monitoring signals being
supplied to the sound sources is achieved by a Tcl thread that
repeatedly requests current energy estimates, computes the
acoustic model and visualization of that model, and renders the
scene.
Geometric Acoustic Modeling
Three kinds of objects are modeled: active sound
sources for loudspeakers and musical instruments, passive
reflective objects for walls and diffusers, and finally listening
points.
Listening space
Listening space geometry is represented using a set
of polygonal faces corresponding to walls, ceilings, floor, etc.
Each face is decorated with information describing its acoustic
properties, such as frequency-dependent reflection coefficients.
Small room models can be easily described numerically. More
complicated models may be imported from a specialized 3D modeling
package. Maya is interesting in this respect because it supports
storage of arbitrary data (e.g. acoustic properties) in nodes of
its scene graph.
Listening points
Listening points are represented as
two-dimensionally sampled bounded surfaces in three-dimensional
space. This representation allows for fine sampling at important
locations without the computational load that would be required
for complete volumetric models. Common surfaces used include cut
planes corresponding to ear level of seated listeners and
performers, and meshes of planes for tiered seating.
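As an illustration, a horizontal cut plane can be sampled on a regular grid as sketched below. The function name, the rectangular extent, and the uniform spacing are assumptions; the representation described above also allows finer sampling at important locations.

```python
def sample_cut_plane(x_range, y_range, height, nx, ny):
    """Return nx*ny listening points on a horizontal plane at `height`,
    evenly spaced over the given x and y extents (end points included)."""
    if nx < 2 or ny < 2:
        raise ValueError("need at least two samples per axis")
    (x0, x1), (y0, y1) = x_range, y_range
    return [
        (x0 + (x1 - x0) * i / (nx - 1), y0 + (y1 - y0) * j / (ny - 1), height)
        for j in range(ny)
        for i in range(nx)
    ]
```

Tiered seating could be approximated by concatenating several such planes at different heights.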
Sound Sources
Sources are represented as polygonal models.
Each model is decorated with information describing its acoustic
properties such as its acoustic center location and
frequency-dependent directivity.
Image Source Modeling
In this acoustic modeling technique, information is
computed for each listening point in turn. Conceptually, lines
are projected from the listening point to the acoustic center of
each source and to the acoustic center of reflections of each
source from the passive reflecting objects in the space. The
lengths of these lines are used to estimate energy reaching the
listening point. The solid angles of each line are used to
compute the effect of source directivity, and energy loss as a
function of frequency and angle of incidence.
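The contribution of a single line can be sketched as follows, assuming inverse-square spreading from a point source and one broadband energy reflection coefficient per wall bounce. The function name and parameters are hypothetical, and source directivity and frequency dependence are omitted for brevity.

```python
import math

def path_energy(source, listener, power=1.0, reflection_coeffs=()):
    """Relative energy arriving along one image-source path.

    `source` is the (possibly mirrored) acoustic center; the straight
    line to `listener` has the same length as the reflected path.
    `reflection_coeffs` holds one energy reflection coefficient
    (0..1) per wall bounce; directivity is ignored here.
    """
    r = math.dist(source, listener)
    energy = power / (4.0 * math.pi * r * r)   # inverse-square spreading
    for coeff in reflection_coeffs:
        energy *= coeff
    return energy
```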
This method leads to the following expression for
the space-complexity of modeling up to the n-th order
reflections of a listening space with f faces of defining
geometry and s direct sound sources.
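One form of this bound consistent with the culling argument given next is shown here. It is a reconstruction under that reading, not necessarily the authors' exact notation: N(n) denotes the total number of direct and virtual sources, each of the s direct sources is reflected across all f faces, and every virtual source is reflected across at most f-1 faces.

```latex
N(n) \;\le\; s\left(1 + f\sum_{k=1}^{n}(f-1)^{k-1}\right)
```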
Note that when calculating the next successive
order of reflections, a source r which was created by
reflecting a source s across a face g culls away face g for
the next iteration. This must be true for the following reason:
g's surface normal must have been facing s in order
for the reflection to have been performed. The new
virtual source r lies on the opposite side of g, so g's
surface normal points away from r and g is discarded
from consideration for reflections of r. This explains the (f-1) term
above. However, in more complex models, particularly those that
possess non-convex geometry, more than just the previous face
will be culled for a given reflected source. Thus, the expression
above is an upper bound on the number of sources that
could be generated when calculating reflections.
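The culling rule can be checked numerically with the sketch below. It assumes the bound has the form s(1 + f + f(f-1) + ... + f(f-1)^(n-1)) and enumerates candidate sources as sequences of face indices in which no face is immediately repeated. Both function names are hypothetical, and no geometric visibility test is performed.

```python
def source_bound(s, f, n):
    """Upper bound on direct plus image sources up to order n when each
    image skips only the face that created it."""
    total, per_order = s, s * f
    for _ in range(n):
        total += per_order
        per_order *= (f - 1)
    return total

def image_face_sequences(f, n):
    """Yield every sequence of face indices (length 0..n) in which no face
    is immediately repeated; each sequence names one candidate source."""
    def extend(seq):
        yield seq
        if len(seq) < n:
            for face in range(f):
                if not seq or face != seq[-1]:
                    yield from extend(seq + (face,))
    yield from extend(())
```

For one source in a six-faced room, the enumeration up to second order yields as many sequences as the bound predicts. A real model would discard further candidates by testing face orientation and visibility.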
Source Bounding
Separate computation and display of direct and
reverberant energy is simple to achieve with image source models
by introducing upper and lower "reflection limits". The upper
reflection limit terminates the reflection-generation process --
essentially halting the recursive process by which source
reflections are generated. The display software maintains an
active source list consisting of all the sources, both
direct and virtual, whose reflection order lies between the lower
and upper reflection limits, inclusive (Figure 8). A lower reflection
limit of zero indicates that the direct sources should be
included in the active source list. Setting the upper and lower
reflection limits equal to one another allows the acoustic power
from a single order of reflections to be modeled.
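The selection of the active source list can be sketched as a simple filter on reflection order. The `Source` record and the function name are hypothetical stand-ins for whatever structure the display software uses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Source:
    name: str
    order: int   # 0 = direct source, k = k-th order reflection

def active_sources(sources, lower, upper):
    """Sources whose reflection order lies in [lower, upper], inclusive."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    return [s for s in sources if lower <= s.order <= upper]
```

With limits (0, 0) only the direct sources remain, and with limits (1, 1) only the first-order reflections are displayed.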
Model Computation
Once the geometric implications of the relative
positions of sources, listening points, and reflecting objects
are calculated, the actual acoustic modeling calculation can be
performed. The simplest computation uses sine-wave probe tones,
directly calculating and vector-summing the phase and amplitude
of wavefronts arriving at each listening point. For real-time
modeling, an optimization is required. The idea is to avoid the
expense of the sequence of convolutions that would be required at
the full audio sample rate by using energy estimates of frequency
bands averaged at the visual display rate. This method allows for
plausible approximations of energy, although pathological
locations where cancellations may occur would not be accurately
displayed.
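A minimal sketch of the energy-based approximation follows. It assumes that each active source supplies one short-term energy estimate per frequency band and that energies add incoherently with inverse-square spreading. The function name and data layout are assumptions for illustration.

```python
import math

def band_energy_estimate(sources, listener):
    """Per-band energy at `listener`, summing sources incoherently.

    `sources` is a list of (position, band_energies) pairs, where
    band_energies holds one short-term energy estimate per frequency
    band. Phase is discarded, so interference nulls are not reproduced.
    """
    n_bands = len(sources[0][1])
    totals = [0.0] * n_bands
    for position, band_energies in sources:
        r = math.dist(position, listener)
        spreading = 1.0 / (4.0 * math.pi * r * r)
        for b, e in enumerate(band_energies):
            totals[b] += e * spreading
    return totals
```

Because only energies are summed, the cost per listening point grows with the number of sources and bands rather than with the audio sample rate.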
Conclusion and Future Work
The visualization system described here is a
valuable tool for spatial sound researchers, sound engineers and
composers using CNMAT's sound spatialization theatre. Further
work is in progress on the adaptation of better acoustic
simulation methods for more accurate display of the quality of
the reverberant field. The room database will be automatically
extracted from a model built with 3D modeling software. Volume
visualization strategies are being explored to display sounds in
spectral and impulse response form.
Sponsors
We gratefully acknowledge support from:
- Alias/Wavefront
- Edmond Campion
- Edmund O'Neill foundation
- Gibson Guitar
- LCS
- Meyer Sound
- Silicon Graphics
Acknowledgement
Richard Andrews, Tom Johnson, Tibor Knowles and
Matt Wright developed the speaker mounting and audio patching
system for the theatre. René Caussé, Jean-Marc Jot and
John Meyer provided essential insights and data on room and
loudspeaker acoustics.
References