What is Music Plus One?
The goals of my ``Music Plus One'' project are similar to the more
familiar ``Music Minus One'' (MMO). MMO makes a recording of a piece
of music for soloist and accompaniment, such as a sonata or concerto,
where only the accompaniment is actually recorded. The music is
prefaced by several warning clicks (something like Lawrence Welk's
``and a one and a two and a ...''), and the soloist tries to play
along with the recording. A heartfelt yet futile battle of wills
follows which eventually results in the live player's unconditional
surrender to the robotic insistence of the recording. Thus, contrary
to both musical etiquette and common sense, the soloist must follow
the accompaniment.
Although MMO has its heart in the right place, in practice it is the
antithesis of what the musical experience should be. My project,
Music Plus One (MPO), tries to deliver the goods that MMO only
promises. Specifically, my goals are that the program must respond in
real time to the soloist's tempo changes and expressive gestures; the
program must learn from past performances so that it assimilates the
soloist's interpretation in future renditions; and it must bring a
sense of musicality to the performance in addition to what is learned
from the soloist. In this way MPO *adds* to the soloist's experience
by providing a responsive and nuanced accompaniment rather than
*subtracting* from it by imposing a rigid framework that stifles
musical expression.
My efforts in MPO have been received with suspicion by some. ``Why
dehumanize music with your cold and unfeeling computers?'' I assure
the wary reader that my goal is not to take music out of the hands of
people. On the contrary, my desire is to make
music available to the overwhelming majority of folks who don't have a
live accompanist handy, thereby getting more people involved with
music. I have found MPO to be an invaluable tool in preparing for
performance and learning new music. Additionally, MPO enables the
performance of music that would be unplayable otherwise (for example,
see the piece by Nick Collins described with my examples). But most of all, MPO transforms
an otherwise sterile practice session into a more aesthetically
satisfying, immediate, and fun experience. My hope is that many other
musicians will come to enjoy the fuller musical experience afforded by
my efforts.
My technical approach divides the accompaniment problem into two
components named by the analogous human actions: *Listen* and *Play*.
On a computer, these tasks are performed by a collection of algorithms
implemented in over 50,000 lines of C code. The human accompanist must
both listen and play simultaneously; thus the nature of the
accompaniment problem requires that Listen and Play run concurrently
on the computer.
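
The concurrency requirement can be illustrated with a minimal sketch, assuming two threads joined by a queue. The names `listen`, `play`, and `run_session` are hypothetical, the onset times are supplied directly rather than extracted from audio, and nothing here reflects how the actual C code is organized.

```python
import queue
import threading


def listen(onsets, out_queue):
    """Stand-in for Listen: report each detected note onset as it 'arrives'."""
    for note_index, onset_time in enumerate(onsets):
        out_queue.put((note_index, onset_time))
    out_queue.put(None)  # sentinel: the soloist has finished


def play(in_queue, log):
    """Stand-in for Play: react to each onset report as soon as it is available."""
    while True:
        event = in_queue.get()
        if event is None:
            break
        log.append(event)  # a real system would schedule accompaniment notes here


def run_session(onsets):
    """Run both components concurrently and return what Play received."""
    events = queue.Queue()
    log = []
    listener = threading.Thread(target=listen, args=(onsets, events))
    player = threading.Thread(target=play, args=(events, log))
    listener.start()
    player.start()
    listener.join()
    player.join()
    return log
```

The queue decouples the two components, so Play never has to wait for Listen to finish analyzing the whole performance before it acts.
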
Before a computer (or human) can accompany a soloist, it must first
*hear* the soloist; this is the role of the Listen component. The
actual input to Listen is a digitized acoustic signal sampled at 8 kHz.
As the soloist plays, the signal accumulates inside the computer and
must be analyzed to determine, at any given time, what notes have been
played and exactly when they were played. The problem is greatly
simplified by the fact that the computer ``knows'' the musical score
giving the basic template from which the soloist plays. However,
Listen must be robust to inaccuracies and embellishments on the part
of the soloist while maintaining precise accuracy in matching the
acoustic signal to the soloist's part. Listen accomplishes this using
a hidden Markov model: the actual performance, including the times at
which all note onsets occur, is modeled as an *unobserved* sequence of
random variables describing various positions in the score --- a
Markov chain. The acoustic signal is then assumed to be
probabilistically related to the hidden process. The HMM allows one
to perform inference on the hidden process given whatever information
is currently at hand. For instance, familiar algorithms from HMMs
allow one to compute the probability that the soloist has passed a
certain score position, given the current sound data, or the most
likely onset time for a particular solo note. Furthermore, the HMM is
automatically trainable, so it can adapt to new acoustic conditions
such as a different solo instrument, room, or microphone placement.
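
To make the kind of inference described above concrete, here is a minimal sketch of HMM forward filtering over score positions. It is a toy under loud assumptions, not the model used in Listen: each hidden state is simply the index of a score note, each audio frame is reduced to a single detected pitch, and the names `forward_filter`, `prob_passed`, `p_stay`, and `p_match` are all hypothetical.

```python
def forward_filter(score_pitches, observed_pitches, p_stay=0.8, p_match=0.9):
    """Return, for every frame, the posterior over score positions so far.

    score_pitches    -- pitch of each note in the score (the hidden states)
    observed_pitches -- one detected pitch per frame (assumed front end)
    p_stay           -- assumed chance of remaining on the same note per frame
    p_match          -- assumed likelihood weight when the frame matches a note
    """
    n = len(score_pitches)
    belief = [1.0] + [0.0] * (n - 1)  # before the first frame: on note 0
    history = []
    for obs in observed_pitches:
        # Prediction: left-to-right chain, stay on a note or move to the next.
        predicted = [0.0] * n
        for i, b in enumerate(belief):
            if i + 1 < n:
                predicted[i] += p_stay * b
                predicted[i + 1] += (1.0 - p_stay) * b
            else:
                predicted[i] += b  # the last note has nowhere to go
        # Update: weight each position by how well it explains the frame.
        weighted = [
            p * (p_match if obs == pitch else 1.0 - p_match)
            for p, pitch in zip(predicted, score_pitches)
        ]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history


def prob_passed(belief, position):
    """Probability that the soloist has reached `position` or beyond."""
    return sum(belief[position:])


# Three-note score (MIDI pitches) and six frames of detected pitch.
beliefs = forward_filter([60, 62, 64], [60, 60, 62, 62, 64, 64])
print(prob_passed(beliefs[-1], 2))
```

Because a mismatching frame only lowers a position's weight rather than ruling it out, a single wrong or embellished note does not derail the estimate in this sketch.
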
The task of the Play component is the actual construction of the
accompaniment. As with the human accompanist, this problem involves
the fusion of several disparate knowledge sources containing
relevant information.
One knowledge source is, of course, the output of Listen, which
is essentially a running commentary on the soloist's performance identifying
the onset time of each note, delivered with variable latency.
In addition, Play must also understand the basic
template for musical performance that is described in the musical
score (notes, rhythms, etc.), thereby allowing the system to
``sight-read'' (perform with no training) credibly. However, Play
must also be able to improve over successive rehearsals
much as live musicians do; thus another knowledge source consists of a
collection of past solo performances, demonstrating the rhythmic
interpretation of the soloist. Finally, several
performances of the accompaniment by a live player
allow Play to capture a sense of musicality in
places where its interpretation cannot be inferred from the soloist.
Play models the problem through a collection of hundreds of Gaussian
random variables whose mutual dependence is expressed through
a graph --- a Bayesian Belief Network. Some vertices in the graph represent
observable variables, such as the estimated
note onset times produced by Listen and the known times at which accompaniment
notes are played. The remaining unobservable variables,
such as local tempo and rhythmic stress, characterize
the soloist's musical interpretation.
The connectivity of the graph expresses various musically plausible
conditional independence assumptions that are key in making the
computations feasible in real time. During a rehearsal phase, the
network can be trained
to model the interpretations demonstrated by the solo and accompaniment
examples, thereby allowing the system to automatically adapt to each
new musical situation. Once this is done, the network provides
the basis for a real-time performance engine that guides the
accompaniment through a course of actions, each informed by
all currently available knowledge.
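
The flavor of the Gaussian computations can be shown with a minimal sketch that uses just one hidden variable instead of hundreds. It assumes a single unobserved local tempo (seconds per beat) with a Gaussian belief, and an observed inter-onset interval equal to the note length in beats times the tempo plus Gaussian noise. The names `Gaussian`, `update_tempo`, `predict_onset`, and `noise_var` are hypothetical; this is standard Gaussian conditioning, not the network used in Play.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: float
    var: float


def update_tempo(prior, beats, interval, noise_var=1e-3):
    """Condition the tempo belief on one observed inter-onset interval.

    Assumed observation model: interval = beats * tempo + noise,
    with noise ~ N(0, noise_var).
    """
    gain = prior.var * beats / (beats * beats * prior.var + noise_var)
    mean = prior.mean + gain * (interval - beats * prior.mean)
    var = (1.0 - gain * beats) * prior.var
    return Gaussian(mean, var)


def predict_onset(last_onset, beats_ahead, tempo):
    """Expected time of an accompaniment note `beats_ahead` beats later."""
    return last_onset + beats_ahead * tempo.mean


# Prior learned in rehearsal (assumed): about 0.5 s per beat.
tempo = Gaussian(mean=0.5, var=0.01)
# The soloist takes 0.6 s over a one-beat note, so the belief shifts slower.
tempo = update_tempo(tempo, beats=1.0, interval=0.6)
print(tempo, predict_onset(2.0, 2.0, tempo))
```

Each new onset estimate both moves the tempo estimate and shrinks its variance, which is how the accompaniment in this sketch becomes more confident as more of the soloist's playing is heard.
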
More recently my research has considered two variations on the
basic theme described. In the first, rather than using an accompaniment
generated with MIDI (Musical Instrument Digital Interface), I
synthesize the sound data from a sampled audio recording of the accompaniment
(courtesy of MMO).
Sampled audio provides a much richer and more nuanced sonic palette
than does MIDI, and allows for full orchestral accompaniment of a soloist.
The second research direction considers polyphonic solo input, thus allowing
a piano or several instruments to play the role of soloist.
A piano concerto is ideal for demonstrating the natural
connection between these two paths.
My web page contains
an excerpt from the second movement of Rachmaninov's beloved Second Piano Concerto
as accompanied by MPO. Many other examples and some discussion of my
research are also available there.
This research has been funded by two National Science Foundation grants and
has been the subject of many invited talks in disciplines ranging
across computer science, statistics, acoustics, and electronic music.
Additionally,
the work has attracted considerable attention in the popular press
including a radio interview on the BBC World Service program
Go Digital, an article on the ABC news website, an article
in the Boston Globe Sunday Magazine, as well as other coverage
in Discover Magazine, Science Update, Komp'iuternoe Obozrenie (Computer Survey),
New Scientist Magazine, and others. However,
my real dream is that
this work will go beyond both the scientific community and popular press,
and be embraced by practicing musicians.
As has been the case with nearly all technological encroachments into
the musical world, I expect considerable resistance in this endeavor.
However, I also believe that accompaniment systems will someday be
as commonplace in the musician's toolbox as the metronome and tuner ---
but much more appreciated.
As an applied mathematician,
my highest goal is to make a lasting contribution in the application
domains I study. In this way I hope to give something back to
the art that has been a constant source of joy and inspiration for me.