Music Plus One Description

What is Music Plus One?
The goals of my ``Music Plus One'' project are similar to those of the more familiar ``Music Minus One'' (MMO). MMO makes a recording of a piece of music for soloist and accompaniment, such as a sonata or concerto, in which only the accompaniment is actually recorded. The music is prefaced by several warning clicks (something like Lawrence Welk's ``and a one and a two and a ...''), and the soloist tries to play along with the recording. A heartfelt yet futile battle of wills follows, which eventually results in the live player's unconditional surrender to the robotic insistence of the recording. Thus, contrary to both musical etiquette and common sense, the soloist must follow the accompaniment.

Although MMO has its heart in the right place, in practice it is the antithesis of what the musical experience should be. My project, Music Plus One (MPO), tries to deliver the goods that MMO only promises. Specifically, my goals are that the program must respond in real time to the soloist's tempo changes and expressive gestures; the program must learn from past performances so that it assimilates the soloist's interpretation in future renditions; and it must bring a sense of musicality to the performance in addition to what is learned from the soloist. In this way MPO *adds* to the soloist's experience by providing a responsive and nuanced accompaniment rather than *subtracting* from it by imposing a rigid framework that stifles musical expression.

My efforts in MPO have been received with suspicion by some. ``Why dehumanize music with your cold and unfeeling computers?'' I assure the wary reader that my goal is not to take music out of the hands of people. On the contrary, my desire is to make music available to the overwhelming majority of folks who don't have a live accompanist handy, thereby getting more people involved with music. I have found MPO to be an invaluable tool in preparing for performance and learning new music. Additionally, MPO enables the performance of music that would be unplayable otherwise (for example, see the piece by Nick Collins described with my examples). But most of all, MPO transforms an otherwise sterile practice session into a more aesthetically satisfying, immediate, and fun experience. My hope is that many other musicians will come to enjoy the fuller musical experience afforded by my efforts.

My technical approach divides the accompaniment problem into two components named for the analogous human actions: *Listen* and *Play*. On a computer, these tasks are performed by a collection of algorithms comprising over 50,000 lines of C code. Because a human accompanist must listen and play simultaneously, Listen and Play must likewise run concurrently on the computer.
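The concurrent arrangement of Listen and Play can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the C of the actual system, and every name in it (`listen`, `play`, `run`, the event queue) is a hypothetical stand-in, not part of MPO. One thread reports note onsets as it finds them while a second thread reacts to each report as it arrives.

```python
import queue
import threading


def listen(onsets, events):
    """Stand-in for Listen: report each detected note onset as it is found."""
    for note_index, onset_time in enumerate(onsets):
        events.put((note_index, onset_time))
    events.put(None)  # sentinel: the solo part has ended


def play(events, log):
    """Stand-in for Play: react to each onset report as it arrives."""
    while True:
        event = events.get()  # blocks until Listen has something to say
        if event is None:
            break
        log.append(event)  # a real system would schedule accompaniment here


def run(onsets):
    """Run both components concurrently and return what Play received."""
    events = queue.Queue()
    log = []
    listener = threading.Thread(target=listen, args=(onsets, events))
    player = threading.Thread(target=play, args=(events, log))
    listener.start()
    player.start()
    listener.join()
    player.join()
    return log
```

Calling `run([0.0, 0.5, 1.0])` hands three onset reports from one thread to the other in order. The queue is only one of several possible ways to couple the two components; the source does not say which mechanism MPO uses.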

Before a computer (or human) can accompany a soloist, it must first *hear* the soloist; this is the role of the Listen component. The actual input to Listen is a digitized acoustic signal sampled at 8 kHz. As the soloist plays, the signal accumulates inside the computer and must be analyzed to determine, at any given time, what notes have been played and exactly when they were played. The problem is greatly simplified by the fact that the computer ``knows'' the musical score, which gives the basic template from which the soloist plays. However, Listen must be robust to inaccuracies and embellishments on the part of the soloist while maintaining precise accuracy in matching the acoustic signal to the soloist's part. Listen accomplishes this using a hidden Markov model (HMM): the actual performance, including the times at which all note onsets occur, is modeled as an *unobserved* sequence of random variables describing various positions in the score; this sequence forms a Markov chain. The acoustic signal is then assumed to be probabilistically related to the hidden process. The HMM allows one to perform inference on the hidden process given whatever information is currently at hand. For instance, familiar HMM algorithms allow one to compute the probability that the soloist has passed a certain score position, given the current sound data, or the most likely onset time for a particular solo note. Furthermore, the HMM is automatically trainable, so it can adapt to new acoustic conditions such as the solo instrument, room acoustics, and microphone placement.
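As a rough illustration of the kind of inference described here, the following Python sketch runs the standard HMM forward (filtering) recursion over a toy score. It is a simplification under stated assumptions, not the Listen model itself: each score note is a single hidden state, each audio frame is reduced to one discrete pitch label, and the parameters `p_advance`, `p_correct`, and `n_pitches` are invented for the example.

```python
def forward_filter(score, frames, p_advance=0.2, p_correct=0.8, n_pitches=12):
    """Return, for each frame, the posterior over score positions.

    score  -- list of pitch labels, one per note (the hidden states)
    frames -- list of observed pitch labels, one per audio frame
    """
    n = len(score)

    def emit(state, obs):
        # Assumed emission model: the scored pitch is heard with
        # probability p_correct; other pitches share the remainder.
        if obs == score[state]:
            return p_correct
        return (1.0 - p_correct) / (n_pitches - 1)

    belief = [0.0] * n
    belief[0] = 1.0  # the performance starts on the first note
    history = []
    for t, obs in enumerate(frames):
        if t > 0:
            # Left-to-right chain: stay on the note or move to the next.
            predicted = [0.0] * n
            for s, p in enumerate(belief):
                if s + 1 < n:
                    predicted[s] += p * (1.0 - p_advance)
                    predicted[s + 1] += p * p_advance
                else:
                    predicted[s] += p  # the final note is absorbing
            belief = predicted
        # Weight by the likelihood of the new frame and renormalize.
        weighted = [p * emit(s, obs) for s, p in enumerate(belief)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history


def prob_passed(belief, position):
    """Probability that the soloist is at or beyond `position`."""
    return sum(belief[position:])
```

For a three-note score `[0, 4, 7]` and frames `[0, 0, 0, 4, 4, 4, 7, 7, 7]`, `prob_passed(history[t], 1)` starts at zero and climbs toward one once frames labeled 4 begin to arrive, which is the "has the soloist passed this position?" query in miniature.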

The task of the Play component is the actual construction of the accompaniment. As with the human accompanist, this problem involves the fusion of several disparate knowledge sources containing relevant information. One knowledge source is, of course, the output of Listen, which is essentially a running commentary on the soloist's performance identifying the onset time of each note, delivered with variable latency. In addition, Play must understand the basic template for musical performance that is described in the musical score (notes, rhythms, etc.), thereby allowing the system to ``sight-read'' (perform with no training) credibly. However, Play must also be able to improve over successive rehearsals much as live musicians do; thus another knowledge source consists of a collection of past solo performances, demonstrating the rhythmic interpretation of the soloist. Finally, several performances of the accompaniment by a live player allow Play to capture a sense of musicality in places where its interpretation cannot be inferred from the soloist.

Play models the problem through a collection of hundreds of Gaussian random variables whose mutual dependence is expressed through a graph, known as a Bayesian belief network. Some vertices in the graph represent observable variables, such as the estimated note onset times produced by Listen and the known times at which accompaniment notes are played. The remaining unobservable variables, such as local tempo and rhythmic stress, characterize the soloist's musical interpretation. The connectivity of the graph expresses various musically plausible conditional independence assumptions that are key to making the computations feasible in real time. During a rehearsal phase, the network can be trained to model the interpretations demonstrated by the solo and accompaniment examples, thereby allowing the system to adapt automatically to each new musical situation. Once this is done, the network provides the basis for a real-time performance engine that guides the accompaniment through a course of actions, each informed by all currently available knowledge.
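To give a feel for Gaussian inference over hidden tempo variables, here is a deliberately small Python sketch. It assumes the simplest possible network, a linear chain in which the hidden state at each note is (onset time, seconds per beat) and the only observation is a noisy onset time; that special case reduces to a Kalman filter. The real network is far richer, and the function names and noise variances (`q_time`, `q_tempo`, `r`) are assumptions chosen for illustration.

```python
def predict(mean, cov, beats, q_time=1e-4, q_tempo=1e-3):
    """Advance (onset time, seconds per beat) across a note lasting `beats`."""
    t, s = mean
    (a, b), (_, d) = cov  # symmetric 2x2 covariance [[a, b], [b, d]]
    # Transition F = [[1, beats], [0, 1]]: next onset = t + beats * s.
    new_mean = (t + beats * s, s)
    # Covariance F cov F^T plus independent process noise.
    na = a + 2 * beats * b + beats * beats * d + q_time
    nb = b + beats * d
    nd = d + q_tempo
    return new_mean, ((na, nb), (nb, nd))


def update(mean, cov, observed_onset, r=1e-3):
    """Condition the state on a noisy onset time reported by the listener."""
    t, s = mean
    (a, b), (_, d) = cov
    innovation = observed_onset - t
    var = a + r  # variance of the predicted observation
    k_t, k_s = a / var, b / var  # Kalman gain for time and tempo
    new_mean = (t + k_t * innovation, s + k_s * innovation)
    na = a - k_t * a
    nb = b - k_t * b
    nd = d - k_s * b
    return new_mean, ((na, nb), (nb, nd))


def track(onsets, beats_per_note=1.0, initial_tempo=0.5):
    """Return (estimated seconds per beat, predicted time of the next onset)."""
    mean = (onsets[0], initial_tempo)
    cov = ((1e-3, 0.0), (0.0, 1e-2))
    for onset in onsets[1:]:
        mean, cov = predict(mean, cov, beats_per_note)
        mean, cov = update(mean, cov, onset)
    next_mean, _ = predict(mean, cov, beats_per_note)
    return mean[1], next_mean[0]
```

If the soloist plays one-beat notes at onsets `[0.0, 0.6, 1.2, 1.8]` while the prior tempo is 0.5 seconds per beat, `track` revises its tempo estimate toward 0.6 and predicts the next onset near 2.4 seconds. An accompaniment engine would use a prediction of this kind to schedule its next note before the soloist actually plays.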

More recently, my research has considered two variations on the basic theme described above. In the first, rather than using an accompaniment generated with MIDI (Musical Instrument Digital Interface), I synthesize the sound data from a sampled audio recording of the accompaniment (courtesy of MMO). Sampled audio provides a much richer and more nuanced sonic palette than MIDI does, and allows for full orchestral accompaniment of a soloist. The second research direction considers polyphonic solo input, thus allowing a piano or several instruments to play the role of soloist. A piano concerto is ideal for demonstrating the natural connection between these two paths. My web page contains an excerpt from the second movement of Rachmaninov's beloved Second Piano Concerto as accompanied by MPO. Many other examples and some discussion of my research are also available there.

This research has been funded by two National Science Foundation grants and has been the subject of many invited talks in disciplines including Computer Science, Statistics, Acoustics, and Electronic Music. Additionally, the work has attracted considerable attention in the popular press, including a radio interview on the BBC World Service program Go Digital, an article on the ABC News website, an article in the Boston Globe Sunday Magazine, as well as other coverage in Discover Magazine, Science Update, Komp'iuternoe Obozrenie (Computer Survey), New Scientist Magazine, and others. However, my real dream is that this work will go beyond both the scientific community and the popular press, and be embraced by practicing musicians. As has been the case with nearly all technological encroachments into the musical world, I expect considerable resistance in this endeavor. However, I also believe that accompaniment systems will someday be as commonplace in the musician's toolbox as the metronome and tuner, but much more appreciated. As an applied mathematician, my highest goal is to make a lasting contribution in the application domains I study. In this way I hope to give something back to the art that has been a constant source of joy and inspiration for me.