(Figure: the amplitude and frequency functions for "Danny Boy")
Over the past couple of decades there has been a good deal of work on the synthesis of expressive piano performance,
as in the RENCON competition. My related work instead addresses the problem of
synthesizing expressive audio for melody, which finds its richest form with "continuously
controlled" instruments.
Here the musical performance is represented in terms
of two time-varying functions, one for pitch and one for intensity. This allows the
performance to capture many different aspects of expression, such as timing, within-
and between-note dynamics, vibrato, glissando, pitch foreshadowing, and some degree of articulation.
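To make this representation concrete, the following is a minimal Python sketch (not the synthesis method used here) of rendering audio from a pitch function and an intensity function; the vibrato depth and rate and the envelope shape are assumed values chosen only for demonstration.

```python
import numpy as np

def synthesize(freq, amp, duration, sr=44100):
    """Render audio from time-varying frequency (Hz) and amplitude
    functions, each callable on an array of times in seconds.

    A single sinusoid whose instantaneous frequency is freq(t);
    the phase is the running integral of the frequency.
    """
    t = np.arange(int(duration * sr)) / sr
    phase = 2 * np.pi * np.cumsum(freq(t)) / sr   # integrate frequency over time
    return amp(t) * np.sin(phase)

# Illustrative note: A4 with 5.5 Hz vibrato of +/- 0.3 semitones,
# under a gentle crescendo-decrescendo intensity envelope.
vibrato = lambda t: 440.0 * 2 ** (0.3 * np.sin(2 * np.pi * 5.5 * t) / 12)
envelope = lambda t: np.sin(np.pi * t / t[-1]) ** 2
audio = synthesize(vibrato, envelope, duration=2.0)
```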
The basic emphasis here is on capturing prosodic (stress and grouping) aspects of the
melody. To do this I have developed a set of labels, assigned to each melody note, that describe the
function the note plays in a larger context, including notions of forward motion, arrival,
and receding motion. This melody markup is the way in which the prosodic interpretation
is represented.
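As an illustration of the markup, here is a minimal sketch of how such labels might be attached to notes; the three-label inventory and the melodic fragment are hypothetical, chosen only to show the idea, not the actual label set.

```python
from dataclasses import dataclass
from enum import Enum

class Prosody(Enum):
    """Hypothetical label inventory; the actual set is richer."""
    FORWARD = "forward motion"
    ARRIVAL = "arrival"
    RECEDING = "receding motion"

@dataclass
class Note:
    pitch: int        # MIDI note number
    beats: float      # nominal duration in beats
    label: Prosody    # prosodic function of the note in its larger context

# Hypothetical fragment: pickup notes pushing toward a downbeat arrival.
melody = [
    Note(60, 0.5, Prosody.FORWARD),
    Note(62, 0.25, Prosody.FORWARD),
    Note(64, 0.25, Prosody.FORWARD),
    Note(65, 1.0, Prosody.ARRIVAL),
]
```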
One can then cast the expressive rendering problem
as one of estimation, in which one seeks to estimate the labels for each melody note.
A "hardwired" algorithm then constructs the pitch and intensity functions from the prosodic
labeling.
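Continuing the sketch above (it reuses Note, Prosody, and melody), a toy version of such a hardwired construction might map each label to an intensity envelope and a small timing adjustment; the specific shapes and scale factors below are invented for illustration and are not the algorithm described here.

```python
import numpy as np

def render_note(note, sr=44100, tempo=90.0):
    """Toy label-to-function mapping: each prosodic label selects an
    intensity envelope and a timing adjustment (assumed shapes)."""
    dur = note.beats * 60.0 / tempo
    if note.label is Prosody.FORWARD:
        dur *= 0.95                               # press ahead slightly
        shape = lambda u: 0.5 + 0.5 * u           # crescendo into the next note
    elif note.label is Prosody.ARRIVAL:
        dur *= 1.10                               # broaden the arrival
        shape = lambda u: 1.0 - 0.4 * u           # strong onset, then relaxing
    else:  # RECEDING
        dur *= 1.05
        shape = lambda u: 0.7 * (1.0 - 0.6 * u)   # taper away
    t = np.arange(int(dur * sr)) / sr
    freq = 440.0 * 2 ** ((note.pitch - 69) / 12)  # MIDI pitch to Hz
    amp = shape(t / dur)                          # envelope over the note's span
    return amp * np.sin(2 * np.pi * freq * t)

audio = np.concatenate([render_note(n) for n in melody])
```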
Numerous examples of synthesized melodies, using hand-labeled, computer-estimated, and random
prosody, are presented here.