9/12/2017
Written by Victoria Halewicz, ECE ILLINOIS
Incredible things happen at the intersection of music and computing, and Professor Emeritus James W. Beauchamp has focused much of his career on this space. Earlier this summer, he was one of 50 artists and audio processing researchers who gathered in Evanston, Illinois, at Northwestern Engineering’s 2017 Midwest Music and Audio Day to share their latest discoveries.
Beauchamp and three ECE ILLINOIS undergraduates, An Zhao, Yifei Teng, and Yujia Qiu, presented papers at this year’s symposium alongside researchers from New York University, Carnegie Mellon University, University of Rochester, University of Michigan, Indiana University, University of Iowa, and Northwestern University, as well as industry professionals from Bose, Shure, and Soundslice. Recent Illinois alumnus Yang Shi (BSEE '17, BS Physics '17), now at UCLA, coauthored the paper that Beauchamp presented.
Zhao and Beauchamp presented “Single musical tone time-scaling that preserves temporal structure.” The researchers have devised a new method of time-scaling instrument sounds that retains sound quality and realism in computer-generated music.
“The method starts by splitting a prototype sound into three parts: attack, middle, and decay,” Beauchamp said. “Elongation of the sound is done by zigzagging a time pointer within the middle portion of the sound. To avoid audible clicks resulting from phase reversals at the zigzag points, it was decided to apply the method to amplitude- and frequency-vs-time envelopes of a Fourier series representation. Since the important attack and decay portions of the sound are retained, a realistic synthesis is achieved.”
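The description maps naturally onto an analysis/synthesis pipeline. The sketch below is only a rough illustration of the idea, not the authors’ code: it assumes the tone has already been analyzed into per-harmonic amplitude and frequency envelopes, builds a zigzagging time pointer over the middle frames, and resynthesizes the elongated tone additively. The frame rate, sample rate, and split points are illustrative assumptions.

```python
import numpy as np

def zigzag_path(attack_end, decay_start, n_frames, target_frames):
    """Frame indices that play the attack once, zigzag through the middle
    until the target length is reached, then play the decay once."""
    path = list(range(attack_end))                     # attack, played forward
    middle = list(range(attack_end, decay_start))
    forward = True
    while middle and len(path) + (n_frames - decay_start) < target_frames:
        path.extend(middle if forward else middle[::-1])
        forward = not forward                          # reverse direction at each zigzag point
    path.extend(range(decay_start, n_frames))          # decay, played forward
    return np.array(path[:target_frames])

def resynthesize(amps, freqs, path, frame_rate=200, sr=44100):
    """Additive resynthesis from amplitude/frequency envelopes read along `path`.
    amps, freqs: arrays of shape (n_frames, n_harmonics)."""
    hop = sr // frame_rate
    frame_times = np.arange(len(path)) * hop / sr
    t = np.arange(len(path) * hop) / sr
    out = np.zeros(len(t))
    for h in range(amps.shape[1]):
        a = np.interp(t, frame_times, amps[path, h])   # envelope upsampled to audio rate
        f = np.interp(t, frame_times, freqs[path, h])
        phase = 2 * np.pi * np.cumsum(f) / sr          # integrate frequency to get phase
        out += a * np.sin(phase)
    return out
```

Because the zigzag is applied to the smooth amplitude and frequency envelopes rather than to the waveform itself, the reversals do not introduce the phase discontinuities, and therefore the clicks, that Beauchamp describes.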
In “Machine composition of pop and rock music,” Teng and Zhao presented a “hybrid neural network and rule-based system” for musical composition. They analyzed more than 10,000 MIDI files, breaking them into training samples for the system’s autoencoder. The system isolates melody and chords from each track, then generates new melodies based on this learned “grammar.”
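The talk did not spell out the preprocessing pipeline, but the step it describes, chopping MIDI files into training samples, might look roughly like the sketch below. It assumes the pretty_midi library and a fixed-length binary piano-roll windowing scheme, both of which are my assumptions rather than details from the paper.

```python
import glob
import numpy as np
import pretty_midi

def midi_to_samples(path, frames_per_sample=64, fs=8):
    """Cut one MIDI file into binary piano-roll windows of fixed length.
    Returns an array of shape (n_samples, 128, frames_per_sample)."""
    midi = pretty_midi.PrettyMIDI(path)
    roll = (midi.get_piano_roll(fs=fs) > 0).astype(np.float32)   # 128 pitches x frames
    n = roll.shape[1] // frames_per_sample
    windows = [roll[:, i * frames_per_sample:(i + 1) * frames_per_sample]
               for i in range(n)]
    return (np.stack(windows) if windows
            else np.empty((0, 128, frames_per_sample), dtype=np.float32))

# Training set for the autoencoder: every window from every file.
dataset = np.concatenate([midi_to_samples(p) for p in glob.glob("midi/*.mid")])
```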
Beauchamp conducted a demonstration, “Preserving vibrato rate while time-scaling single vibrato tones.” Online audio programs like SongSurgeon and Spear can slow down or speed up audio without changing pitch, but they are unable to retain the rate of vibrato. Beauchamp and Shi overcame this limitation by using a short-time Fourier transform to incorporate vibrato rate as a parameter along with vibrato depth and frequency.
Beauchamp said, “First, spectrum analysis is performed on the input signal, resulting in time-varying amplitudes and frequencies of the harmonics. Second, the amplitudes and frequencies are parameterized using the vibrato model. Finally, the sound is resynthesized at an arbitrary duration using the additive synthesis method.”
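The quote outlines a three-step analyze, parameterize, resynthesize chain. As a rough illustration of the middle step only, the sketch below fits a simple sinusoidal vibrato model (mean frequency, rate, depth) to a fundamental-frequency track and regenerates a frequency envelope of arbitrary duration with the same vibrato rate; the actual model and resynthesis, which also handle the harmonic amplitudes, are more elaborate.

```python
import numpy as np

def fit_vibrato(f0, frame_rate):
    """Estimate mean frequency (Hz), vibrato rate (Hz), and depth (Hz)
    from a fundamental-frequency track sampled at `frame_rate` frames/s."""
    mean_f = f0.mean()
    dev = f0 - mean_f
    spectrum = np.abs(np.fft.rfft(dev * np.hanning(len(dev))))
    freqs = np.fft.rfftfreq(len(dev), d=1.0 / frame_rate)
    rate = freqs[np.argmax(spectrum[1:]) + 1]     # strongest non-DC component
    depth = dev.std() * np.sqrt(2)                # amplitude of an assumed sinusoid
    return mean_f, rate, depth

def stretched_f0(mean_f, rate, depth, new_duration, frame_rate):
    """Frequency envelope of the new duration with the *same* vibrato rate."""
    t = np.arange(int(new_duration * frame_rate)) / frame_rate
    return mean_f + depth * np.sin(2 * np.pi * rate * t)
```

The final step would drive additive synthesis with this regenerated envelope and the stretched harmonic amplitudes, much as in the time-scaling sketch above.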
Finally, Qiu’s poster “Maximum likelihood chord detection” centered on recognizing one of the 12 major chords from an audio file. Her research detailed techniques for finding the three most likely notes in the audio using FFT and maximum-likelihood methods. The algorithm then verifies and identifies the chord.
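The poster’s exact likelihood model is not given here, so the toy sketch below substitutes a simple template match: FFT the audio, fold the spectral energy into 12 pitch classes, and score each of the 12 major triads by the energy at its root, third, and fifth. The scoring rule is an assumption standing in for the maximum-likelihood formulation.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chroma(audio, sr):
    """12-bin pitch-class energy profile from the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(audio * np.hanning(len(audio))))
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sr)
    bins = np.zeros(12)
    valid = freqs > 27.5                               # ignore sub-audio bins
    midi = 69 + 12 * np.log2(freqs[valid] / 440.0)     # frequency -> MIDI number
    np.add.at(bins, np.round(midi).astype(int) % 12, spectrum[valid])
    return bins / (bins.sum() + 1e-12)

def detect_major_chord(audio, sr):
    """Return the best-matching major chord name, e.g. 'E major'."""
    profile = chroma(audio, sr)
    scores = [profile[r] + profile[(r + 4) % 12] + profile[(r + 7) % 12]
              for r in range(12)]
    return NOTE_NAMES[int(np.argmax(scores))] + " major"
```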
Learn more from Northwestern’s coverage of the event. Additional content provided by Julia Sullivan, ECE ILLINOIS.