What is your area of expertise?
I work on acoustic modeling for automatic speech recognition and for non-speech acoustic event detection.
Brief synopsis of your education and career?
I received my bachelor’s and master’s degrees at the same time in 1989 from MIT, and then I got my doctoral degree in 1996 from MIT. My adviser was Ken Stevens, who recently passed away. I did a post-doc; my first year was funded by a fellowship from the Acoustical Society of America, and the remaining two years were funded by an NIH individual research service award.
What is the best thing about being at Illinois?
The best thing about being at Illinois is the breadth and depth of expertise. No matter the area you want to research, someone on campus is likely the world’s leading expert on the piece of the puzzle you are missing. Any research happens at the intersection between areas of expertise. If I could solve a problem using my expertise alone, I would have already solved it. Interesting research happens where my expertise meets somebody else’s.
How did you become interested in your particular area of engineering?
I became an engineer because in high school, I liked math and computers, and somebody told me, ‘Oh, you should be an engineer.’ And at the time, I didn’t know what the word even meant. I developed an interest in signal processing gradually during a series of internships while I was an undergraduate. The summer after my freshman year, I had an internship at Boeing, working with power electronics. All of the things we were doing were fascinating, but the part that was most interesting to me was dealing with the switching power supplies. So after I decided that I was interested in signal processing, I went to Motorola, and I worked in a radio communications research lab for another internship after my sophomore year. The parts that were most interesting to me about that work were the parts dealing with speech audio. By the time of my third and fourth internships, after my junior and senior years, I was working on low-bit-rate automatic speech coding.
What research accomplishment are you most proud of in your time here?
I am most proud of the work our graduate students accomplish. For example, a pair of students working with me, Xiaodan Zhuang and Xi Zhou, had far and away the best performance in the 2007-2008 Acoustic Event Detection task, and they did that by chaining together different systems. They took one piece from speech recognition technology and another piece from computer vision. They combined them in a way that allowed each system to solve the problems of the previous system. The ability to put these different technologies together, so that each one complements the flaws of the previous one, was outstanding.
Which of the awards/honors that you have received has meant the most to you?
It was quite meaningful to me to receive the Dean’s Award for Excellence in Research. The year I received tenure, I also received this award from the Dean of the College of Engineering for my publication and research contribution record. It was meaningful to me because it was a recognition from this college of my accomplishments.
What research are you currently focused on?
A lot of my work right now is focused on the idea of vocabulary, actually. The way speech recognition works right now is that you tell the device which words might occur next, and then it does a one-event classification over that list. So if you tell it that a thousand words could come next, it does quite well; it has about a 1 percent error rate. But if it has to choose among 60,000 words that could come next, you might see a 60 percent error rate. In any speech recognition test, you propose the list of words that could happen next. If you get the wrong list, the error can propagate through the whole sentence and make everything impossible.
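The effect described above can be illustrated with a toy simulation (this is my sketch, not any system from the interview): give each candidate word a noisy acoustic score, with the true word's score centered higher, and pick the highest-scoring candidate. As the candidate list grows, the chance that some wrong word out-scores the true word grows, so the error rate rises with vocabulary size. The vocabulary sizes and noise level here are arbitrary toy values, not the 1,000/60,000-word figures from the answer.

```python
import random

def simulate_error_rate(vocab_size, trials=200, sigma=0.25, seed=0):
    """Toy model: the true word's score ~ N(1, sigma); each of the
    (vocab_size - 1) wrong words scores ~ N(0, sigma).  An error occurs
    whenever the best wrong word out-scores the true word."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        true_score = rng.gauss(1.0, sigma)
        best_wrong = max(rng.gauss(0.0, sigma)
                         for _ in range(vocab_size - 1))
        if best_wrong > true_score:
            errors += 1
    return errors / trials

# Error rate climbs as the candidate list gets longer.
small = simulate_error_rate(100)
large = simulate_error_rate(10_000)
print(f"error rate with a 100-word list:    {small:.2f}")
print(f"error rate with a 10,000-word list: {large:.2f}")
```

The point of the sketch is only the qualitative trend: with more competing candidates, the maximum of the wrong-word scores creeps upward, so a fixed gap between the true word and the distractors buys less and less accuracy.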
What do you enjoy the most about teaching classes?
What I enjoy most is when I discover that I don’t know something as well as I thought I knew it. I go back and investigate. It is especially enjoyable when I discover that nobody knows a topic as well as I thought they did, which has often happened. I will discover that there is an area that is covered in every standard textbook, and every standard textbook quotes the same source. I’ll learn that this source actually didn’t say what everybody thinks it said, or that it wasn’t as precise as everybody thinks it was. That presents a research opportunity. In a couple of cases, I have publications in which I go back and fill in the gaps of what everybody thought they knew.
What applications do you believe speech recognition technology will have in the future?
One new idea that has developed in the last couple of years is ubiquitous computing. One can ask if this concept was in the original “Star Trek,” but I don’t think so. I believe it’s new. The idea is that a human-computer interface is a microdot that you put on the wall, part of the building material or your clothing. Everything that you wear or everything that you walk through can be, if you want it to be, connected to the Internet. If you want information, you should be able to get it at any time, just by asking for it. I shouldn’t have to press a button on my cellphone. I should just be able to say, “Ok Google, who is the Ikenberry Commons named after?” My avatar should be able to come back and tell me it’s named after Stan Ikenberry. I think this progression is already under way. All of my files are on a couple of different companies’ cloud servers. Every proposal I work on or every paper I write is available to me on any of my devices. It’s happening now. We have user interfaces that look like a cellphone or like a pair of glasses, but now people are working on user interfaces that look like a shirt. The technology is progressing faster than most people think.