What sparked your interest in speech and language processing?
The award-winning Japanese-American poet, James Mitsui, taught poetry at my high school for a few years. He taught us about the ways in which you can put multiple meanings into the same line, so that each reader hears what they want to hear. He taught us about how you can hide rhythms and rhymes in plain sight, so that they perk up the poem, or queer it with leery uncertain allusions. Heady stuff, for a seventeen-year-old. I continued to write and read poetry on and off for a long time – well, I still read poetry, anyway – but that never seemed relevant to my career choices until I made it to graduate school.
My interest in speech came from my internship mentors. Mike McLaughlin and Mark Jasiuk, at Motorola, taught me that speech processing is about finding new ways in which you can decompose the signal. Tomohiko Taniguchi and Fumio Amano, at Fujitsu, taught me that speech processing is about understanding the origins of the information communicated in the speech signal – which particular movements of the vocal tract produce each of those sounds, and why. Those two perspectives have really shaped my approach since then.
How do you think speech and language processing is changing society for the new generation?
Well, it depends on which generation you call “new.” When I grew up, if I wanted to call a friend, I had to go out to the kitchen, and use the telephone right in the middle of the kitchen, with my mom making dinner and my brother and sister wandering in and out. I remember the first time I saw a teenager roller-blading past while talking on his cell phone. It was 1999, I was sitting outside a bagel store in the National Sepulveda shopping center in LA, and it suddenly hit me: oh, is that what speech coding is for!
Other than that huge breakthrough, though, I’d say that speech has failed to live up to its promise. Spoken language interfaces – having your e-mail read to you while you drive, for example – work OK, if you’re willing to slow down to deal with the required call-and-response style dialog turn confirmations. Human-to-human conversation is much faster, because humans rely on subtle cues for turn-taking, and they are able to complete one another’s thoughts by mining the common ground.
What is your holy grail in speech and language processing? When will we achieve it?
The digital divide. I have a smart speaker, and a cell phone that understands me. But I can read and write, so I don't really need those things. Meanwhile, there are 3,000 languages in the world without any writing system. The divide between written and unwritten languages is closely connected, I think, to the divide between rich and poor, and between urban and rural. The quest for speech technology in unwritten languages is the quest for robustness. If a father borrows a cell phone to leave an emergency message with the nearest nurse midwife, things that stop him might include the lack of an automatic call routing agent that works in his language, or his slurred speech, or the sound of the rain on the roof, or the lack of a cell phone tower in the valley where he lives. In a curious sort of way, speech and language technology needs to become part of the standard infrastructure that countries provide to their citizens.
Do you have any specific advice for students, junior faculty or others early in their careers?
Learn as much math as you can. Never stop learning. If a paper is hard for you to understand, but somebody you trust has found something useful in it, then allocate a few hours to breaking down some of the equations.
Don't fake it. If you don't understand something, ask. And then go read the paper afterward.
Being permitted to help people is the greatest honor that our society can bestow. Try to take ideas that were previously impossible, and make them possible, with sufficient quality that they can actually be useful to somebody.
What development in the field has most surprised you? Was there a hard problem that turned out to be easy? An easy problem that proved surprisingly difficult?
I'm surprised every day by problems that are harder than I thought they were, and I'm surprised at least once or twice a year by the ability of smart people to solve them anyway. I told people quite confidently that telephone-band ASR error rates were too high for commercial applications, right up until the day that the first voice agent was announced. Same thing for far-field speech recognition (smart speakers) and speech-to-speech translation.
Oftentimes, the work that people get noticed for is not the work which they find the most exciting/rewarding/interesting. Which of your publications is your favorite? Why?
I don't have a copy of it any more, but when I was in graduate school, I wrote a paper analyzing the recorded poetry of Robert Frost. I found a 33 rpm record in the university library, containing recordings made by Frost when he was alive. I had only heard "Stopping by Woods" read aloud in a rather different context: the movie Telefon (1977) uses that poem as a kind of crypto-key, and therefore it's always read in a sort of breathy sing-song half-whisper. When Frost reads it, though, he reads it exactly as if he were reading a letter out loud to an elderly relative. The F0 difference between stressed and unstressed syllables is less than you would find in ordinary conversation. As nearly as I could figure, Frost thought that all of the magic should be in the writing, and that speech should be as deadpan as possible.
© Copyright 2020 IEEE – All rights reserved. Use of this website signifies your agreement to the IEEE Terms and Conditions.