Imre Kiss
Principal Scientist, Interaction in Smart Environments
Background
I joined Nokia Research Center (NRC) in July 1996. I started in the Speech and Audio Systems Laboratory as a research engineer working on distributed speech recognition. In parallel with my work I pursued post-graduate studies at Tampere University of Technology, Department of Information Technology, where I received my Doctor of Technology degree in 2001. During my years at NRC I have worked in various technical and managerial positions: research engineer, senior research engineer, research manager and, most recently, principal scientist. My academic background is in telecommunications and signal processing; however, most of my active scientific career has centered on various aspects of speech processing. I am passionate about technology and like solving challenging scientific and engineering problems.
Research Interests
- Pattern recognition, statistical models, natural language processing
- Automatic Speech Recognition (feature extraction, acoustic and language modeling, decoding, low-footprint and low-complexity implementations, language resource creation)
- Text-to-Speech synthesis (concatenative TTS, acoustic synthesis, prosody modeling, compression)
- Context- and content-adaptive intelligent user interfaces
- Speech-to-Speech translation
- New paradigms for modeling intelligence
Research Projects
Publicly funded projects
- I was actively involved in the SpeechDatCar, SPEECON and LC-STAR projects, which collected acoustic and textual language resources for a wide variety of languages. We actively use the results of these projects in our everyday research on statistical modeling of speech and language.
- SEPEMCO (Search of Personal Multimedia Content) - The project was funded by TEKES, the Finnish Funding Agency for Technology and Innovation. With our university partners we built a content-based search application for multimedia content stored on mobile devices.
- TC-STAR (Technology and Corpora for Speech-to-Speech Translation) - This European Union-funded project had the ambitious objective of making a breakthrough in Speech-to-Speech Translation (SST) research to significantly reduce the gap between human and machine performance. Nokia's main contributions were in embedded, noise-robust ASR, concatenative TTS and voice conversion. More information about the project can be found at www.tc-star.org.
Internal
(Disclaimer: the list of internal projects and the presented information are not complete. Check for relevant publications here.)
- Embedded Mobile Dictation - Our team developed a very compact large-vocabulary, isolated-word dictation system for embedded devices. The system has a low footprint, low complexity and high accuracy. It is currently available for several European and American languages and also for Mandarin Chinese.
- Voice Conversion - We developed a lightweight voice conversion system built around a very low bitrate parametric speech codec. Among other uses, the system makes it possible to alter the voice identity of a TTS system without the need for a large and expensive acoustic database.
- Concatenative Text-to-Speech Synthesis - Most of our activities in this area concentrated on embedded Mandarin and English TTS. Various aspects of prosody modeling, acoustic synthesis and compression were addressed.
- Speaker-Independent Name Dialing - Embedded multilingual speech recognition has been a long-standing research topic in our team. The results of this research, coupled with a significant amount of development and implementation work by our colleagues, have led to the wide-scale commercial deployment of the technology.
- Content-adaptive UI - The use of natural language in everyday mobile communication is pervasive. Making phone calls and sending SMS messages, MMS messages or emails all involve language, whether it is our mother tongue or a second or third language. In this project we are looking into various ways of effectively using information from natural language communication to improve the user interface and user experience of devices.
University Cooperation
- Simone and Start Mobile projects with MIT CSAIL lab
- VoiceUI with Tampere University of Technology
Other Information
Professional Society Memberships and Activities
- Member of IEEE (1996-), IEEE/CS Society, and ISCA (International Speech Communication Association, 1999-)
- Acted as reviewer for IEEE Transactions on Audio, Speech and Language Processing; Pattern Recognition Letters and ETRI Journal
- Panelist at ACM MobileHCI on "Speech in Mobile and Pervasive Environments" (SiMPE), Sept 2006, Helsinki, Finland.
Education
- 2003 Spring - Visiting Researcher at MIT CSAIL laboratory, Cambridge, MA, USA, working on low-footprint implementation of Finite-State Transducer networks for embedded applications
- 2001 - Doctor of Technology, Tampere University of Technology (TUT), Finland. Doctoral dissertation on automatic speech recognition in mobile communication networks
- 1996 - M.Sc. in Electrical Engineering (with honors), Technical University of Budapest, Hungary. Majoring in telecommunications and signal processing. Master's thesis on voice coding.
Publications
I have authored or co-authored one book chapter and more than 25 conference and journal papers on various aspects of automatic speech recognition, text-to-speech synthesis, voice conversion, language resources, and their use in embedded applications. Click here for a complete list of publications.
Patents
- 2 granted patents
- 8 pending patent applications