Guohong Ding
Member of research staff
NRC Beijing
Professional Activities
Reviewer for IEEE Transactions on Audio, Speech, and Language Processing
Research Interests
My interests span several topics in speech recognition, including noise-robust speech recognition, pronunciation generation, language modeling, acoustic modeling, accented speech recognition, and the fusion of different modalities (multimodal input). I was one of the main contributors to Nokia's large-vocabulary continuous speech recognition system.
Research Projects
1. Speaker-independent name dialing
Letter-to-phoneme conversion for several Asian languages, e.g. Vietnamese, Thai, and Japanese
Optimization of decision-tree-based pronunciation generation
2. Discontinuous dictation for Mandarin
Recognition engine designed specifically for Mandarin: regular pronunciation represented by Pinyin for each Chinese character, no distinct grammar, and no clear word boundaries
Language modeling
3. Continuous dictation
Worked on acoustic modeling and language modeling, collaborating with Jesper Olsen, who was responsible for the continuous dictation engine, and with others
On Mandarin, with a low footprint (~6 MB), results are 86.13% and 87.04% in speaker-independent mode and gender-dependent mode, respectively
On US English, with a low footprint (~6 MB), results are 84.08% and 85.26% in speaker-independent mode and gender-dependent mode, respectively
4. Multimodal input
Fusion of handwriting recognition and speech recognition, along with other modalities, e.g. Pinyin input and stroke input
Personal Information
I received the B.Eng. and M.Eng. degrees from Northwestern Polytechnical University in 1998 and 2001, respectively, both in automatic control, and the Ph.D. degree in speech recognition from the Institute of Automation, Chinese Academy of Sciences, in 2004.
Publications
[1] G.-H. Ding, Phonetic Confusion Analysis and Robust Phone Set Generation for Shanghai-Accented Mandarin Speech Recognition, Interspeech, 149-152, 2008
[2] G.-H. Ding, Maximum a Posteriori Noise Log-Spectral Estimation Based on First-Order Vector Taylor Series Expansion, IEEE Signal Processing Letters, 15: 158-161, 2008
[3] J. Olsen, Y. Cao, G. Ding, X. Yang, A Decoder for Large Vocabulary Continuous Short Message Dictation on Embedded Devices, IEEE International Conference on Acoustics, Speech and Signal Processing, 4337-4340, 2008
[4] J. Alhonen, Y. Cao, G. Ding, Y. Liu, J. Olsen, X. Wang, X. Yang, Mandarin Short Message Dictation on Symbian Series 60 Mobile Phones, International Conference on Mobile Technology, Applications, and Systems, 431-438, 2007
[5] Y. Tang, W.-J. Liu, H. Zhang, B. Xu, G.-H. Ding, One-Pass Coarse-to-Fine Segmental Speech Decoding Algorithm, IEEE International Conference on Acoustics, Speech and Signal Processing, 441-444, 2006
[6] G.-H. Ding, X. Wang, Y. Cao, F. Ding, Y. Tang, Sequential Noise Estimation for Noise-Robust Speech Recognition based on 1st-Order VTS Approximation, IEEE Workshop on Automatic Speech Recognition and Understanding, 363-368, 2005
[7] G.-H. Ding, X. Wang, Y. Cao, F. Ding, Y. Tang, Speech Enhancement based on Speech Spectral Complex Gaussian Mixture Model, IEEE International Conference on Acoustics, Speech and Signal Processing, 165-168, 2005
[8] G.-H. Ding, B. Xu, X. Wang, Y. Cao, F. Ding, Y. Tang, Task-Specific Adaptation in Chinese Name Recognition, International Symposium on Chinese Spoken Language Processing, 261-264, 2004
[9] G.-H. Ding, B. Xu, Exploring High-Performance Speech Recognition in Noisy Environments using High-Order Taylor Series Expansion, International Conference on Spoken Language Processing, 149-152, 2004
[10] G.-H. Ding, T. Huang, B. Xu, Suppression of Additive Noise Using A Power Spectral Density MMSE Estimator, IEEE Signal Processing Letters, 11(6): 585-588, 2004
[11] G.-H. Ding, B. Xu, Fast Speaker Adaptation based on Triple-Diagonal and Shared Block-Diagonal Transform Matrices, Chinese Electronic Journal (In Chinese), 32(10): 1709-1712, 2004
[12] G.-H. Ding, B. Xu, J. Iso-Sipilä, Y. Cao, Transform-Based Fast Speaker Adaptation Using Triple Diagonal and Shared Block Diagonal Matrices, IEEE International Conference on Acoustics, Speech and Signal Processing, 300-303, 2003
[13] G.-H. Ding, Y.-F. Zhu, C. Li, B. Xu, Implementing Vocal Tract Length Normalization in the MLLR Framework, International Conference on Spoken Language Processing, 1389-1392, 2002
[14] G.-H. Ding, C. Li, B. Xu, Comparison of MLLR and CDCN for Speech Recognition in Additive Noise by Experiments, International Symposium on Chinese Spoken Language Processing, 103-106, 2002
[15] G.-H. Ding, C. Li, B. Xu, Implementation of Independent-Speaker Isolated Word Recognition System on Fixed DSP, National Conference of Man-Machine Speech Communication (In Chinese), 371-374, 2001
Patents
G.-H. Ding, X. Wang, Y. Cao, F. Ding, Y. Tang, High quality Thai text-to-phoneme converter, US20060259301