음성인식에서 문맥의존 음향모델의 성능향상을 위한 유사음소단위에 관한 연구

음성인식에서 문맥의존 음향모델의 성능향상을 위한 유사음소단위에 관한 연구

ㆍ 저자명: 임영춘,오세진,김광동,노덕규,송민규,정현열
ㆍ 간행물명: 한국음향학회지= The journal of the acoustical society of Korea
ㆍ 권/호정보: 2003년|22권 5호|pp.388-402 (15 pages)
ㆍ 발행정보: 한국음향학회
ㆍ 파일정보: 정기간행물|
PDF텍스트
ㆍ 주제분야: 기타

이 논문은 한국과학기술정보연구원과 논문 연계를 통해 무료로 제공되는 원문입니다.

서지반출

기타언어초록

In this paper, we carried out the word, 4 continuous digits. continuous, and task-independent word recognition experiments to verify the effectiveness of the re-defined phoneme-likely units (PLUs) for the phonetic decision tree based HM-Net (Hidden Markov Network) context-dependent (CD) acoustic modeling in Korean appropriately. In case of the 48 PLUs, the phonemes /ㅂ/, /ㄷ/, /ㄱ/ are separated by initial sound, medial vowel, final consonant, and the consonants /ㄹ/, /ㅈ/, /ㅎ/ are also separated by initial sound, final consonant according to the position of syllable, word, and sentence, respectively. In this paper. therefore, we re-define the 39 PLUs by unifying the one phoneme in the separated initial sound, medial vowel, and final consonant of the 48 PLUs to construct the CD acoustic models effectively. Through the experimental results using the re-defined 39 PLUs, in word recognition experiments with the context-independent (CI) acoustic models, the 48 PLUs has an average of 7.06%, higher recognition accuracy than the 39 PLUs used. But in the speaker-independent word recognition experiments with the CD acoustic models, the 39 PLUs has an average of 0.61% better recognition accuracy than the 48 PLUs used. In the 4 continuous digits recognition experiments with the liaison phenomena. the 39 PLUs has also an average of 6.55% higher recognition accuracy. And then, in continuous speech recognition experiments, the 39 PLUs has an average of 15.08% better recognition accuracy than the 48 PLUs used too. Finally, though the 48, 39 PLUs have the lower recognition accuracy, the 39 PLUs has an average of 1.17% higher recognition characteristic than the 48 PLUs used in the task-independent word recognition experiments according to the unknown contextual factor. Through the above experiments, we verified the effectiveness of the re-defined 39 PLUs compared to the 48PLUs to construct the CD acoustic models in this paper.

키워드

39유사음소단위 PDT-SSS알고리즘 문맥의존 음향모델 48 HM-Net 48 39 phoneme likely units HM-net (hidden Markov Network)PDT-SSS algorithm Context dependent acoustic models

다운URL