Harmonic Structure Features for Robust Speaker Diarization


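ㆍ Authors: Zhou, Yu; Suo, Hongbin; Li, Junfeng; Yan, Yonghong
ㆍ Journal: ETRI Journal
ㆍ Volume/issue: 2012, Vol. 34, No. 4, pp. 583-590 (8 pages)
ㆍ Publisher: Electronics and Telecommunications Research Institute (ETRI)
ㆍ File information: Periodical | ENG | PDF text
ㆍ Subject area: Other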

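The full text of this paper is provided free of charge through a paper-linking arrangement with the Korea Institute of Science and Technology Information.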


Abstract (other language)

In this paper, we present a new approach to speaker diarization. First, we use prosodic information computed from the original speech to resynthesize new speech data with a spectrum modeling technique. The resynthesized data is modeled with sinusoids based on pitch, vibration amplitude, and phase bias. Then, we extract cepstral features from the resynthesized speech and integrate them with the cepstral features from the original speech for speaker diarization. Finally, we show how the two streams of cepstral features can be combined to improve the robustness of speaker diarization. Experiments carried out on a standardized dataset (the US National Institute of Standards and Technology Rich Transcription 04-S, multiple distant microphone conditions) show a significant improvement in diarization error rate compared to a system based only on the feature stream from the original speech.
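
The abstract gives no equations, so the following is only a rough sketch of the two ideas it names: sinusoidal resynthesis from pitch, amplitude, and phase bias, and the combination of two feature streams. It is not the authors' implementation. The function names, the zero-order hold between frames, the 1/k harmonic amplitude roll-off, and the weighted log-likelihood fusion with a fixed weight are all assumptions made for illustration.

```python
import numpy as np

def resynthesize_harmonic(f0, amp, sr=16000, hop=160, n_harmonics=5, phase_bias=0.0):
    """Sum-of-sinusoids resynthesis from frame-level pitch (Hz) and amplitude.

    Hypothetical helper: frames with f0 == 0 are treated as unvoiced and
    produce silence. The 1/k harmonic roll-off is an assumption.
    """
    f0 = np.asarray(f0, dtype=float)
    amp = np.asarray(amp, dtype=float)
    # Hold each frame value for `hop` samples (simple zero-order hold).
    f0_s = np.repeat(f0, hop)
    amp_s = np.repeat(amp, hop) * (f0_s > 0)
    # Integrate the instantaneous frequency to get the fundamental phase.
    phase = 2.0 * np.pi * np.cumsum(f0_s) / sr
    out = np.zeros(len(f0_s))
    for k in range(1, n_harmonics + 1):
        # Skip harmonics at or above the Nyquist frequency to avoid aliasing.
        below_nyquist = (k * f0_s) < (sr / 2)
        out += below_nyquist * (amp_s / k) * np.sin(k * phase + phase_bias)
    return out

def combine_stream_scores(ll_original, ll_resynth, weight=0.9):
    """Weighted sum of per-stream log-likelihoods (assumed fusion rule)."""
    ll_original = np.asarray(ll_original, dtype=float)
    ll_resynth = np.asarray(ll_resynth, dtype=float)
    return weight * ll_original + (1.0 - weight) * ll_resynth

if __name__ == "__main__":
    # One second of a steady 100 Hz voiced sound at 16 kHz (100 frames of 10 ms).
    y = resynthesize_harmonic([100.0] * 100, [1.0] * 100)
    print(y.shape)
    print(combine_stream_scores([-10.0, -12.0], [-11.0, -9.0], weight=0.5))
```

In a full system, cepstral features would be extracted from both the original waveform and the output of `resynthesize_harmonic`, and the per-cluster scores of the two streams would be fused before clustering decisions are made. The weight would normally be tuned on development data.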

Keywords

Speaker diarization, speech resynthesis, resynthesized speech, cepstral features
