Dependency Parsing and Dependency Label Annotation Using CRFs (CRFs를 이용한 의존구조 분석 및 의존 관계명 부착)

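ㆍ Authors: Choi Maengsik (최맹식), Jeong Seokwon (정석원), Kim Harksoo (김학수)
ㆍ Journal: Journal of KIISE: Software and Applications (정보과학회논문지: 소프트웨어 및 응용)
ㆍ Volume/Issue: 2014 | Vol. 41, No. 4 | pp. 302-308 (7 pages)
ㆍ Publisher: Korean Institute of Information Scientists and Engineers (한국정보과학회)
ㆍ File information: Periodical | PDF text
ㆍ Subject area: Other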

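The full text of this paper is provided free of charge through an article-linking arrangement with the Korea Institute of Science and Technology Information (KISTI).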

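Abstract (translated from Korean)

Dependency parsing is widely used to analyze the structure of Korean sentences. Most dependency parsing methods report only whether a dependency relation exists between eojeols, and do not provide information such as subject, object, or modifier. This paper proposes a model that performs dependency parsing and dependency label annotation simultaneously. The proposed method attaches a tag combining the dependency structure and the dependency label to each eojeol in a sentence, using a cascade chunking method based on CRFs (Conditional Random Fields). In 10-fold cross-validation experiments on the Sejong syntactic parsing corpus, the integrated model (precision of 81.11%) outperformed a two-step model that performs dependency parsing and then dependency label annotation.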

Abstract (English)

In Korean, dependency parsing is frequently used to analyze the syntactic structure of sentences. Most previous dependency parsing methods return only whether dependency relations exist between eojeols (the spacing units of Korean). They do not return the names of the dependency relations, such as subject, object, or modifier. In this paper, we propose an integrated dependency parsing model that finds dependency relations and annotates them with dependency labels at the same time. The proposed model annotates each eojeol in a sentence with a tag that combines a dependency relation and a dependency label, using a cascade chunking method based on conditional random fields (CRFs). In 10-fold cross-validation experiments on the Sejong syntactic parsing corpus, the integrated model achieved better performance (accuracy of 81.11%) than the previous two-step model, which annotates dependency labels after finding dependency relations.
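The cascade idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the tag set, the features, or the exact cascade rules, so everything below is an assumption made for illustration. In particular, the `D-<label>` / `O` tag format, the label names `SBJ`, `OBJ`, and `MOD`, the rule that every `D`-tagged eojeol attaches to the next remaining eojeol, and the rule-based `toy_tagger` (standing in for a trained CRF) are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A tagger assigns one tag to each eojeol that still lacks a head.
# Assumed tag format (hypothetical, not taken from the paper):
#   "D-<label>": the eojeol depends on the next remaining eojeol with <label>
#   "O":         no attachment decided in this pass
Tagger = Callable[[List[str]], List[str]]


def cascade_parse(
    eojeols: List[str], tagger: Tagger, max_passes: int = 50
) -> List[Tuple[Optional[int], Optional[str]]]:
    """Repeat tagging passes until a single (root) eojeol remains."""
    heads: List[Optional[int]] = [None] * len(eojeols)
    labels: List[Optional[str]] = [None] * len(eojeols)
    remaining = list(range(len(eojeols)))  # indices still waiting for a head

    for _ in range(max_passes):
        if len(remaining) <= 1:
            break
        tags = tagger([eojeols[i] for i in remaining])
        survivors = []
        attached = False
        for pos, idx in enumerate(remaining):
            is_last = pos == len(remaining) - 1
            if tags[pos].startswith("D-") and not is_last:
                # One combined tag yields both the head and the relation label.
                heads[idx] = remaining[pos + 1]
                labels[idx] = tags[pos][2:]
                attached = True
            else:
                survivors.append(idx)
        if not attached:  # no progress in this pass: stop instead of looping
            break
        remaining = survivors
    return list(zip(heads, labels))


def toy_tagger(window: List[str]) -> List[str]:
    """Rule-based stand-in for a trained CRF (toy suffix rules only)."""
    tags = []
    for pos, word in enumerate(window):
        nxt = window[pos + 1] if pos + 1 < len(window) else ""
        if word.endswith("는") and nxt.endswith("다"):
            tags.append("D-SBJ")
        elif word.endswith("를") and nxt.endswith("다"):
            tags.append("D-OBJ")
        elif word.endswith("간") and nxt.endswith("를"):
            tags.append("D-MOD")
        else:
            tags.append("O")
    return tags


result = cascade_parse(["나는", "빨간", "사과를", "먹었다"], toy_tagger)
print(result)  # [(3, 'SBJ'), (2, 'MOD'), (3, 'OBJ'), (None, None)]
```

In the first pass the toy tagger attaches "빨간" to "사과를" and "사과를" to "먹었다"; "나는" stays untagged because its neighbor is not yet the predicate. Once the attached eojeols are removed, the second pass sees "나는" next to "먹었다" and attaches it as the subject. Because each tag carries both the attachment decision and the label, no separate labeling step is needed, which is the contrast with a two-step model that the abstract draws.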

Keywords

dependency parsing, dependency label annotation, dependency label, cascade chunking, CRFs
