TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

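ㆍ Author: Song, Min
ㆍ Publication: Journal of Information Science Theory and Practice (JISTaP)
ㆍ Volume/Issue: 2014 | Vol. 2, No. 1 | pp. 6-21 (16 pages)
ㆍ Publisher: Korea Institute of Science and Technology Information, Information Service Center
ㆍ File information: Periodical | ENG | PDF text
ㆍ Subject area: Other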

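The full text of this paper is provided free of charge through an article-linking arrangement with the Korea Institute of Science and Technology Information.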

Abstract (other language)

This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). TAKES consists of two major components: DocSpotter, which queries and retrieves promising documents for extraction using a keyphrase extraction-based query expansion technique, and FCRF, a Conditional Random Field (CRF)-based machine learning component that extracts important biological entities and relations. TAKES is applied to biological knowledge extraction, specifically to retrieving documents that contain Protein-Protein Interactions (PPI) and extracting PPI pairs from them. The paper investigates research problems that arise in building a knowledge extraction system and reports a series of experiments conducted to test the author's hypotheses. The findings are as follows. First, experiments on three different test collections verify that DocSpotter's query expansion technique is robust and highly accurate compared to Okapi BM25 and SLIPPER. Second, the relation extraction algorithm, FCRF, is highly accurate in terms of F-measure compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.
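
The abstract compares extraction algorithms by F-measure. As a reading aid, the following is a minimal sketch of how precision, recall, and F-measure could be computed over extracted PPI pairs. The function name `f_measure`, the treatment of pairs as unordered, and the example protein names are illustrative assumptions, not details taken from the paper.

```python
def f_measure(predicted, gold, beta=1.0):
    """Return (precision, recall, F_beta) for two collections of protein pairs.

    Assumption: an interaction (A, B) is the same as (B, A), so each pair
    is stored as a frozenset before comparison.
    """
    pred = {frozenset(pair) for pair in predicted}
    ref = {frozenset(pair) for pair in gold}

    true_pos = len(pred & ref)  # pairs that were extracted and are correct
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical example data: three extracted pairs, four gold-standard pairs.
predicted_pairs = [("RAD51", "BRCA2"), ("TP53", "MDM2"), ("AKT1", "EGFR")]
gold_pairs = [("BRCA2", "RAD51"), ("TP53", "MDM2"), ("MYC", "MAX"), ("CDK2", "CCNE1")]

precision, recall, f1 = f_measure(predicted_pairs, gold_pairs)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

With `beta=1.0` the score is the harmonic mean of precision and recall; whether the paper's evaluation uses this balanced form is not stated in the abstract.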

Keywords

Semantic Query Expansion; Information Extraction; Information Retrieval; Text Mining
