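- Set Covering-Based Feature Selection for Large-Scale Omics Data (Set Covering 기반의 대용량 오믹스데이터 특징변수 추출기법)
- ㆍ Authors
- Zhengyu Ma (마정우), Kedong Yan (안기동), Kwangsoo Kim (김광수), Hong Seo Ryoo (류홍서)
- ㆍ Journal
- Journal of the Korean Operations Research and Management Science Society (韓國經營科學會誌)
- ㆍ Volume/Issue
- 2014 | Vol. 39, No. 4 | pp. 75-84 (10 pages)
- ㆍ Publisher
- The Korean Operations Research and Management Science Society (한국경영과학회)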
- ㆍ Subject area
- Other
In this paper, we address the feature selection problem for large-scale, high-dimensional biological data such as omics data. Most previous approaches to this problem use a simple score function to filter the original variables and then select features from the small number of variables that remain. Methods that do not rely on filtering either ignore the interactions between variables or produce approximate solutions to a simplified problem. In contrast, by combining set covering and clustering techniques, we develop a new method that handles the full set of variables and accounts for their combinatorial effects when selecting features. To demonstrate the efficacy and effectiveness of the method, we downloaded gene expression datasets from TCGA (The Cancer Genome Atlas) and compared our method with other algorithms, including the feature selection algorithms embedded in WEKA. The experimental results show that our method selects high-quality features that yield more accurate classifiers than those built with other feature selection algorithms.
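To make the set covering view of feature selection concrete, the following is a minimal sketch and not the method of the paper. It assumes a common formulation in which a feature "covers" a pair of samples from opposite classes when its values for the two samples differ by at least a threshold, and it uses the standard greedy heuristic for set covering. The function name `greedy_set_cover_features`, the `min_gap` parameter, and the toy data are hypothetical, and the clustering step mentioned in the abstract is omitted.

```python
from itertools import product


def greedy_set_cover_features(X, y, min_gap=1.0):
    """Greedy set covering sketch (illustrative assumption, not the paper's method).

    A feature covers a (positive, negative) sample pair when the two samples
    differ on that feature by at least `min_gap`. Features are added greedily
    until every coverable pair is covered.
    """
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    pairs = list(product(pos, neg))
    n_features = len(X[0])

    # covers[j] holds the indices of the pairs that feature j separates.
    covers = [
        {k for k, (p, q) in enumerate(pairs) if abs(X[p][j] - X[q][j]) >= min_gap}
        for j in range(n_features)
    ]

    uncovered = set(range(len(pairs)))
    selected = []
    while uncovered:
        # Pick the feature that separates the most still-uncovered pairs.
        best = max(range(n_features), key=lambda j: len(covers[j] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            break  # no feature separates the remaining pairs
        selected.append(best)
        uncovered -= gained
    return selected, uncovered


# Toy data: 4 samples x 4 features; features 0 and 3 are near-constant noise.
X = [
    [0.1, 0.0, 1.0, 2.0],  # class 1
    [0.2, 2.0, 3.0, 2.1],  # class 1
    [0.0, 4.0, 1.2, 2.0],  # class 0
    [0.3, 1.5, 0.5, 2.2],  # class 0
]
y = [1, 1, 0, 0]

selected, uncovered = greedy_set_cover_features(X, y)
print(selected)  # -> [1, 2]
```

In this toy case feature 1 separates three of the four opposite-class pairs and feature 2 separates the remaining one, so the two noise features are never chosen. The greedy heuristic does not guarantee a minimum-size cover; an exact formulation would solve the corresponding integer program instead.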