- Multiple Testing in Genomic Sequences Using Hamming Distance
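- Author: Kang, Moonsu
- Journal: Communications of the Korean Statistical Society (한국통계학회 논문집)
- Volume/Issue: 2012, Vol. 19, No. 6, pp. 899-904 (6 pages)
- Publisher: The Korean Statistical Society (한국통계학회)
- File information: Periodical | ENG | PDF text
- Subject area: Other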
High-dimensional categorical data models with small sample sizes have not been used extensively for genomic sequences that involve count (or discrete) or purely qualitative responses. A basic task is to identify differentially expressed genes (or positions) among a large number of genes. This requires an appropriate test statistic and a corresponding multiple testing procedure, because a multivariate analysis of variance is not feasible in this setting. The family-wise error rate (FWER) is not an appropriate criterion when thousands of genes are tested simultaneously; the false discovery rate (FDR) is better suited to such multiple testing problems. Data from the 2002-2003 SARS epidemic show that a conventional FDR procedure combined with the proposed test statistic, which is based on a pseudo-marginal approach with Hamming distance, performs better.
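
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not define the pseudo-marginal test statistic, so that part is not reproduced here. The sketch also assumes that the "conventional FDR procedure" refers to the Benjamini-Hochberg step-up rule. The function names `hamming_distance` and `benjamini_hochberg` and the level `q` are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis under the BH step-up rule.

    Assumption: this stands in for the 'conventional FDR procedure';
    the abstract does not name the procedure it uses.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject
```

For example, `hamming_distance("ACGT", "ACGA")` counts one differing position, and `benjamini_hochberg([0.001, 0.8, 0.02, 0.04], q=0.05)` rejects the first and third hypotheses only. In an analysis along the lines of the abstract, the p-values would come from position-wise tests built on the Hamming distance, which this sketch leaves unspecified.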