Default Prediction for Real Estate Companies with Imbalanced Dataset

ㆍ Authors: Dong, Yuan-Xiang; Xiao, Zhi; Xiao, Xue
ㆍ Journal: Journal of Information Processing Systems
ㆍ Volume/Issue: 2014 | Vol. 10, No. 2 | pp. 314-333 (20 pages)
ㆍ Publisher: Korea Information Processing Society
ㆍ File information: Periodical | ENG | PDF text
ㆍ Subject area: Other

The full text of this paper is provided free of charge through a link with the Korea Institute of Science and Technology Information.

Abstract

In default prediction for real estate companies, non-defaulted cases greatly outnumber defaulted ones, which creates a two-class imbalance problem. This imbalance lowers the ability of prediction models to identify the default samples. To avoid this sample selection bias and to improve the prediction models, this paper applies a minority-sample generation approach to create new minority samples. Logistic regression, support vector machine (SVM) classification, and neural network (NN) classification trained on the imbalanced dataset serve as benchmarks for the same single prediction models trained on a balanced dataset corrected by the minority-sample generation approach. Instead of prediction-oriented tests and overall accuracy, the true positive rate (TPR), the true negative rate (TNR), G-mean, and F-score are used to measure the performance of default prediction models on the imbalanced dataset. The paper describes an empirical experiment on a sample of 14 defaulted and 315 non-defaulted listed real estate companies in China and reports that most single prediction models performed better with the balanced dataset than with the imbalanced one.
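
To illustrate why the abstract prefers TPR, TNR, G-mean, and F-score over overall accuracy, the following is a minimal sketch rather than the authors' implementation. It assumes the common definitions G-mean = sqrt(TPR × TNR) and F-score = F1 (the harmonic mean of precision and TPR); the paper may use a different F-score weighting. The function name `imbalance_metrics` and the toy predictions are hypothetical; only the class counts (14 defaulted, 315 non-defaulted) come from the abstract.

```python
import math

def imbalance_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, TPR, TNR, G-mean and F1 for a binary problem.

    Assumption: the defaulted class is the positive class (label 1).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    tpr = tp / (tp + fn) if (tp + fn) else 0.0        # recall on defaults
    tnr = tn / (tn + fp) if (tn + fp) else 0.0        # recall on non-defaults
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    g_mean = math.sqrt(tpr * tnr)
    f_score = (2 * precision * tpr / (precision + tpr)
               if (precision + tpr) else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "tpr": tpr, "tnr": tnr,
            "g_mean": g_mean, "f_score": f_score}

# Class counts from the abstract: 14 defaulted (1), 315 non-defaulted (0).
y_true = [1] * 14 + [0] * 315

# Hypothetical trivial classifier that always predicts "non-default".
always_negative = [0] * 329
trivial = imbalance_metrics(y_true, always_negative)

# Hypothetical classifier that finds 7 of 14 defaults with 15 false alarms.
some_recall = [1] * 7 + [0] * 7 + [1] * 15 + [0] * 300
better = imbalance_metrics(y_true, some_recall)
```

In this sketch the trivial classifier scores higher on accuracy than the second one, yet its TPR, G-mean, and F-score are all zero because it never identifies a default, which is the failure mode the alternative metrics are meant to expose.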

Keywords

Default prediction; Imbalanced dataset; Real estate listed companies; Minority-sample generation approach
