SAS Publisher

Scholars Journal of Physics, Mathematics and Statistics | Volume-2 | Issue-02

Comparison of criteria for the selection of discriminating variables: Application in Credit-Scoring

Hicham Y. Abdallah, Afif A. Hayek

Published: April 14, 2015

DOI: 10.36347/sjpms

Pages: 176-190

Abstract

Banks seek to reduce credit risk by applying rules that classify new loan applicants as “good customers” or “bad customers”. Analyzing historical data is the best basis for building a statistical strategy to assess this kind of risk. In general, a large volume of data must be analyzed using “Data Mining”, the computational process of discovering patterns in large data sets. The first step is to formalize the credit-risk problem that the bank wants to solve in terms of data (classification tree), where the dependent variable is qualitative and takes two values: “good payers” and “defaulters”. The next step is to prepare the data for processing (selection of the most discriminating variables, collinearity diagnosis). Finally, the data are modeled by logistic regression and by the CART decision tree. This article builds these two classification models from a database of 1000 customers, using first the chi-square (χ²) criterion and then the Rand criterion to detect discriminating variables, in order to determine which criterion is more appropriate.
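As an illustration of the chi-square selection step mentioned in the abstract, the following minimal sketch ranks categorical predictors by their Pearson χ² statistic against the binary target. It is not the authors' implementation: the function names (`chi_square_statistic`, `contingency_table`, `rank_by_chi_square`) and the tiny data set are hypothetical and invented for this example. Raw χ² values are compared directly here, which is only reasonable when the predictors have the same number of categories; otherwise p-values or a normalized measure would be needed.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and target.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def contingency_table(values, labels):
    """Cross-tabulate a categorical predictor against the target classes."""
    categories = sorted(set(values))
    classes = sorted(set(labels))
    counts = {(c, k): 0 for c in categories for k in classes}
    for value, label in zip(values, labels):
        counts[(value, label)] += 1
    return [[counts[(c, k)] for k in classes] for c in categories]


def rank_by_chi_square(predictors, labels):
    """Return (name, statistic) pairs, most discriminating predictor first."""
    scores = {
        name: chi_square_statistic(contingency_table(values, labels))
        for name, values in predictors.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 8 customers, 2 candidate predictors.
labels = ["good", "good", "good", "good", "bad", "bad", "bad", "bad"]
predictors = {
    "housing": ["own", "own", "own", "own", "rent", "rent", "rent", "rent"],
    "phone": ["yes", "no", "yes", "no", "yes", "no", "yes", "no"],
}

ranking = rank_by_chi_square(predictors, labels)
print(ranking)  # [('housing', 8.0), ('phone', 0.0)]
```

In this toy data, `housing` separates the two classes perfectly and receives the largest statistic, while `phone` is independent of the target and scores zero. In practice, the retained variables would then be passed to the logistic regression and CART models.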