Today, the idea of making data-driven decisions is becoming mainstream. As the volume of available data continues to grow exponentially, so does the prospect of unlocking the insights hidden within these massive data stores. Predictive analytics is the art and science of analyzing and modeling data to gather insights that can be used to make meaningful business decisions. In the lending industry, this could translate to finding patterns in historical data to identify commonalities among borrowers.
These insights could be used to drive lending policies at both the individual and the portfolio level. In this article, we present an approach to analyzing data and extracting these insights in a consumer lending scenario.
Measuring the risk of an individual borrower
Is the loan applicant a “good” borrower or a “bad” borrower? Our task is to provide a quantitative answer to this question, which all lending institutions face on a daily basis. In predictive analytics, this is called a classification problem. A typical approach proceeds through the following steps:
- Identify a dataset. Borrower-related financial data such as annual income, loan amount, debt-to-income ratio, homeownership status, etc. can be used to develop an appropriate predictive model.
- Customize the dataset. Combine the identified dataset with relevant data from other sources. For example, macroeconomic measures such as the unemployment rate, inflation rate, and real GDP are publicly available. In this step, the data scientist may also consult a business expert to gain more knowledge of the business.
- Select a model. Choosing the most appropriate technique can be a complex task for the data scientist. Depending on the nature of the data, a data scientist can choose among techniques ranging from traditional statistical methods to more contemporary machine learning algorithms. In the consumer lending classification problem, logistic regression is an appropriate modeling technique because it offers a simple, robust, and time-tested solution.
- Validate results. A data scientist uses a range of performance evaluation measures to make sure that the model’s accuracy is adequate and consistent and that its predictions make sense. For example, in the consumer lending problem, maximizing the number of correctly identified “bad” borrowers is essential.
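The steps above can be sketched end to end in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method used by any particular lender: the borrower records are synthetic, the regional unemployment figures in `UNEMPLOYMENT_BY_REGION` are made up, the rule that generates the “bad” labels is invented, and every function name is hypothetical. Logistic regression is fitted here by plain gradient descent so that the example needs only the standard library.

```python
import math
import random

# Hypothetical regional unemployment rates (%), standing in for public macro data.
UNEMPLOYMENT_BY_REGION = {"West": 5.0, "South": 6.5, "Midwest": 4.5, "Northeast": 5.5}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_borrowers(n, seed=0):
    """Identify and customize: synthetic borrowers joined with macro data.

    Returns (features, label) pairs, where label 1 marks a "bad" borrower.
    The rule that generates the labels is invented purely for illustration.
    """
    rng = random.Random(seed)
    regions = sorted(UNEMPLOYMENT_BY_REGION)
    rows = []
    for _ in range(n):
        dti = rng.uniform(0.0, 0.6)            # debt-to-income ratio
        income = rng.uniform(20.0, 150.0)      # annual income, $ thousands
        owns_home = float(rng.random() < 0.5)  # 1.0 if homeowner
        unemployment = UNEMPLOYMENT_BY_REGION[rng.choice(regions)]
        z = -1.0 + 6.0 * dti - 0.02 * income - 0.5 * owns_home + 0.15 * unemployment
        label = 1 if rng.random() < sigmoid(z) else 0
        # Rescale so all features have similar magnitude for gradient descent.
        rows.append(([dti, income / 100.0, owns_home, unemployment / 10.0], label))
    return rows


def predict_proba(x, weights, bias):
    """Predicted probability that a borrower is "bad"."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(rows, lr=0.5, epochs=200):
    """Select a model: logistic regression fitted by batch gradient descent."""
    weights, bias = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in rows:
            err = predict_proba(x, weights, bias) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def evaluate(rows, weights, bias, threshold=0.5):
    """Validate results: accuracy and recall on "bad" borrowers at a cutoff."""
    tp = fp = tn = fn = 0
    for x, y in rows:
        pred = 1 if predict_proba(x, weights, bias) >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(rows),
        "recall_bad": tp / (tp + fn) if tp + fn else 0.0,
        "confusion": (tp, fp, tn, fn),
    }


if __name__ == "__main__":
    data = make_borrowers(600)
    train, test = data[:450], data[450:]  # simple hold-out split
    weights, bias = fit_logistic(train)
    for cutoff in (0.5, 0.3):
        print(cutoff, evaluate(test, weights, bias, threshold=cutoff))
```

Lowering the cutoff from 0.5 to 0.3 can only flag more applicants, so recall on “bad” borrowers never decreases, usually at the cost of more “good” borrowers being flagged as well. In practice, an established library such as scikit-learn would likely replace the hand-written fitting loop.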
Measuring loan portfolio risk
To provide a more comprehensive analysis of overall risk across the portfolio, a data scientist can use prediction results to produce more insightful data visualization tools. These tools help answer questions such as “How does annual income affect creditworthiness if the loan applicant resides in the West?” or “How does creditworthiness change when debt-to-income ratio experiences a unit change for individuals who own their homes?” Executives need informative and meaningful answers that allow them to make business decisions expeditiously. In essence, the extra level of analysis provided by the model’s predictions allows the user to dig deeper and identify opportunities or uncover problems.
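As a rough sketch of how such questions might be answered from model output, the Python below averages predicted risk by income band within one region and computes how the odds of a “bad” outcome shift with a unit change in a feature. Everything here is an assumption made for illustration: the records in `SCORED_LOANS`, their `p_bad` values, the income bands, and the function names are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical scored loans: borrower attributes plus the model's predicted
# probability of being a "bad" borrower. All values are made up.
SCORED_LOANS = [
    {"region": "West", "income": 45_000, "owns_home": True, "p_bad": 0.30},
    {"region": "West", "income": 120_000, "owns_home": True, "p_bad": 0.10},
    {"region": "West", "income": 38_000, "owns_home": False, "p_bad": 0.40},
    {"region": "South", "income": 52_000, "owns_home": False, "p_bad": 0.25},
    {"region": "South", "income": 95_000, "owns_home": True, "p_bad": 0.15},
]


def income_band(income):
    """Bucket annual income into coarse bands (band edges are assumptions)."""
    if income < 50_000:
        return "under 50k"
    if income < 100_000:
        return "50k-100k"
    return "100k and over"


def mean_risk_by_income_band(loans, region):
    """Average predicted risk per income band among loans in one region."""
    totals = defaultdict(lambda: [0.0, 0])  # band -> [sum of p_bad, count]
    for loan in loans:
        if loan["region"] != region:
            continue
        entry = totals[income_band(loan["income"])]
        entry[0] += loan["p_bad"]
        entry[1] += 1
    return {band: total / count for band, (total, count) in totals.items()}


def odds_multiplier(coefficient, delta=1.0):
    """Factor applied to the odds of "bad" when a feature moves by delta.

    Holds for a logistic regression without interaction terms.
    """
    return math.exp(coefficient * delta)


if __name__ == "__main__":
    print(mean_risk_by_income_band(SCORED_LOANS, "West"))
    print(odds_multiplier(0.5))  # 0.5 is a hypothetical coefficient
```

A table like the one returned by `mean_risk_by_income_band` is the kind of input a chart or dashboard could be built on. Note that in a logistic regression without interaction terms, the odds multiplier for a feature is the same for homeowners and non-homeowners; answering the homeowner-specific question would require an interaction term or a model fitted on that segment alone.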
Though the science behind the construction of predictive models can be complex, the rigor of the process helps ensure that the resulting models are accurate, effective, and realistic in the business world. In an industry such as consumer lending, accuracy is the name of the game, as identifying even a small number of bad borrowers can lead to huge savings for the lending institution. In a following article, Predictive analytics for consumer lending: Managing portfolio risk, we will present some ways to improve decision-making and manage portfolio risk using the insights provided by predictive analytics.