By: Agus Sudjianto, Jacob Kosoff and Aaron Bridgers
There is a lot of hype around the benefits of using artificial intelligence and machine learning models for credit underwriting.
However, these models introduce significant
Many newly established fintechs and nonbanks entering the credit market late in the cycle favor AI and ML underwriting methods that focus on thin-file and near-prime borrowers. Additionally, established banks face pressure to grow, and some are competing for the same types of higher-risk customers, possibly with misplaced confidence in credit decisions based on machine learning models.
This newfound enthusiasm surrounding AI/ML underwriting and the current expansion of credit late in its cycle should be cause for caution. Many AI/ML underwriting models are built using data cobbled together from various sources that are potentially biased toward a benign credit cycle. These models will likely underestimate actual credit defaults, and institutions using them will face significantly higher losses than their counterparts during the current market downturn.
One of the hardest lessons the financial sector is learning during the March 2020 market downturn — and from the 2008 financial crisis — is that credit models can deteriorate quickly, and borrowers with identical credit scores can perform dramatically differently based on when in the credit cycle a loan is originated.
These outcomes by credit score band are highly dependent on the economic conditions found in the development data set. This means that vintage analysis, which assesses credit quality by loan origination date, provides the most reliable tool for understanding whether actual outcomes are aligned with expectations.
Vintage analysis also creates an early warning indicator if models begin to fail during the downturn of a credit cycle.
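To make the idea concrete, here is a minimal sketch of a vintage comparison. The loan records, the quarter labels and the function name are invented for illustration, and the sketch assumes every loan has been observed for at least the chosen horizon; a production analysis would also need to handle loans that are still too young to judge.

```python
from collections import defaultdict

# Hypothetical records: (origination quarter, months on book at default, or None)
loans = [
    ("2006Q1", None), ("2006Q1", None), ("2006Q1", 30), ("2006Q1", None),
    ("2007Q3", 10), ("2007Q3", None), ("2007Q3", 14), ("2007Q3", 8),
]

def vintage_default_rates(loans, horizon_months):
    """Share of each vintage's loans that defaulted within horizon_months."""
    originated = defaultdict(int)
    defaulted = defaultdict(int)
    for vintage, default_month in loans:
        originated[vintage] += 1
        if default_month is not None and default_month <= horizon_months:
            defaulted[vintage] += 1
    return {v: defaulted[v] / originated[v] for v in originated}

# Compare vintages at the same age (12 months on book)
rates = vintage_default_rates(loans, horizon_months=12)
for vintage in sorted(rates):
    print(vintage, f"{rates[vintage]:.0%}")
```

Comparing vintages at the same number of months on book is what lets a later vintage be judged against an earlier one before it has fully seasoned.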
As seen in 2008, defaults became highly concentrated in vintages closest to the downturn, and economic deterioration spread from the housing sector to all parts of the economy, invalidating diversification assumptions. By the time most banks detected these modeling flaws, it was too late.
Model risk experts are already recognizing that many of the new AI/ML mortgage or credit card underwriting models do not capture certain risk factors, such as negative vehicle equity. Borrowers with negative car equity may have a tough time handling multiple credit obligations.
The credit bureau attributes that most banks use for non-automobile originations through AI/ML models do not include negative automobile equity, which leads to underpredicting default rates in stress scenarios.
Model risk practitioners are also pointing out that many machine learning models are predicated on additional factors with brittle correlations calculated during good economic times. As a result, the ability of a model to rank-order customers can quickly disappear, or even completely break during economic downturns.
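One way to watch for this is to measure rank-ordering directly on each period's outcomes. The sketch below uses a simple pairwise concordance statistic (equivalent to the area under the ROC curve); the scores and default flags are made up for illustration, and the function name is our own rather than any standard library call.

```python
def concordance(scores, defaulted):
    """Fraction of (defaulter, non-defaulter) pairs in which the defaulter
    has the lower credit score; ties count as half. 0.5 means no rank-ordering."""
    bad = [s for s, d in zip(scores, defaulted) if d]
    good = [s for s, d in zip(scores, defaulted) if not d]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = sum((b < g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))

# Hypothetical portfolio: same scores, different outcomes in each period
scores = [620, 640, 660, 680, 700, 720]
benign = [True, True, False, False, False, False]    # defaults only at low scores
stressed = [True, False, True, False, True, False]   # defaults spread across scores

print(concordance(scores, benign))
print(concordance(scores, stressed))
```

A value near 1.0 means low scores reliably identify defaulters; a slide toward 0.5 means the score has stopped separating good borrowers from bad ones.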
There is a modeling concept borrowed from biostatistics referred to as “heterogeneity,” which seeks to identify higher-risk groups within the same credit band based on subtle factors.
If these factors are not modeled correctly, it will be difficult for risk managers to identify riskier segments among credit applicants.
Individuals who have these factors can be considered “frail,” or more susceptible to environmental risks, and will behave differently during a recession. Within credit modeling, these include indirect factors such as wealth accumulation, financial lifestyle, access to credit and other aspects of financial health that tend to be masked within machine learning models.
AI/ML providers make unproven claims that their use of alternative data and the sophistication of their new models can find factors that explain this heterogeneity. Note that heterogeneity is significantly amplified during a credit downturn, as evidenced by the 2008 experience of rapidly worsening vintage quality.
Past credit modeling performance indicates that predictive frailty factors are elusive and must be humbly acknowledged as risk not captured in credit underwriting models.
Anticipating unobserved frailty factors, risk management teams should prepare for the next recession by creating an end-to-end approach. Such an approach needs to capture and effectively incorporate early warning signals from underwriting, portfolio management, watch listing and collections.
Risk managers should continue to apply healthy skepticism to the newest and shiniest machine learning credit models. Models that claim to avoid frailty by effectively discerning credit quality within credit bands should be approached with increased scrutiny. This includes demanding interpretable models to challenge the soundness of factors driving predictions.
All origination decisions made by credit models should be monitored so that vintage deterioration caused by model assumptions can be quickly detected. Banks should create a playbook that quickly adapts underwriting standards to mitigate failing model assumptions and ensure that this playbook can effectively put the brakes on poor quality originations once detected.
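As a rough sketch of what such monitoring could look like, the snippet below flags vintages whose observed early default rate runs well ahead of what the model expected. The rates, the vintage labels and the 1.5x tolerance are all hypothetical; each institution would set its own trigger and decide what action the playbook attaches to it.

```python
def flag_vintages(observed, expected, tolerance=1.5):
    """Return vintages whose observed early default rate exceeds the
    model's expected rate by more than the tolerance multiple."""
    return sorted(
        v for v, rate in observed.items() if rate > tolerance * expected[v]
    )

# Hypothetical early default rates by origination quarter
observed = {"2019Q3": 0.011, "2019Q4": 0.019, "2020Q1": 0.034}
expected = {"2019Q3": 0.010, "2019Q4": 0.010, "2020Q1": 0.012}

flagged = flag_vintages(observed, expected)
print(flagged)
```

In this made-up data the two most recent vintages breach the trigger, which is the kind of signal that should prompt a review of underwriting standards.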
Ultimately, it is impossible to observe all the variables driving people’s ability to repay their loans. As such, all models fail to capture some risks.
Claims that AI/ML models using alternative data can select better customers in the near-prime segments are exaggerated and have not been tested in a recession. During good times, loans are easier to repay as people have more stable incomes and can refinance through the almost limitless opportunities offered by a fully functioning credit market.
This also means it is hard to determine which models are good at making predictions and which models are bad during an expanding economy. It is a very different situation during this current market downturn, when causality matters. As
During this March 2020 market downturn, model risk professionals must look at the risk of credit underwriting models more holistically, beyond traditional model performance alone. They must also understand and manage the inherent, and sometimes unpredictable, risks in the current market fallout.
Agus Sudjianto is an executive vice president and head of corporate model risk for Wells Fargo. Jacob Kosoff is the head of model risk management and validation at Regions Bank. Aaron Bridgers is a senior vice president and the head of risk testing optimization at Regions Bank.
The opinions expressed in the presentation are statements of the speaker’s opinion, are intended only for informational purposes, and are not formal opinions of, nor binding on Regions Bank, its parent company, Regions Financial Corporation and their subsidiaries, and any representation to the contrary is expressly disclaimed.