The promise of
Don't believe me? Let's look at a few instances. First, car loans —
At the end of the day, lenders are looking for deterministic factors to underwrite products — and that's what's going on here. I know this all too well: I ran product for the data science and decisioning team at OnDeck Capital, and we looked at every data point we could get our hands on. And I mean it. Got a bad Yelp rating? It was accounted for in our model. Your Foursquare check-ins were down? Oh, we knew. We even considered factors like seasonality in cash flow and how the businesses in your neighborhood were doing. Our machine learning (ML) models were designed to process thousands of data points and make a lending decision in seconds.
But I'm here to give you an alternate narrative. I think your AI models are fine (for now) — it's your data that is fundamentally flawed. The issue isn't in the algorithms themselves but in the historical data we feed them. Models are trained on datasets that go back decades, so if a certain group has historically been denied loans at higher rates, an ML model will implicitly associate that group's characteristics with "high risk." The model doesn't know it's being unfair; it's simply learning the patterns we've provided.
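To see how this happens mechanically, here is a minimal, self-contained sketch. The data, the "redlined zip" proxy, and all numbers are invented for illustration: a plain logistic regression trained on historically biased approval decisions ends up penalizing a zip code that proxies for a disadvantaged group, even though group membership is never a model input.

```python
import math
import random

random.seed(0)

# Synthetic historical lending data (invented for illustration).
# Feature 1: normalized income. Feature 2: an indicator for a zip code
# where a historically disadvantaged group is concentrated.
# The group itself is NOT a feature -- the zip code proxies for it.
def make_history(n=2000):
    rows = []
    for _ in range(n):
        in_proxied_zip = random.random() < 0.5
        income = random.gauss(0.6, 0.15)
        # Historical approvals: same income, but applicants from the
        # proxied zip were denied far more often (the encoded bias).
        base = income - (0.35 if in_proxied_zip else 0.0)
        approved = 1 if base + random.gauss(0, 0.05) > 0.45 else 0
        rows.append(([income, 1.0 if in_proxied_zip else 0.0], approved))
    return rows

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Plain logistic regression trained by batch gradient descent --
# no group attribute is ever shown to the model.
def train(rows, epochs=300, lr=0.5):
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in rows:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b

w, b = train(make_history())

# Two new applicants with identical incomes; only the zip code differs.
score_outside = sigmoid(w[0] * 0.6 + w[1] * 0.0 + b)
score_inside = sigmoid(w[0] * 0.6 + w[1] * 1.0 + b)
print(f"score outside proxied zip: {score_outside:.2f}")
print(f"score inside proxied zip:  {score_inside:.2f}")
# The model penalizes the zip code because history did -- learned bias.
```

The learned weight on the zip-code feature comes out negative, so two applicants with identical incomes receive different scores. That is the whole problem in miniature: the model is faithfully reproducing the unfairness baked into its training labels.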
Even regulators see the problem: Consumer Financial Protection Bureau Director Rohit Chopra has said the FICO credit-scoring model has drawbacks in price, predictiveness and market competition, and that stakeholders should develop a more open-sourced model that uses artificial intelligence.
The problem is exacerbated by what we in the industry call "thin files" — credit reports with limited history. This disproportionately affects young adults and recent immigrants, arguably the two groups most in need of access to credit. Their main alternative is to build credit by taking on loans, often ones they cannot afford and on unfavorable terms, creating a Catch-22 that can trap people in a cycle of debt.
The impact of thin files on creditworthiness is staggering. According to a
So how do we solve this? I'd argue the future of our lending models needs to account for a more holistic picture of creditworthiness. We need to diversify our data sources, implement rigorous back-testing for bias and make our models as transparent as possible. That transparency should also extend to the consumer, by allowing them to understand the factors influencing their creditworthiness.
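As one concrete example of what "rigorous back-testing" can mean, here is a minimal sketch of a disparity check on held-out decisions. The function names, the synthetic numbers, and the 0.8 threshold (echoing the common "four-fifths" adverse-impact rule of thumb) are illustrative, not a production fairness audit.

```python
# A minimal bias back-test: compare approval rates across groups on
# held-out model decisions and flag any group whose rate falls below
# a threshold fraction of the best group's rate.

def approval_rate(decisions):
    """decisions: list of 1 (approved) / 0 (denied)."""
    return sum(decisions) / len(decisions) if decisions else 0.0

def disparity_check(outcomes_by_group, threshold=0.8):
    """Return per-group approval rates, plus the groups whose
    demographic-parity ratio (rate / best rate) is below threshold."""
    rates = {g: approval_rate(d) for g, d in outcomes_by_group.items()}
    best = max(rates.values())
    flagged = {g: r / best for g, r in rates.items()
               if best > 0 and r / best < threshold}
    return rates, flagged

# Held-out model decisions, bucketed by a protected attribute
# (synthetic numbers for illustration).
holdout = {
    "group_a": [1] * 80 + [0] * 20,   # 80% approved
    "group_b": [1] * 55 + [0] * 45,   # 55% approved
}
rates, flagged = disparity_check(holdout)
print(rates)    # {'group_a': 0.8, 'group_b': 0.55}
print(flagged)  # group_b ratio 0.55/0.80 ~= 0.69 < 0.8 -> flagged
```

A check like this belongs in the model release pipeline, run on every retrain, so a disparity blocks deployment the same way a failing unit test would.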
It shouldn't just be about smarter algorithms. It should be about smarter, fairer and more complete data. And on some level, it's about ensuring algorithmic accountability — and the ethical application of AI and ML in products that have a broad-reaching impact on society.