In today's digital banking world, fraud and cyber threats abound, and generative AI techniques are making the perpetrators craftier, faster, and more prevalent. In this workshop, you will learn how new architectures will help bring more data into focus, increase trust in data and AI, and create solutions that do not merely detect fraud or cyber threats AFTER they have happened, but stop them BEFORE they cause any damage. Moving from detection to prevention is a real-time problem, a massive-scale problem, and an AI problem. Today's modern architectures are ready to solve all three, and this session will walk you through how.
Transcription:
David Dichmann (00:10):
All right, so thanks. Hi, I'm David Dichmann. I'm here to talk to you about moving from fraud detection to fraud prevention with modern architectures. What are some of the secret ingredients we're going to need in an architecture that's going to help us accomplish this task? First, I've got two slides that have our company brand on them. I'll get past them really quickly, and the rest of the content is meant to be generic. But at Cloudera, we're the only true hybrid open data platform for data analytics and AI. We believe that you can use data to make what seems impossible actually possible. We think data is the real secret ingredient to being able to do incredible things. But to do that, you need to empower people to transform all of your data, wherever it may be, into something that's meaningful and useful.
(01:02):
And the way that we do that is by delivering an open and scalable platform with cohesive data services that work together, portable across any cloud and on premises. And we've been in the banking industry for a long time. Most of our customers are in highly regulated industries. We're in 82 of the top 100 largest banks in the world, four of the five top stock exchanges, eight of the 10 top wealth management firms, and four of the top card networks. So we do a lot with large financial institutions, and we've been doing that for decades. But now, when it comes to the cost of a data breach, how many people here are worried about data breaches? Yeah, we all are. And it's one thing to lose credibility with your customer. That's an important reason to be afraid of a data breach. The other is the cost. IBM did a survey, and they saw that the global average cost of a single data breach is about $4.5 million. Every breach is going to cost you, on average, $4.5 million.
(02:07):
51% of organizations are going to increase their spending as a result of a breach. How many people here have suffered through a breach? We've got a few. Yeah, it's not pleasant. And so it makes us want to spend more to stop that. And for organizations that have added AI into the equation when looking to protect against data breaches, the average savings is about $1.76 million compared to organizations that don't. So obviously AI is going to be an important ingredient here. And this one here from McKinsey, I like this particular one. So financial crime, and now we're talking about all forms of financial crime, fraud, money laundering, that sort of thing: $2.1 trillion annually. Does anyone remember what it was a couple of years back? Two years ago, when we gave this talk, it was around $1 trillion. So the cost of financial crime has doubled in the last couple of years. It's getting bigger, not smaller. So the problem's getting more complicated. But what McKinsey tells us is, hey, for every dollar that's affected by fraud, we're spending three additional dollars dealing with it. So the cost of fraud is not just the fraud alone, it's the three other dollars in associated costs that go into dealing with the fraud.
(03:23):
So what we want to do is get ahead of the game. We want to stop reacting after cyber threats have occurred. We want to stop reacting after fraud has occurred. We want to get ahead of the game and move from detection to prevention. Prevention is looking at things before they happen and saying, that's going to be fraud, that's going to be an attack. Let's stop it dead in its tracks. But we have to do this in such a way that we don't stop the natural flow of business. So I've got a couple of personal examples of this. I was at a conference in New York. I was attending some meetings and I got a text on my phone that said, hey, was this transaction yours? It's from your hotel. I'm like, yeah, that was me. So I text back yes, and I get a text back that says, okay, great.
(04:05):
Call the institution and have 'em rerun your card. I'm like, I'm in meetings, I can't just do that. So I finished my meetings, I called the hotel, and they were thankful I called, because they were just about to release my room because my deposit wasn't accepted. I could have lost my hotel for the night, but it was legitimately me. But the next morning my colleague Bill comes to me and says, hey Dave, I can't work the first hour I'm here. PayPal just told me $500 is missing from my account. That wasn't him, but they let it go through. So we want to be able to stop those before they happen. But we can't do things like saying, okay, I see you're using a credit card, come back in two days when I've validated that you are you. So we have to let these things happen in real time.
(04:45):
We have to catch 'em, and we want to reduce the instances of false positives so we don't annoy our customers or let bad things happen. So one of the things we find best-practice organizations starting to do is unify their crime-fighting forces. We see a lot of organizations doing fraud prevention that also have a separate organization for money laundering, a separate organization for cybersecurity, and another one doing internal surveillance, looking for bad actors from the inside. How many people have these organizations? How many people have them acting as one? So one of the things we're seeing a lot of best-practice organizations doing is unifying the data that's shared between them all, and unifying the technologies, tools, and processes across them all. So this is kind of like the Super Friends or the Avengers coming together. We want to bring everyone together to fight this crime together, sharing that information.
(05:39):
So we need architectures that know how to share all this information. But there are some other challenges in fighting financial crime. What we see when we talk to businesses is that one of the first challenges is that data is everywhere. If you need data to go and figure out a problem, how easy is it to get it? Where is that data stored? What other part of the organization is using it? Especially if we're trying to unify our fighting forces, we're going to find that that's a problem. Then there's the dependency on historical data. How many people here still have large volumes of historical data, likely found on premises, maybe not in the cloud, maybe never going to the cloud?
(06:18):
We also want to move from events to behaviors. Events are when things happen. Behaviors are what leads up to the event. So we can start tracking behaviors, not just events, but that's going to be a lot more telemetry. We were talking to one bank, and the amount of data they would need to manage when moving from events to behaviors was 1,500 times more. So they had to bring 1,500 times more data into the system, and in the same time process it, understand it, and use it. That's a massive order of magnitude bigger. And that leads to the growing scale of data: more data, more sizes of data, more shapes of data. And we want to reduce that high rate of false positives. So there are some technical challenges we need to overcome as well. We need to get to real-time applications. We don't just want real-time data ingest.
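[Editor's note: to make the events-versus-behaviors idea concrete, here is a minimal sketch, not from the talk, of rolling a customer's raw events up into behavioral features over a sliding time window. All names, fields, and the one-hour horizon are illustrative; this is where the extra telemetry volume comes from.]

```python
from collections import deque
from datetime import datetime, timedelta

class BehaviorWindow:
    """Rolls a customer's raw events up into behavioral features
    over a sliding time window (names and features are illustrative)."""

    def __init__(self, horizon_minutes=60):
        self.horizon = timedelta(minutes=horizon_minutes)
        self.events = deque()  # (timestamp, amount) pairs, oldest first

    def add(self, ts, amount):
        self.events.append((ts, amount))
        # evict events that have aged out of the window
        while self.events and ts - self.events[0][0] > self.horizon:
            self.events.popleft()

    def features(self):
        """Summarize the window: this is the 'behavior', versus a single event."""
        amounts = [a for _, a in self.events]
        return {
            "txn_count": len(amounts),
            "total_amount": sum(amounts),
            "max_amount": max(amounts, default=0.0),
        }
```

Tracking one such window per customer, per channel, per device is exactly how the telemetry multiplies relative to storing single events.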
(07:13):
I think we've all figured out how we can get data to flow into our systems really quickly, but we also need to be able to process that data as it comes in, make business applications that use that data in real time, and make our AI run in real time. That's going to be the challenge we need to overcome. Then there's trusted generative AI. How many people here trust generative AI? I don't see any hands. I'm not surprised at that. I shouldn't raise my hand; I don't trust it either. But we want to trust generative AI, because we know, from IBM at least, there's some big power in using it in our crime-fighting efforts. We've got to converge our activities on the technology side and on the business side. We have to run data and analytics everywhere, not just in certain parts. We've got to understand our data, use our data, and bring all that together.
(08:06):
But we also have to stay compliant. We need our security, we need our governance, we need our regulatory compliance. So we're going to have to overcome these. The first thing I want to talk about is the real-time advantage: moving away from just bringing the data in as it happens to actually using the data as it happens. So first we're going to capture the cyber data, right? We're going to capture what's going on. We can call it cyber threat data, we can call it fraud data, we can call it whatever we want: transactions as they happen. In real time we capture them, but in real time, as we capture, we also transform that data into intelligence. This is where we get to make those decisions. Does it look right? Does it not look right? And as we see things not looking right, we can respond to those in real time.
(08:54):
Now, this is where we're seeing a lot of generative AI feeding this decision point. As the data comes in, we get that real-time analytics. We do analytics on the data stream, put some AI in here, run that predictive model, and say, that looks like it smells kind of funny. This is where we might see, today, the human in the loop. So a subset of funny-smelling transactions ends up in a human's queue to validate. We still get most of our transactions going through in the time of business, no problem, no delay. For a few of them there's a short delay, the human responds, and the fewer of these we get, the more efficient our humans can be. And if we can put AI up in here too, we may need fewer and fewer humans in the loop. What we learn from that gets filtered back into the system and retrains the models.
(09:39):
And this cycle just continues over and over. So if we can bring in data in real time, analyze in real time, respond in real time, and put in AI in real time, we're going to get much closer to prevention, not detection. And then we can talk about trust. We said we don't trust generative AI. I see three barriers to trust right now. The first barrier is about the data itself. I've got a font problem there; I apologize for that. The first question we ask ourselves is, do I trust my proprietary data, about my customers' transactions, about my business, about personally identifiable information, to the models? Now, you might be thinking, well, most of my generative AI doesn't need that data. But if we're talking about real-time fraud and real-time cyber prevention, you do need that data. How do you know if you're identifying people correctly through the AI unless it knows what their identities are? This information is going to be part of the mix.
(10:40):
So the first thing we're going to do is look at true hybrid. We have on-premises; we can run models on-prem. We can run models in virtual private clouds. But when we have a hybrid environment, more than one cloud and on-premises working together, we can operate where the data lies instead of having to move that data into less secure environments before we can operate on it. So if we decide, okay, I trust AI with my data, it's not going to leak and whisper my secrets to the cloud, the next question I have to ask myself is, is my data ready to be useful for generative AI? Is the quality of my data there? Is the amount of my data there? Do I have the right quality? Do I have the right quantity? Do I have the right mix of data that's going to actually be useful to AI?
(11:27):
And here we're not talking about anything new. Our data strategy is important here: quality, security, governance. Modern architectures like lakehouses, fabrics, and meshes give us the tools we need to improve the quality of the data, make more of it available, make more of it prepared, and treat data as a product that fits into this and helps us solve that problem. So now we trust the models with our data, and we trust that our data is good for the models. The last barrier to trust is, do we trust that the models are going to give us useful answers? And this is the really tricky one, because I was talking to one of our banking customers, and he was explaining to me that the kinds of algorithms they're using with LLMs, these large language models, are able to do things that would take a hundred people a hundred years to do by hand.
(12:18):
So we can't check its work. We cannot manually check the outcome of some of these algorithms. So what they do is use two LLMs and cross-compare. They have AI check AI, and if the models are returning converged results, they trust them both, put one in production, and use the other as a test to see if there's any model drift over time. Or they can swap out the models for more efficient models as they find them. So openness is key on this one. As we go to enterprise AI, we don't want to just take one set of models, one set of tools, one set of technologies. We want to be free and open to swap them out at will and have an architecture that allows us to have composable enterprise AI: any model, any algorithm, any set of data, swap them in and out. Any cloud, any deployment, anything on-premises. It needs to work everywhere.
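[Editor's note: a minimal sketch of the AI-checks-AI idea, assuming two classifiers whose verdicts can be replayed over the same cases. The 95% agreement threshold is an invented example, not a recommendation from the talk.]

```python
def agreement_rate(model_a, model_b, cases):
    """Fraction of cases on which two models give the same verdict."""
    agree = sum(1 for c in cases if model_a(c) == model_b(c))
    return agree / len(cases)

def models_converged(prod_model, shadow_model, cases, min_agreement=0.95):
    """AI checks AI: run a shadow model alongside production and
    flag possible drift when their verdicts diverge too often."""
    return agreement_rate(prod_model, shadow_model, cases) >= min_agreement
```

Because the check depends only on the models' outputs, either model can be swapped for a newer or more efficient one without changing the harness, which is the composability point the speaker is making.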
(13:11):
So when we talk true hybrid, we're not talking about doing different things in different places. We're actually talking about doing the same thing in any cloud or on premises. The reason we see this is that the data center is where we see a lot of our customers keeping their most secured and controlled data. How many people here think they can be completely in cloud in the next five years? I don't see a single hand, and this is normal. I should ask the question the other way and get people to move their arms a little bit. So the idea here is we know the data center is going to stay important, and we know the public cloud gives us a lot of extra flexibility, and it's also where a lot of gen AI is born. So we need both. We need the best of both worlds.
(13:53):
So what we're seeing folks do is start in a bubble of trust with a virtual private cloud. It's my cloud. Everything that goes in there is mine; no one else gets to have it. And then we partner with the cloud service providers and use their models inside the VPC. Azure does this with OpenAI. Amazon does this with Bedrock. And you're going to want more than one partner here. We highly recommend you don't just say, I'll do everything with Azure, because then you only have one set of models to cross-compare with. You don't have that AI-checking-AI going for you. But that's okay. You want to use the models given by the cloud provider in this context because they'll give them to you in a way that anything you use with that model stays within your virtual private cloud. Any data that goes up there is not being used to train its next generation.
(14:41):
So no one's training those models on your data, unlike some of the open ones, like straight-up ChatGPT. And then involve your data center. The data center is a big part of this. You can also bring models into the data center if you have true hybrid and portable services. The idea is, if I train on-prem, I can run in cloud. If I train in cloud, I can run on-prem. Or I can train across both together at the same time and get cohesive results. So the real trick, as we do all of this, is to design for single security and control across all clouds and on-premises. You don't want to have to maintain separate security, separate governance, and separate traceability across all the clouds and on-prem.
(15:29):
Now we talk about architectures. How many here have heard of the open lakehouse architecture? This is where we're basically taking the best of data warehousing and the best of data lakes and putting it all together. I have a technical word on here. I've been given a hard time sometimes for putting the word Iceberg on here, but I am curious: how many people have heard of Apache Iceberg? Yeah, this has surfaced up heavily. We see it in financial reports from some of our competitors, where they'll mention it 18 times. Iceberg is a real game-changing technology. What it fundamentally does is say, if everyone just speaks Iceberg, we can all speak to our data the same way. If all of the different stages of our data life cycle, from bringing data in, to preparing the data, to analyzing it, to doing generative and predictive models on it, all run against Iceberg, we get a greater degree of cohesion.
(16:23):
And if we use the lakehouse architecture, we have significantly fewer silos and significantly less data duplication. A lot of the banks I talk to say, yeah, I heard Iceberg is good for reducing data duplication. You're right, it is. I've talked to others who say, Iceberg, it's great for reducing my surface area of risk. How many people here have more than one cloud data warehouse out there? More than 50? More than a hundred? Now, I don't mean to say anything negative about my competition, but we've seen a lot of news about data breaches, and I'll be the first to admit it wasn't our competitors' fault; it was the architecture's fault. When you have 70 or 80 points of failure, 70 or 80 data systems out there that each have their own independent security model, that's a lot to cover. So it's easy for some doors to be accidentally left open. But if you consolidate around a single architecture and a single data environment, you can reduce the surface area, the points of failure, to as few as one: one security, one governance, one data observability, one way to move data from where it is to where it's needed, across any cloud and on-premises. This is the architectural model people are moving toward.
(17:40):
And what we're finding is that a lot of data estates look like this. You've got, let's say, your marketing estate in AWS. You've got your online fraud system working in Azure. You've got some digital risk analysis going on in Google. You've got your on-premises data center. You've got your self-service in private clouds. The data environment looks a bit like this. So what we're seeing first is the way data fabrics work: within each one of these, we can connect all of these lakehouses together and make a fabric, one security and one governance across it all, but it usually stays within the cloud. What a mesh does is let us build a single enterprise data mesh across all of the clouds. If you externalize your metadata management and catalog, externalize your security and governance, and externalize your observability, data movement, and data federation outside of each cloud, then you can look across all the clouds together. Then you can have your data products from each one of these recognized by that catalog and shared with anyone. Which cloud you implement on no longer matters. The data is now free to move about the enterprise.
(18:59):
And when we talk about enterprise AI, we see organizations needing to do three things. But I will stress again this notion of openness. Best practice is to build your enterprise AI practice around open standards and open technologies, even if you're going to run them all on premises. You want that openness because it allows you the flexibility to go to any vendor for any large language model, any vendor for any tool set, anything else you need to wrap around that. You can get it from anywhere and switch as the technology improves. We want to be forward compatible with the next generation of LLMs, the next generation of AI tools, and the next generation of AI capabilities, because we want to do three things with AI. We want to build AI: build our use cases quickly, experiment, fail early, fail often, and figure out what's going to work and what's not going to work. When we're ready, we want to run those in a safe and secure way in our enterprise and move them into production as business applications powered by AI. And across the entire ecosystem, we want to infuse AI into our development cycles.
(20:09):
We want to use AI assistants to help us write SQL. We want them to help us write applications. We want them to help us write code. So we want to use AI internally to build better and run better systems.
(20:28):
So when we talk about these three architectural dimensions: this is from CIO... no, this is IDG. IDG did a survey of a bunch of CIOs, and the first thing they found is that 93% of surveyed CIOs are adopting hybrid architecture. They're not just doing different things on-prem and in cloud; they're looking at a hybrid data estate and keeping on-prem as a first-class citizen in their data strategy. It's not the old place where we used to do stuff. It's a modern place where we're doing stuff now. The second thing they found is that 40% are adopting these new modern architectures. The bulk of what we're seeing is starting with the lakehouse paradigm, moving legacy lakes and legacy data warehouses into lakehouse architectures, so you get the best of both worlds and centralize your analytics, and looking at modern things like Apache Iceberg, which is a real catalyst to this growing.
(21:31):
But this also includes things like fabrics and meshes. And we've also seen that 76% of organizations surveyed are either deploying or experimenting with generative AI. So this is obviously the big megatrend. So I've got a couple of tips on what we're doing with generative AI, what we're seeing customers doing, specifically around building models off of large language models. And I just want to do a quick deep dive into some of the things we're doing to create a customized large language model. How many people here are working with large language models, starting to experiment with them? So there are three things you ought to do to get the best out of them. The first is prompt engineering. This is just fancy for "ask a better question." That doesn't mean forcing the users of the LLM to ask a better question. It means taking the question from the user.
(22:27):
Don't just throw that over the wall. Look at it and improve it with algorithms that you run yourself. Either surround it with more information or have it automatically create more information, but ask a better question. Then the next one is retrieval-augmented generation. That's a bunch of fancy words for saying, give it some upfront directions or data to use. What retrieval-augmented generation basically says is: you are a large language model, you know everything you know about poodle grooming, but I want to talk to you about banking. So don't use the poodle grooming information when answering a banking question; use this information instead. And we can provide additional information. The advantage of this is you don't touch the LLM, but you collate additional information that's useful to you, even your own proprietary information. It goes into a vector database, and that's sent along with the question. And basically the question becomes: answer this question, prioritizing the information I just gave you and no other previous knowledge.
(23:32):
Please answer the question. And then there's parameter-efficient fine-tuning, which is a way of saying, forget the whole LLM, let's use a piece of it, but we're going to actually train it further on our own and run our own version of the LLM in a smaller state. So is this slide just about prompt engineering? Let me look ahead. Yeah, this is about prompt engineering and a little retrieval-augmented generation. So for prompt engineering, the user asks a question, and the old way, following the red line, is to just go to the open model and come up with a crazy response, like, what's a good car loan for me? And it starts describing poodle grooming. That makes no sense. Instead, the user question goes into something that enhances the prompt. It goes to a knowledge base, and we coach it with the right context.
(24:31):
So in my example, I might say we're only working with banking information. So when the person says, what's a good loan for me, we add to the question that I'm buying a car, and this is the bank I'm at, and this is the state I'm in, and anything else that's necessary to make a smarter answer. And it gets that because we pre-programmed a knowledge base and an additional vector DB. Now, this information can also include additional context about your offers and about your organization, and those retrieval-augmented generation vectors are all included in what's sent to the LLM, and out comes a much better response. So that's the first way to get better with LLMs. The second thing we've learned is that in a lot of cases, smaller large language models outperform massive large language models when tuned to a specific domain, because the bulk of what's in those really large models, up to 99% of it, has nothing to do with the questions you want your chatbot answering or your algorithm responding to.
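[Editor's note: the retrieval-and-enrichment flow above can be sketched in a few lines. This toy uses word overlap where a real system would use embeddings and a vector database, and the corpus text is made up for the example.]

```python
def relevance(question, doc):
    """Toy relevance score: count of shared lowercase words.
    A real system would compare embeddings in a vector database."""
    return len(set(question.lower().split()) & set(doc.lower().split()))

def retrieve(question, corpus, k=2):
    """Pick the k documents most relevant to the question."""
    return sorted(corpus, key=lambda d: relevance(question, d), reverse=True)[:k]

def build_prompt(question, corpus, k=2):
    """Enrich the user's question with retrieved context before it
    ever reaches the LLM; the model itself is never modified."""
    context = retrieve(question, corpus, k)
    return ("Answer the question using ONLY the context below, "
            "and no other previous knowledge.\n\nContext:\n- "
            + "\n- ".join(context)
            + "\n\nQuestion: " + question)
```

The key property is the one the speaker calls out: the LLM is untouched; only the prompt changes, so proprietary context stays under your control.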
(25:42):
So sometimes it's better to use a smaller model, and this is where being able to swap in different models matters: train on-prem, train in cloud, train it on-prem with your sensitive data until you like what you see, and then run it in the cloud. That's where all this matters. And the last is parameter-efficient fine-tuning, which is basically taking that massive large language model and then saying, I'm only going to work on a subset of its expertise. It could be about natural language to SQL to assist our developers. It could be about detoxifying the data set and only working with a set of healthy information that's going to provide responses that won't be offensive. It could be around instruction-following on a given data set, so you can provide "how do I do this" or "how do I do that" on your site. And then we train it additionally on specifically those areas. So it's basically like taking raw clay, taking a piece of it, shaping that into something very useful, and leaving the rest of the clay behind.
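[Editor's note: to see why parameter-efficient fine-tuning is so much cheaper than full fine-tuning, here is a back-of-the-envelope sketch comparing trainable parameter counts for full fine-tuning versus LoRA-style low-rank adapters. The dimensions are illustrative, and the counts ignore attention/MLP structure; this is a rough model, not a spec.]

```python
def full_finetune_params(d_model, n_layers):
    # rough count: one d_model x d_model weight matrix per layer, all trainable
    return n_layers * d_model * d_model

def lora_params(d_model, n_layers, rank):
    # LoRA-style adapters: per layer, train A (d_model x rank) and
    # B (rank x d_model) while the base weights stay frozen
    return n_layers * 2 * d_model * rank
```

With, say, d_model 4096, 32 layers, and rank 8, the adapters train a fraction of one percent of what full fine-tuning would, which is why you can shape just the piece of clay you need.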
(26:51):
Why do we want to use these modern architectures for fraud and cyber threat prevention? Well, the first thing is everything's going to go hybrid. When we look at modern architectures, we need to take into account that everything's going to go hybrid, because that's where all the data is. That's where it's going to be born; that's where we're going to want to use it. Look at those modern architectures: look at the lakehouse paradigm, look at fabric, look at mesh, and look at better ways for your organization to share data, so we can get all those crime-fighting teams working together, not separately. And finally, keep your generative AI open. Be flexible to multiple providers for those models, and multiple tool sets, to be able to bring them in and out as needed. And let gen AI check on gen AI. And these are some results from our customers.
(27:35):
We've had reports of up to a 25% increase in the rate of analytics. That's especially important as we're trying to get ahead; instead of doing detection, we want prevention, so the faster we can do our analytics, the closer to real time we can get. A 95% improvement in fraud capture rates, by having better-trained models that understand what looks good and what looks bad, producing far, far fewer false positives and false negatives. A 30% decrease in incidents and alerts, so less for the human in the loop to do. We're finding fraud faster, and we're sending fewer false alerts to the people who need to pay attention to them. A 50% reduction in average daily dollar loss. And the benefit of all of this is we've also seen customers do all of that and find a 10% reduction in cloud cost.
(28:30):
Hybrid's a big factor in that. Consolidation around the open lakehouse using fabric and mesh is a big factor in that. And being able to use some of the LLMs on-prem versus cloud is a big factor in that. One last plug: I've talked a lot about architecture and a lot about the technology side, but we strongly recommend that you don't do any of the technology side without considering the human dimension. McKinsey's got a great report on the future of AI and banking that I just read while preparing for this talk. They've got this one here on providing an operating model in your enterprise and how to organize your people around the approach to scaling generative AI in banking. So I encourage you to read that as a follow-on. Get your tech right, but you've got to get your people right too, and these two things will help you do that. So I'm here for any questions after the talk. I've got 42 seconds left on the clock for my talk, but once the lights go out, I'll be right here at the front if you want to talk with me. Thank you so much for joining me today.
Modern Architectures for Fraud and Cyber Threat Prevention - NOT Detection
July 17, 2024 7:15 PM
29:37