The Risk and the Return of Generative AI

As banks experiment with myriad use cases for generative AI, they have to try to predict where they might get a return on investment from using the technology, and at the same time anticipate and mitigate a multitude of risks large language models introduce, from the potential for hallucinations and bias to the danger of not disclosing AI use properly or failing to comply with banking regulations. This session delves into the ways banks like PNC are exploring the use of generative AI in sales and marketing, customer service, operations, document analysis and other areas while being mindful of the need to use the technology responsibly and ethically.


Transcription:

Penny Crosman (00:10):
All right. Welcome to this panel. We have two experts here. Anuj Shah oversees AI within PNC. He's constantly looking at risk and reward, a lot of the things that Satish talked about earlier. And Prabhu Ramamoorthy is at Nvidia, which obviously is having quite a moment. I think the stock price has risen 4000% in the last five years.

Prabhu Ramamoorthy (00:40):
Thank you, Penny. And we have become the most valuable company, right? We have also surpassed Microsoft, so this is the first conference since then.

Penny Crosman (00:48):
Yeah. So one thing we want to do is talk through some of the risk and reward, continuing some of the things that Satish was talking about. He talked about eliminating fear and just going for it, that the risk of not doing something is greater than the risk of doing something. So conceptually, do you both, well I'm sure you agree with that, but conceptually do you agree with that?

Anuj Shah (01:13):
I agree that we need to continue to innovate, because if we don't innovate, we're going to end up being laggards in the industry. What we are doing internally is we're prioritizing our scarce set of resources across all of our domains. I think the one thing that Satish said that really resonated with me was bringing a lot of the control groups into the process, and that helps us accelerate. So we have this concept called whole of bank, where we bring not only our business and our technology groups, but our operations groups and all of our control functions into the process, so that we're all assessing a lot of these use cases and prioritizing them together. So we're focusing our scarce set of resources on our highest value opportunities.

Penny Crosman (01:58):
And what are you seeing? Are you seeing people be fearless among your clients or?

Prabhu Ramamoorthy (02:02):
So from Nvidia, and I wanted to level set: we have already been in quant finance. We are at 300-plus banks; we work with partners such as Murex. We are also in predictive analytics. For those of you who are trying to migrate, we do Python on GPUs, like Python on steroids, so we can accelerate your conventional predictive analytics. And lastly, coming to generative AI and unstructured data, we are a must. So we are in all three workloads, including the last workload, which has taken off in a big way. And we have had great success facilitating that, including seeing those use cases in banks in production. So I do agree, but the proof is finally in the output, that the end clients are able to use it, and we have to get generative AI to that stage.

Penny Crosman (03:00):
And you guys are working with BNY Mellon. Can you say anything about that relationship?

Prabhu Ramamoorthy (03:04):
Definitely. I mean, what is key right now is that BNY Mellon bought something called DGX Cloud, DGX systems. And how these systems work is, in order to train these generative AI large language models, you need both the hardware and the software. Nvidia is the most valuable company because it's not only about the hardware now, it's also about the software. Think of it as Apple for the enterprise. So with the DGX systems, we are able to deliver that value, where we are in both the hardware and the software. As a result, the end customer is able to train their own models, as powerful as OpenAI models. So that's how it's very key, and that relationship is very key for us. Yeah.

Penny Crosman (03:53):
So Anuj, you mentioned that you're focusing on the things where you see the most value. Can you give a few examples and how much value do you have to see? Do you have to be able to see a return on investment within a year or something? Do you have some concrete guidelines like that?

Anuj Shah (04:11):
So we're doing use cases that are probably pretty consistent across the industry: internal productivity use cases and developer productivity use cases. From an internal productivity perspective, we're looking at high value, low risk use cases using human in the loop. Our frontline employees currently interact with customers on a day-to-day basis, assessing and delivering against their financial needs. In order to do that, oftentimes they'll get complex queries that they then have to go and research. They research how to do that by querying; we have 11,000 documents that they have to go through. So it's a time consuming process while customers are either on the phone or in the branch. What we are doing is putting together a virtual assistant using a lot of the models that were talked about here.

(05:06):
And what we want to do is make that process more streamlined. We look at it a little bit more holistically, and I think there are two pieces that we often understate as part of this: we look at the financial benefit, but we also look at the customer experience implication and the risk implication. From a customer experience perspective, if we can shorten the time customers wait to get their request serviced, it's a better customer experience, and we increase customer loyalty. From a risk perspective, if we can answer questions in a more consistent manner across our entire frontline, which is about half our organization, that's a pretty significant risk reduction as well.

Penny Crosman (05:47):
So do you think having the AI model do that kind of research could actually be more accurate than a human? Is that what you're finding?

Anuj Shah (05:54):
We're finding that we can make the answers more consistent. And that's really the key is that we want to make sure that we're answering them not only accurately but consistently across the organization.

Penny Crosman (06:05):
And of course the consistency and the accuracy depend on the data that gets fed into the models. You're using OpenAI's technology, right? Can you say anything about how you're training it, the kinds of data you're inputting into the model?

Anuj Shah (06:23):
So in this particular use case that I referenced, we're using our own internal data, and we want to ground the model to answer from our data. We don't want to use the Internet's data to answer questions. So we're training it on that. And what we have found is that data is a key input into this: if we feed garbage data into the process, we'll get garbage answers out of it. So we're extending a lot of the data governance processes that we've had across our critical data elements to this type of data. Now we need to think about data more holistically than just data that's used for financial statements, for example. So we're developing the guardrails associated with it, the review processes, the governance, the monitoring, all of those things that existed within traditional AI. We've now extended those to this new set of unstructured data that we're moving into the generative world with.

Penny Crosman (07:17):
So for a customer, it would probably include account history, products that they purchased in the past, and things of that nature.

Anuj Shah (07:27):
At this point, we're not using PII. We're using our policy and procedure documents to reduce the risk of our initial set of use cases.

Penny Crosman (07:36):
I see. And across the industry, what are you seeing as some of the most successful use cases for AI and generative AI? And especially where are you seeing people get the most value, the most return on investment?

Prabhu Ramamoorthy (07:51):
Yeah, thank you. We see a lot of value-added use cases. I'm a chartered financial analyst and FRM by background. So going back to Anuj's point, there are three buckets that we categorize these use cases in. First is financial revenue, use cases that directly give more revenue to the company. In this area we see things like BloombergGPT, right? Alternative data. You are able to generate insights using alternative data; you're able to look at unstructured data. With structured data alone, you're not able to gain more asymmetry, right? Everybody has that same piece of data. So that's the first bucket, financial revenue, where alternative data based underwriting is doing that. The next bucket we group is operations, under which know your customer, anti-money laundering and fraud, as well as intelligent automation, fall. And that's the bucket where it is helping organizations reduce fines or automate the process and get productivity improvement.

(08:57):
And the last we call customer experience and employee experience, which is the customer experience use case. Satish did a great job speaking about it. So we see a whole lot of use cases across all three of these spectrums. And you would have to expect the stakeholder to go with where the ROI is, right? It's key that the business supports the use case and that the ROI, the return on investment, that value, is provided out of those use cases. And we see people approaching it from all the angles: from the CX angle as well as from the ops angle as well as the revenue angle.

Penny Crosman (09:36):
Alright, so it seems like we're seeing a shift away from let's try whatever we can do with this, play with it wherever we can, into a much more calculated let's find where this makes the most sense, where we're going to see a return, and where we'll see some kind of solid value.

Prabhu Ramamoorthy (09:54):
The underlying thing is, I'm a developer by background. You had to fight with these tabular models, credit card underwriting. You had to get all the data scientists, you had to buy the data, you had to underwrite your model, whatever it is, and get it to that accuracy. Think of LLMs as the same thing, right? You're just replacing some of those predictive models with these newer age models, Llama 3 and all that, but you still would have to get to that end use case with the accuracy that's needed to go to production; otherwise it'll be stuck in research.

Penny Crosman (10:27):
And Anuj, I think you were saying earlier that you used to think in terms of use cases and now you no longer really think of use cases. How do you think of it?

Anuj Shah (10:37):
So what we're seeing is there's broad applicability. When we started this initial set of use cases, we realized we're actually building capabilities. And what I mean by capability is it's an architecture that we can then extend to other types of use cases if we do them consistently across the organization. So in this use case that I referenced earlier, summarization was a key capability that we developed out of it. Now we're finding that summarization, the need to summarize information that's being input into the process, is a consistent capability that we need across the organization, from our frontline to our middle and back office. It's a huge capability that we're starting to develop. Another one is ad hoc data extraction. We have information that comes in through various forms across the organization. It can come in through email, it can come in through PDFs, it can come in through paper documents. The need to actually look at different formats across the organization, extract information, and actually take action upon it is another one. We have ad hoc analytics that I think you referenced. And then there's intelligent routing, right? We want to make sure that customer requests or employee requests get to the right location. We've been doing that for years, but with this generative AI technology, we can now actually route it much more intelligently.

Penny Crosman (11:57):
So it's taking what the customer says in a voice response system and then determining this person needs to go to fraud, or this person needs to go to credit card, or...

Anuj Shah (12:06):
It's across every channel. So voice is one channel: we take that information, digitize it, and then we have the ability to route it. It's across email, it's across our mobile channels. And what we're seeing is also a lot of interplay between these, where if we have a customer calling in for a particular capability, we offer them self-service capabilities, because we want to make sure that we're routing the customer to the most appropriate location. We're also seeing interplay between the different channels as well.

Penny Crosman (12:35):
Is there ever a danger that the AI might not recognize the most important points as it's summarizing, that it could miss important context? Or something dramatically changed over the last year and it might not know that, because it doesn't understand what happened last year?

Anuj Shah (12:56):
Yeah, so there are a couple of different components to this. One is we're using human in the loop in every generative AI use case. We want to make sure that there's a human control and a human review to avoid that exact situation. I think the second piece is we're enhancing our model risk management process in the age of generative AI. Traditionally in our model risk management process, you had a known set of inputs and a known set of outputs, and you could work through that process to understand what accuracy level you were going towards. We're in the age now where the set of inputs is human language. So you have an infinite set of inputs, and as a result, small tweaks to the way you ask a question can actually have big implications for the output. So we're going through and revamping our model risk management process to take that into account. I'd say that's probably where we're spending the bulk of our calories right now: making sure we understand how to go through that testing process, how to build the associated guardrails around it, especially as we look to deploy to production, how to build the appropriate monitoring, et cetera.

Penny Crosman (14:10):
And so when you think about the value of all of the summarization that you're doing, is it basically time savings? So it's cost savings because it's taking less time to...

Anuj Shah (14:21):
I mean, there's definitely a financial implication to it, but I wouldn't understate the customer experience impact, right? If you think about when you call into a call center, the last thing you want to do is wait on hold while a representative is going and doing research. If we can shorten that time, we feel like we can improve the customer experience pretty dramatically. And as a result of that, we'd increase customer loyalty.

Prabhu Ramamoorthy (14:45):
I do think Anuj brought out a key point. I'm a use case guy, so I think when you think about it in terms of capability, it gets embedded. You're making it part of the technology stack, and then you're mapping it across everything. So that's a very unique view; I never thought of that use case versus capability distinction. And going back, Penny, on the accuracy: the value add to the customer is not only the time savings. Ultimately AI is a model, and this model has to be accurate, and if it is accurate enough, it'll help you with those use cases, whether financial revenue, trading, back testing. We work with a lot of customers, so you have to show that accuracy, and it's able not only to save time but also to help you monetize models or stop fraud.

(15:43):
For example, I know at the bank up north there is a lot of know your customer, anti-money laundering work. That was an area where you generated a lot of false positives. With the traditional rules-based players, you could generate 10,000 alerts, and an analyst, the human in the loop, would look at them. If you have an anomaly detection model with AI, which can look at trade finance docs, that reduces the false positives. It saves a lot of time and effort, but it also helps you to tackle the real fraud, right? You can actually capture the money launderers and all that.

Penny Crosman (16:19):
So we talked a little bit about how you're using generative AI to help your customer service people to come up with responses, answer questions, et cetera. Are you thinking about using generative AI to interact directly with customers or is that something you would feel cautious about doing?

Anuj Shah (16:38):
I think at this point we're cautious, right? What we want to do is we want to tackle our low risk, high value use cases, first of which we have a backlog to work through at the current point. As we get more comfortable with it, it's something that we can explore over time.

Prabhu Ramamoorthy (16:51):
Some people don't put it in front of the customer; they put it on the backend. We call it customer experience versus employee experience. You could actually have your own employees use it, so it's not visible to the end customer, but rather it's used for internal organizational productivity use cases.

Anuj Shah (17:07):
Exactly.

Penny Crosman (17:09):
And what about in the branch? Does generative AI have a place in the branches?

Anuj Shah (17:16):
So the example that I provided in terms of our frontline includes both our care center and our branch. In our branch, if a customer walks in and they have a question, the same type of process is used: they go and they query the 11,000 documents to get answers to those questions. And we want to make sure we're streamlining and making that process as seamless as possible.

Penny Crosman (17:37):
And you've started this or you're planning to?

Anuj Shah (17:40):
We're in the proof of concept stage right now. Yeah.

Penny Crosman (17:42):
Okay. How about in sales and marketing? Are either of you seeing a lot of use cases in that area?

Prabhu Ramamoorthy (17:50):
Sales and marketing, I mean, I'll let Anuj go ahead in a bit, but I think you can consider use cases that are away from model risk management in financial institutions. In a bank, right, it's a regulated world. The hedge fund space doesn't sit in a regulated world; asset managers don't sit in it. So in a bank you could look at those use cases which are a little bit away from regulation. You have to be creative with the use cases. That could be a potential area where it doesn't go into the CCAR model validation space. Anuj, your thoughts as well?

Anuj Shah (18:23):
So we've used AI in our sales and marketing process for quite some time. We want to make sure that we're offering the right customer the right product at the right time, and we want to empower our frontline to give this information to them so they're having the appropriate conversation with customers. We feel like through the advent of generative AI, we can now enhance that process with additional data points, so that we're continuing to make sure that the customer is getting the best from a PNC perspective. I think the other aspect is it also helps our frontline understand which customer requests are the most important. We're able to sift through a lot of different queries that come in, through email, through voice, and we're able to go through a lot of that information and then sift: this customer has an urgent need, we want to make sure we're prioritizing that to the top. Or if we see a request come in through our care center, we're able to track the sentiment of a customer, and we want to make sure that we're proactively looking at customer sentiment and proactively addressing customer needs if there's something that's faulty in that process.

Penny Crosman (19:37):
AI can tell this person's angry, this person's frustrated, this person is okay, you can let them wait, or whatever. That's right. Alright, in 30 seconds we're going to bring mics out, so if you guys want to ask your own questions you can. But I just wanted to ask, and this is mainly for Prabhu: we've been writing a little bit about how some people are concerned about the energy consumption of AI models, because it does escalate energy consumption. There are obviously data centers being built and power plants being built to accommodate this need. Is that something you worry about, and is that something that Nvidia is addressing?

Prabhu Ramamoorthy (20:19):
Nvidia has been leading it from the front. Our form of computing, with GPUs, which is heterogeneous compute, has actually been saving energy. Whatever you did with the CPU, you're able to do better with the GPU. Obviously, if you go back to the sixties, we started with vacuum tube computers. The amount of compute has been going up, and the amount of compute energy is going to go up. But the idea is that, within that timeframe, within that compute constraint, we are delivering the maximum value, including energy and power savings. For example, credit valuation adjustment is a calculation that takes a hundred nodes with CPU compute; we are able to do it with four nodes, with eight GPUs each. So you don't spin up a hundred-machine cluster.

(21:16):
So we are able to do that with four clusters. So, Penny, that would be the answer: we deliver the maximum power efficiency per unit function. Obviously the amount of compute consumed, and the data we are starting to process, is going to go up. You would also see that Nvidia is releasing new processors, Grace Hopper and Grace Blackwell. You have 10 terabytes of memory, the CPU is unified with the GPU, and what takes an OpenAI model 5,000 GPUs to train, you can do with far fewer GPUs, and these would be available to the customer. You could also see those power benefits. Did that help? I know I was addressing it indirectly, but as the amount of compute goes up, within that spectrum we are trying to deliver the best value.

Penny Crosman (22:12):
Alright, we have just a few minutes. If you have a question, can you raise your hand or stand up? We've got a question here. Drew is running it over. Thank you.

Audience Member 1 (22:25):
Hi, my question is for Anuj. I don't bank with PNC, but the experience that I have calling into a call center for banking and other financial services has been, and remains, fairly frustrating, and for the people I know as well, where we're frequently saying representative, representative. So my question for you is in two parts. One, as a senior executive for a major financial institution, when you do your banking and you have a unique situation you need to get resolved, do you call into the systems, or can you just call somebody internally and get it resolved? And the second part of my question is, what does the feedback loop from the customers who experience these call centers look like for your organization?

Anuj Shah (23:30):
So I typically try to call in. I want to know what the process looks like and how it continues to be enhanced, and I do that across a number of financial institutions so I can get best practices. In terms of the feedback loop, we're establishing both formal and informal feedback loops. As we deploy our tools, we're capturing information from the front line on whether responses are accurate or not. And if they're not accurate, we want data associated with why they're inaccurate, and we're establishing a team to look at that information so that we continually fine tune the model and it gets more accurate over time. I'd say the other piece is we're also looking for information across all of our channels. So even if someone's interacting with us digitally, if they're calling us in our call center, we want to make sure that we can link the two pieces together and really triage: this is actually related to something that they did in a different channel. We want to make sure that we're thinking about it from a customer journey perspective, not a channel-by-channel perspective.

Prabhu Ramamoorthy (24:39):
Alright, we have a little more time to go. One more question.

Audience Member 2 (24:43):
This question is also for Anuj, and it's sort of tugging at that same thread. I just want to thank you for sharing PNC's perspective here; I think it's very thoughtful. But I think that use case was talking about the customer facing applications of generative AI, and you mentioned that you guys haven't sort of leaned into those use cases as hard yet. Would you mind just sharing what you see as those inhibitors? Are they self-imposed, like you're starting with more low to moderate risk use cases first, then sort of moving into customer facing? Or do you feel like there's a regulatory barrier? What would maybe cross that chasm for you guys to actually apply it to customer facing use cases?

Anuj Shah (25:33):
I think at this point we're trying to establish a foundation on which to build. We want to make sure that, one, the organization gets comfortable with it. There's a lot of learning that we have across the entire spectrum, both from a technology perspective and from a control group perspective. We really also need to understand the needs of the organization, where we have talent that we need to upskill to fulfill net new functions that we haven't even created yet. So a little bit of it is just establishing the foundation on which we build. I think the other piece is we need to continue to focus on the accuracy levels. Analogous to what we have with our traditional AI processes, we have known inputs and outputs, we have accuracy levels, we have monitoring processes, we have the risk and control framework. We're using all of those pieces for our initial set of use cases in a low risk, high value way, so that we're able to prove to both internal and external constituents that we're taking it on in a risk-prudent manner.

Prabhu Ramamoorthy (26:36):
I can add to that: the accuracy is not there yet, so they have not gone to those use cases, and that's where Nvidia helps. If you have that accuracy, it can be taken to a lot more use cases. Yeah.

Penny Crosman (26:47):
Alright we actually have time for one or two more.

Audience Member 3 (27:01):
Yeah. Hi, my question's for Anuj as well. I'm just wondering if you could share what accuracy levels you are getting, and whether you have concerns about the large language model hallucinating to your internal representatives and then them passing that false information on, for example, to a client, and, I guess, how you deal with that.

Anuj Shah (27:31):
So we have a couple of different variations of the use case that I mentioned that we're deploying, and we're seeing accuracy levels vary depending on which variation we're talking about. In more focused use cases, we see the accuracy levels pretty high, and we're working through a lot of the iterations. The more we broaden this, the more we're seeing accuracy levels decline. So that's something that we're working through right now. What we're not seeing yet is hallucinations in the strictest form. What we're seeing is individuals asking questions and not getting the responses back that they expect, and we're finding that there's a whole host of reasons associated with that. One is that the data that went into the process was inaccurate, and as a result we need to correct that. An interesting one that we found recently is how questions are being asked and the model's interpretation of them. In one of the use cases that we're working through, the model interprets the words should and must in a similar manner, but should and must, especially as you think about them for a regulatory use case, have very different meanings. Should is best practice; must is a regulatory requirement. We can't interpret those two things the same. So a lot of it is just working through a lot of the nuances associated with that and continuing to tune the model for these variations.

Penny Crosman (29:01):
Alright, I think we have time for one more. I think we had a question here.

Audience Member 4 (29:11):
Yeah, my question is for Anuj. I am a PNC customer, and thank you, I utilized you guys for a HELOC during COVID to expand the house a little bit. So thank you for that. You mentioned evaluating use cases and working on those that are high value, low risk. Can you unpack a little bit about your internal rubric or scoring model that determines the ranking of those use cases, and how that gets communicated to the different stakeholders so there's alignment?

Anuj Shah (29:48):
So I think there are a few different pieces that we're trying to work through with that. There's the value associated with the use case, and we have a methodology to do the valuation; the valuation I talked about is financial, customer experience and risk. There's a set of technology considerations that we need to think about as well, i.e., do we have the ability to do this today, or is this technology that we see emerging on the horizon where we want to see how it plays out? Do we have the architectural considerations associated with it? Is it something that we're comfortable with from a customer capability perspective, compliance perspective, risk perspective, et cetera? So there are a couple of different dimensions that we score on, and we take the aggregate score associated with that.

Penny Crosman (30:35):
Alright, well unfortunately we're out of time. Thank you so much to our panelists. Interesting discussion.