Consumers might feel comfortable asking OpenAI's ChatGPT or Google's Bard how they should spend a bonus check. Asking that same question to most bank chatbots might not yield an answer so easily.
Most bank chatbots trail far behind what generative AI-powered assistants like ChatGPT or Bard can do, according to a study research firm Corporate Insight shared exclusively with American Banker. This finding, the firm suggests, should concern banks striving to meet evolving customer expectations and stay ahead of their peers.
"The average industry chatbot is not sophisticated. It [handles] really basic tasks, basic account servicing," said Grant Easterbrook, a consultant at Corporate Insight and one of the study's authors. ChatGPT and Bard can take that further by being able to explain financial concepts and pull knowledge of averages and trends in a way industry chatbots can't do, he added.
Testing the chatbots against ChatGPT and Bard
This summer, Corporate Insight tested ChatGPT (the GPT-4 version), Bard and six chatbots offered by leading financial services firms: Bank of America, JPMorgan Chase, Charles Schwab, Capital One, Fidelity and U.S. Bank. The company asked each of the eight assistants more than 100 questions, including requests to explain financial concepts and planning and budgeting queries.
Since banks do not yet make customer account data available to public-facing generative AI assistants, the comparison wasn't always apples to apples, but it was possible to approximate account tasks by asking ChatGPT or Bard how to do something while mentioning a specific institution name, Easterbrook said. One example would be "How can I find account alerts on the Bank of America website?"
Capabilities varied by bank, but overall, industry chatbots were able to handle basic account questions and tasks, including replacing lost cards, making money transfers, pulling up account information and answering queries about website or app navigation, the study found. However, certain bank chatbots frequently failed when asked to make "relatively straightforward data pulls," such as how much the client received in dividends or what their five most recent trades were.
Few industry chatbots could handle information or account servicing requests within the chat window.
"With most industry chatbots right now, you ask a question and it will just give you a link to the page to go do it," said Easterbrook. "We're starting to see a few chatbots do things like actually letting you make a transfer within the chat, or actually making you lock a card within the chat because the chat is actually an assistant as opposed to kind of like a link tool."
Bank of America's Erica, for example, was able to handle servicing requests within the chat window and display data and visualizations in the chat. Meanwhile, Fidelity's chatbot was able to easily navigate between different account types.
Generative AI's edge on financial advice
Bank chatbots were not able to handle questions that required pulling and analyzing data from external sources, which limited their ability to answer questions about financial concepts or provide tailored advice based on a client's overall financial situation.
For example, ChatGPT was able to provide a detailed answer to a question about how much the average American pays on car insurance a year. Generative AI tools also excelled at tasks that require some form of benchmarking. Bard was able to offer tailored advice on budgeting in response to a question about how a user can spend less on dining given the average amount spent per resident in a particular city.
Generative AI assistants also were able to handle queries that used informal language or had spelling errors, recognizing all of Corporate Insight's test questions that deliberately used slang and informal language. Among the industry chatbots tested, none recognized all of the test questions with informal language, errors or slang.
"When you think about the way that chatbot experiences exist today, the majority of them are very focused on answering simple questions. The majority of them are probably pretty rules based," said Tiffani Montez, principal analyst at Insider Intelligence. "When you think about generative AI, the reason that it seems to be better at explaining concepts is because it's conversational," with a large-language model ingesting and analyzing a broader trove of information.
The road ahead
For banks, a key hurdle to introducing generative AI into their chatbot experiences, said Montez, is data hygiene.
"How do you consolidate everything that you know about a consumer, whether it is in your system of record around their transactions or documents that you may have shared with them or any interactions that you've had with them?" she said.
Before rolling out such tools to users, banks also need to work out how to ensure that customer data authorized for a particular use case isn't inadvertently used elsewhere without user consent, said Jacob Morgan, principal analyst at Forrester Research.
For these and other reasons, the era of ChatGPT-style generative AI capabilities embedded into individuals' mobile banking apps is still far off, some say.
"It is very exciting. The possibilities are endless in servicing and potentially in advice, but [it's] not ready for prime time as much as it would be potentially in other sectors because of the sensitivity of the domain that we operate in," said Foteini Agrafioti, chief science officer at the Royal Bank of Canada and head of Borealis AI, the bank's dedicated AI research center.
RBC first launched its AI assistant, Nomi, in 2017 to offer customers account information and personal finance insights. Ask Nomi, an in-app chatbot, can handle basic account questions, and the RBC mobile app also offers asynchronous chat with human advisors.
On the generative AI front, RBC is testing how the technology can help employees assist clients. To that end, the bank is developing applications built on large language models to support employees who need to consult a host of policies and procedures when advising clients. The offering is currently a proof of concept and will be deployed later this year, said Agrafioti.
"We're deploying this right now in order to navigate policies and procedures so that people can query that corpus of data and get to the right answer very quickly," she said. "We find that that is a good, solid first step towards client servicing because it allows us to have human oversight, and that is very, very important to us."
In addition to the client servicing effort, the bank has also developed in-house large transaction models that can be used to make predictions across a range of use cases, including fraud detection, risk modeling and service personalization, she said.
Asked when generative AI tools would get into the hands of consumers, Agrafioti said the timing would depend on a number of challenges being addressed.
"We just have to make sure that it meets our standards of safety, and maybe the answer is that there's a line that says 'For these types of interactions, we're comfortable, but when it comes to that, we will never hand that off to an autonomous agent,'" she said. "We're trying to figure out where that line is."