Corporate Crime and Investigations: How tech can help (or hinder) the fight against corporate crime

22 January 2024

It’s an exciting time to work in the field of corporate crime investigations and, more broadly, the disputes and contentious matters space. AI, machine learning, data analytics and other emerging technologies are turbocharging our ability to prevent, detect and investigate crime. However, these developments also present several risks and limitations that professionals in the field need to be alert to.

In this episode of our mini-series, an expert panel discusses how some of these emerging technologies are being applied in investigations, and how to mitigate the associated risks. Host Adam Jamieson, a partner in Ashurst’s Dispute Resolution team, is joined by Charlotte Miller, Managing Director and Head of Financial Crime Digital at HSBC, along with Ashurst Risk Advisory partner and Chief Digital Officer, Tara Waters, and Ashurst partner and Data Analytics Lead, Matt Worsfold.

Together, the panel discusses current uses of generative AI, data security, data integrity, data visualisation, and more – then considers what to expect in the years ahead. Charlotte also explains HSBC’s supervised machine learning model that is taking transaction monitoring to the next level by replacing the use of fixed base binary rules with probability and propensity capabilities.

To follow this continuing mini-series about corporate crime and investigations, subscribe to Ashurst Legal Outlook on Apple Podcasts, Spotify or wherever you get your podcasts.

The information provided is not intended to be a comprehensive review of all developments in the law and practice, or to cover all aspects of those referred to. Listeners should take legal advice before applying it to specific issues or transactions.



Hello, and welcome to Ashurst's Corporate Crime and Investigations Podcast Series, a series where we discuss various aspects of corporate crime and investigations. As part of this podcast series we'll bring you the discussion, debate, and insight shared at our 2023 Investigations Focus Events.

My name is Adam Jamieson and I'm a partner in the dispute resolution team, specialising in advising on internal investigations and regulatory investigations. And I'm pleased to be joined today by an illustrious panel.

Firstly, we've got Tara Waters, a partner, and our first chief digital officer and head of Ashurst Advance Digital, the technology arm of our firm. We also have with us Matt Worsfold, a partner in Ashurst Risk Advisory and the data and analytics lead. We're also delighted to be joined today by Charlotte Miller, a managing director and the head of Financial Crime Digital at HSBC. So welcome to all the speakers.

We're all very keen to kick off episode three, which is about exploring AI use in investigations. From the current landscape, the benefits and risks in the newest technologies, to deployment of data analytics, and how institutions are utilising these technologies in investigations.

So without further ado, perhaps I can start with you, Tara, and maybe you can just give us a bit of a flavour as to what the current landscape's like and what the recent developments have been in relation to AI in investigations.


Sure. Well, I think to start, the investigations space and perhaps more broadly, the disputes and contentious matters space, is an area that's relatively mature in terms of adoption of AI to help support the conduct of that type of work.

So there's a long history and a long list of well-established and mature vendors who have supported what has historically started out as eDiscovery software and eDiscovery work streams as we moved from paper-based data rooms and documentation into the digital and virtual world.

In the investigations space in particular, we've also seen the use of AI to help understand huge amounts of data, whether that be coming from documents or analysing and processing transactions happening across the world.

So I think it's an area that's relatively mature. We've got a lot of technology providers that service the space, but what's happened probably over the past 12 months has been a reinterest in AI through the launch of generative AI to the public. And that really happened with the launch of ChatGPT and OpenAI really bringing generative AI into the forefront, notwithstanding it's actually been around for a few years.

So what we've seen is a proliferation of existing vendors as well as new vendors starting to look at this new capability, generative AI, the adoption of large language models alongside existing proven models in the broader machine learning space. And now we're at a place where we're seeing some really interesting developments in terms of how generative AI technologies are effectively supercharging the way that investigations work can be done.


Thanks, Tara. I mean, as you say, it's such an exciting period because there's all these new technological advancements emerging. And whilst obviously we're interested in all the opportunities that that can bring, I mean, what do you think are the key risks that need to be mitigated in order for us to fully utilise AI in investigations?


Well, certainly there's always going to be the common and traditional risks relating to technology more broadly in terms of data security and really understanding exactly how the technology is being implemented, where it lives, what the hosting looks like, and how the data is being moved around. Certainly in an investigations context, the data that's being looked at, it's highly sensitive and confidential, and organisations want to be sure that they've got their arms wrapped around all of those security concerns.

But perhaps from a more practical perspective, generative AI in particular introduces some new risks that we haven't really had to deal with in terms of the other forms of AI that have been commonly used for several decades in fact. And I think the main thing being that it's generative AI. This is AI that actually creates, and it's designed to create very human-like reasonable, rational responses.

And it's quite easy if it's implemented in a very simplistic way for the outputs of generative AI technology to look very reasonable and human-like and to be mistaken for a complete response and answer.

And that's really where the risks start to come into play because we really need to make sure that we're using this technology not as a replacement for humans, but actually to supplement the way humans are doing the investigations work.

So that includes making sure that people that are starting to use technologies that implement generative AI understand the risks, understand how it needs to be used, know that the output is not the finished article, but actually is an input for them to then use their human minds to actually understand what's been surfaced up, to try to understand what that means in terms of the investigation and what decisions they need to take.

And the risk really is that the people using the technology don't understand that, they don't appreciate it, and that the technology potentially isn't implemented in a way that allows the workflows that would happen in a more traditional context where those second reviews and human eyes in the loop start to fade away. And it's really important that structurally the way in which work is done when you're using this type of technology, that all of those safeguards remain in place.


Matt, as the data guru, what are you seeing in terms of how data's being utilised at the moment to enhance the effectiveness and outcomes of investigations? And I guess the second part of that is, and what about AI? And how does AI fit into that picture?


Yeah, so it's worth noting, I mean, organisations are generating and storing massive amounts of data, more than ever before. And with that comes some amazing opportunities for investigators in terms of being able to build a richer picture over a longer period of time.

For example, if an investigation's focusing on the conduct of a particular individual, you can start to piece together the actions, the conversations, the behaviours that are happening over a period of time, which in combination can provide really crucial context for that investigation.

But it also comes with some challenges, for example, in sourcing that information. So where do you get it from? How do you collect it? And how do you collect it in a way that upholds the integrity of an investigation? But more crucially, where do you focus the efforts when you are looking through that rich dataset? Because really, the more data that gets collected, the more chance you have of finding the needle, but the more hay there is to sift through.

And so really it's about being able to leverage the vast amounts of data, but also the range of different data sources that may sit in either structured systems or in the unstructured data world. So thinking about emails, chats, messages, texts, images, video, for example.

Organisations are also turning not only to their internal data, but also to external data sources, so anything that's kind of open source or publicly available. And that could be social media, that could also be company registries for example, to get, again, a richer picture of the subjects of investigations.

But really, the main aim is how do you get the most relevant, highest-priority datasets into the hands of investigators to really focus the investigation so that it can be carried out in the most efficient way possible? And that's really where AI comes into play, not only from the ability to be able to analyse that variety of data sources that I mentioned, but also helping investigators shift away from what we would term rules-based analytics.

So historically, it would be about taking your dataset, identifying what to look for, so it could be keywords or standard expressions, and then running those through the dataset and reviewing everything that hits off the back of that. It's really about shifting that through the use of AI and ML into more dynamic analysis, so leveraging systems that are able to evolve what they look for, what they deem relevant and learn through the course of an investigation.
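As a rough illustration of the rules-based approach described above, here is a minimal Python sketch. The messages and search patterns are hypothetical and invented purely for this example; it is not drawn from any tool the panel mentions. It only shows the general idea of running fixed keywords or expressions over a dataset and reviewing everything that hits.

```python
import re

# Hypothetical messages and search terms, invented purely for illustration.
MESSAGES = [
    "Please delete the invoice before the audit.",
    "Lunch at noon?",
    "Keep this off the record and use my personal phone.",
]
PATTERNS = [r"\bdelete\b", r"off the record", r"\bpersonal (phone|email)\b"]


def rules_based_hits(messages, patterns):
    """Return every message that matches at least one fixed pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [m for m in messages if any(c.search(m) for c in compiled)]


# Every hit goes to a reviewer; the rules never change during the review.
hits = rules_based_hits(MESSAGES, PATTERNS)
for message in hits:
    print(message)
```

The limitation is visible in the sketch: the pattern list is static, so anything phrased differently from what the investigators anticipated is never surfaced.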

And that really relies on leveraging things like more sophisticated natural language processing. For example, being able to identify sentiment in correspondence between parties to provide these risk indicators. It could be identifying patterns of language that may appear unusual in the context of other communications. And again, that may indicate things like collusion. And also being able to map the relationships and interactions between actors in a conversation.
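The relationship mapping mentioned above can be sketched very simply. The following Python snippet assumes a hypothetical list of (sender, recipient) pairs and just counts how often each pair of people communicates; real platforms would work from far richer metadata, so treat this as a toy illustration only.

```python
from collections import Counter

# Hypothetical (sender, recipient) pairs, invented for illustration.
MESSAGES = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("alice", "bob"),
]


def interaction_counts(messages):
    """Count messages exchanged between each unordered pair of people."""
    return Counter(tuple(sorted(pair)) for pair in messages)


counts = interaction_counts(MESSAGES)
# Pairs with the most traffic are listed first.
for pair, n in counts.most_common():
    print(pair, n)
```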

And these are all things that really would've been performed manually in the past, and that would've increased time, would've increased cost, and would've introduced the chance of human error as well, and AI's helping us to eliminate a lot of these things.

It's also leveraging things like active learning, and that's about, again, those AI and ML systems learning throughout an ongoing investigation in order to be able to surface what it deems to be relevant information for the purposes of that investigation. And many of these platforms are utilising probabilistic scoring mechanisms, which is about not necessarily a binary outcome, is this relevant or not, but how relevant is that piece of information for that investigation based on what you're looking for?

And really, as investigators proceed through that investigation, it's about understanding, well, what is relevant? And feeding that back into the system so that it can learn, it can evolve, and again, start to prioritise what needs to be reviewed and where really to look from an investigator standpoint.
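The feedback loop described above, where reviewer decisions are fed back so the system scores relevance as a probability rather than a yes or no, might look something like the following minimal Python sketch. It uses a small naive Bayes word model as a stand-in; the class name, method names and sample texts are all assumptions made for illustration, and commercial review platforms will use considerably more sophisticated models.

```python
import math
from collections import Counter


class RelevanceScorer:
    """Tiny naive Bayes text scorer that is updated with reviewer feedback."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def feedback(self, text, relevant):
        """Record an investigator's relevant / not relevant decision."""
        self.doc_counts[relevant] += 1
        self.word_counts[relevant].update(text.lower().split())

    def score(self, text):
        """Return an estimated probability (0 to 1) that the text is relevant."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_rel = sum(self.word_counts[True].values())
        total_irr = sum(self.word_counts[False].values())
        # Start from the (smoothed) prior odds of a document being relevant.
        log_odds = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in feedback carry no evidence
            p_rel = (self.word_counts[True][word] + 1) / (total_rel + len(vocab))
            p_irr = (self.word_counts[False][word] + 1) / (total_irr + len(vocab))
            log_odds += math.log(p_rel / p_irr)
        return 1 / (1 + math.exp(-log_odds))


def prioritise(scorer, documents):
    """Order unreviewed documents so the most likely relevant come first."""
    return sorted(documents, key=scorer.score, reverse=True)


scorer = RelevanceScorer()
scorer.feedback("wire the payment offshore", relevant=True)
scorer.feedback("team lunch on friday", relevant=False)

queue = prioritise(scorer, ["lunch on friday", "payment offshore tomorrow"])
print(queue)
```

Each further call to `feedback` shifts the scores, so the review queue is reprioritised as the investigation progresses rather than being fixed at the outset.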

So the big shift with AI in the context of an investigation is really the embedding of it into readily available platforms, many of which have these natural language processing capabilities built in or have things like prebuilt machine learning models that are available to be leveraged out of the box. And really, that means less reliance on technical data specialists and easier access to these technologies and capabilities for investigation teams, ultimately democratising this AI capability to broader stakeholders.


Thanks, Matt. Some really exciting developments there for investigators.

Charlotte, so great to have you on the podcast today. We know that you've been involved in utilising new tech to fundamentally change the bank's core approach to financial crime detection. Could you give us a bit more detail about the technology that the bank have been utilising in this respect?


Yeah, sure. So thank you for inviting me, first off. So in early 2018, HSBC's Group Executive Committee made a commitment to continue to improve our fight against financial crime by investing in bleeding edge technology and being at the forefront of its utilisation and deployment across the industry.

And the most significant implementation of that has been around the utilisation of a product that's called Google AML AI, which we built in conjunction with our colleagues at Google Cloud Platform and is now generally available. That has now replaced traditional rules-based transaction monitoring, so periodic lookback transaction monitoring, for over 80% of our global client base now to great effect. And we call that our Dynamic Risk Assessment, or the DRA. And the DRA essentially is, at its heart, a supervised machine learning model that looks at propensity for financial crime risk across our customer base. Given the compute power available to us on cloud, we can sometimes look at up to three years' worth of data for certain features or risk indicators, but anywhere between one year, 18 months and two years' worth of transaction data is standard, across a full range of balanced features, to assess that financial crime propensity.

There's been a major business transformation around that as well, which I'm sure you'll ask me some more questions about. The results really are amazing: compared to when we were running rules-based historical transaction monitoring, we've reduced our caseload by between 40 and 60% around the group, and our actual realised financial crime risk has increased nearly threefold. So the numbers really speak for themselves.

And that crystallised financial crime risk is across a number of typologies and indicators, so we are super excited and do believe that we're the first major bank that has completely removed the use of fixed base binary rules in favour of probabilistic scoring to great success.
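To illustrate the general difference between a fixed binary rule and a probabilistic score, here is a minimal Python sketch. It is emphatically not a description of HSBC's DRA or of Google AML AI, whose internals are not discussed in this episode. The customers, feature names, weights and threshold are all hypothetical, and in a supervised model the weights would be learned from labelled outcomes rather than set by hand as they are here.

```python
import math

# Hypothetical customers and features, invented for illustration only.
CUSTOMERS = {
    "A": {"max_txn": 12000, "cash_ratio": 0.05, "new_counterparties": 1},
    "B": {"max_txn": 9500, "cash_ratio": 0.90, "new_counterparties": 14},
}

# Hand-set weights standing in for what a trained model would learn.
WEIGHTS = {"max_txn": 0.0001, "cash_ratio": 3.0, "new_counterparties": 0.2}
BIAS = -4.0


def fixed_rule(features, threshold=10000):
    """Binary rule: alert if any single transaction reaches the threshold."""
    return features["max_txn"] >= threshold


def propensity(features):
    """Logistic score: combine several indicators into a 0 to 1 estimate."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1 / (1 + math.exp(-z))


for name, features in CUSTOMERS.items():
    print(name, fixed_rule(features), round(propensity(features), 3))
```

In this toy data the fixed rule alerts on customer A, who simply made one large payment, and misses customer B, who stays just under the threshold but looks riskier across the other indicators. The score ranks B well above A, which is the kind of prioritisation a propensity-based approach aims for.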


That's really fascinating, Charlotte. I mean, I know that the challenges that there have been over the years with fixed base rule systems and whether they're too sensitive, whether they're not sensitive enough, and dealing with the resource issues that can flow from that to ensure that all the alerts are reviewed in a timely manner have created lots of challenges.

I mean, moving away from that type of system and utilising these new technologies, how are you overcoming some of the challenges associated with using AI instead in terms of explainability and accuracy?


Sure. Well, I think the first thing I'd say, and for any organisation that your listeners may be advising or supporting, or even if you're looking at it in your own industry and your own organisations, don't just treat this as a technology change programme. It's a major business transformation programme as well. You need to look at your control framework, you need to look at your culture, you need to look at your operating structure, you need to look at your training.

Alongside implementing the DRA, we've also completely changed our investigative operating model. We knew that we would be vastly reducing false positives, and we needed to therefore have less discounting of false positives going on manually to create more propensity for looking at more complicated high-priority cases to interpret that explainability that comes out of the model.

If you think about the volumes of data that I'm talking about for up to three years' worth of transaction history for 30 million clients in just HSBC UK, never mind 80% of the HSBC Group, that's not data that the human eye and brain can cope with easily.

So investing in data visualisation tools and developing them hand in hand with our investigators has been a really key part of the journey. Educating our risk stewards, our risk owners, and our regulators into the inherent risk-based approach that you're taking whenever you go down a probabilistic technology route rather than a deterministic technology route has been a huge part of what we've been doing for the last three, four years, and it's been a pleasure to be in the leadership seat for that programme.


Thanks, Charlotte, for sharing those experiences.

I mean, looking ahead to the future and what's coming down the track, which feels strange in a way because we've already seen so many developments so recently, but if we were to look ahead, Tara, three years, five years down the track at how AI is being utilised in investigations, what do you think it looks like?


I think we'll definitely see increased use of AI, different types of AI, including generative AI. From a technology perspective I mentioned there are a number of very mature providers in this space, and what we've seen proliferate over the past six to nine months have been a lot of new providers that have been very quickly able to get up really credible solutions attacking aspects of the investigation's overall workflow, really ably leveraging large language models. So I think we might see a shift in terms of the technology vendor ecosystem and potentially some new winners in that space.

Certainly we know the existing vendors are looking to increase their own generative AI capabilities, but the new providers are moving quite nimbly, so that's very exciting to see, and it's certainly an area that we at Ashurst have been following and trialling some of those new providers.

I think also one of the great things that generative AI in particular has brought to the table is the ability to move quickly. So what we're seeing is a number of organisations, including Ashurst, looking at building their own solutions.

So we've seen what HSBC has done relatively quickly over the past year with their new solution partnering with Google and the access to these large language models, many of which are in general freely available or available quite easily through APIs and integrations.

I think we're hopefully going to see a lot of organisations building their own solutions and not relying solely on third-party vendors, especially when you have considerations around data security, data protection, et cetera, the ability to more quickly leverage one of these LLMs and start to build your own bespoke and purpose-built solutions is definitely increasing, and I think hopefully we're going to hear over the next few years a number of other exciting new solutions being created.


Thanks, Tara. Well, I'm sure we could have talked about this topic for most of the day, but unfortunately that's all we've got time for in this podcast. So many thanks to Charlotte, Tara, and Matt for joining me on this episode.

If any of our listeners want to get in touch with us, then our details are on the Ashurst website at ashurst.com. And if you'd like to learn more, look out for the next podcast in the series where we'll be exploring global enforcement trends, a roundup and horizon scan.

To ensure you don't miss any future episodes, do subscribe now on Apple Podcasts, Spotify, or your preferred podcast platform. While you're there, please leave us a rating or a review. We'll appreciate it. Thanks for listening.
