Q&A: Tackling Hidden Risks in Screening Programs
January 30, 2020
Here are the questions and answers from attendees at our recent webinar on tackling hidden risks in sanctions and watch list screening programs.
Q: How do you address false negatives and false positives?
A: False negatives are the hardest because how do you know when you’ve missed something? False positives are an easier question to deal with.
The issue with false positives is always that the more you refine your screening to reduce them, the greater your risk of having false negatives.
What I would suggest is that every institution determine the level of risk it is willing to accept for different lines of business and different types of screening.
When it comes to sanctions screening, it’s critically important you don’t have a false negative. You are going to have more false positives.
For Politically Exposed Persons (PEP) screening, you are going to end up having false negatives, especially when you get to the spouses and family members of lower-level PEPs. You’re going to miss them. And so the goal here is not perfection. While that would be great, it’s not realistic, so it’s really a matter of looking for the balance between those two things.
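The trade-off described above can be illustrated with a small sketch. It assumes a toy similarity score from Python's standard `difflib` module, which is not what any particular screening product uses, and the names, labels, and thresholds are all invented for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical watch list entry and customer names, invented for illustration.
WATCHLIST_NAME = "Aleksandr Petrov"
CUSTOMERS = [
    # (customer name, is this really the listed person?)
    ("Aleksandr Petrov", True),
    ("Alexander Petrov", True),
    ("Aleksander Petroff", True),
    ("Aleksandra Petrova", False),
    ("Alan Peters", False),
]


def similarity(a: str, b: str) -> float:
    """Toy similarity in [0, 1]; real screening tools use their own scoring."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def evaluate(threshold: float) -> tuple[int, int]:
    """Return (false_positives, false_negatives) at a given alert threshold."""
    false_positives = false_negatives = 0
    for name, is_listed_person in CUSTOMERS:
        alerted = similarity(name, WATCHLIST_NAME) >= threshold
        if alerted and not is_listed_person:
            false_positives += 1
        elif not alerted and is_listed_person:
            false_negatives += 1
    return false_positives, false_negatives


if __name__ == "__main__":
    for threshold in (0.0, 0.7, 0.85, 0.95, 1.0):
        fp, fn = evaluate(threshold)
        print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")
```

Raising the threshold can only remove alerts, so false positives never go up and false negatives never go down as it rises. Where to stop is the risk decision each institution has to make for itself.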
Q: Can you give an example with the Greek names you were mentioning?
A: I think everyone is familiar with the idea that computers run in ones and zeros. And so the way that those ones and zeros are converted into letters and numbers is through something called Unicode.
The capital letters A and B in Greek are different from the capital A and B in Latin, even though they look the same. They have different numbers associated with them, and the number is actually how the computer deals with each letter.
And so if someone using a Greek keyboard enters the letter A, the computer wouldn’t see that as the Latin letter A, and that’s where some of this complexity comes in.
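A short sketch makes the point concrete using Python's standard `unicodedata` module. The helper names `script_of` and `scripts_in` are hypothetical, and reading the script from the first word of a character's Unicode name is a rough heuristic rather than a complete script detector.

```python
import unicodedata

LATIN_A = "A"           # typed on a Latin keyboard
GREEK_ALPHA = "\u0391"  # looks the same, typed on a Greek keyboard


def script_of(ch: str) -> str:
    """Rough script label: the first word of the character's Unicode name."""
    return unicodedata.name(ch, "").split(" ")[0]


def scripts_in(text: str) -> set[str]:
    """Scripts used by the letters of a string (heuristic, letters only)."""
    return {script_of(ch) for ch in text if ch.isalpha()}


if __name__ == "__main__":
    for ch in (LATIN_A, GREEK_ALPHA):
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
    # The two letters look alike but are different code points.
    print(LATIN_A == GREEK_ALPHA)
    # A name that mixes scripts is one sign of this kind of mismatch.
    print(scripts_in("\u0391LEX"))
```

A plain string comparison treats the two letters as unrelated, so a name keyed with a Greek alpha will not match the same name stored with a Latin A unless the screening tool handles it explicitly.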
Q: There are names with as few as two characters. How do you name match for that?
A: There are names in many languages with a similar number of characters, like Mo or Jo. This is just a huge challenge with name matching. You have to do additional due diligence. You have to go and request additional information on the individual and their backgrounds. You may need to look up job records and those sorts of things, which are really going beyond normal name screening.
Note: There was a question from the audience about Japan changing how Japanese names are ordered in English documents to avoid some of the misunderstandings caused by differing naming conventions. Japanese names have conventionally been written given name first in English, but as of Jan. 1, 2020, official Japanese documents put the family name first when names are rendered in the Latin alphabet.
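Because the same name can arrive in either order, one simple precaution is to compare name parts without regard to order. The sketch below is a minimal illustration with hypothetical function names and an invented example name. It ignores repeated name parts, transliteration, and everything else a real matcher has to handle.

```python
def name_tokens(name: str) -> frozenset[str]:
    """Lower-cased name parts, ignoring the order they were written in."""
    return frozenset(name.lower().replace(",", " ").split())


def same_name_any_order(a: str, b: str) -> bool:
    """True when two names contain exactly the same parts in any order."""
    return name_tokens(a) == name_tokens(b)


if __name__ == "__main__":
    print(same_name_any_order("Tanaka Hanako", "Hanako Tanaka"))
    print(same_name_any_order("TANAKA, Hanako", "Hanako Tanaka"))
    print(same_name_any_order("Tanaka Hanako", "Tanaka Haruko"))
```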
Q: What text matching algorithm should systems use?
A: There is no perfect one. I think that’s why the vendor screening space exists today. Alessa makes a great product but they obviously have competition in the market and I think one of the reasons that there hasn’t been a single vendor that’s completely dominant is that they each look at this problem in slightly different ways. There’s always going to be give-and-take between different matching technologies.
There is no ideal technology for name matching. And the other thing to remember is that this isn’t even normal name matching. In most screening or search tools, the goal you are looking for is a match. So if you search Google, your ideal is that you get hits. In the case of AML screening, the goal really is that almost all cases should be no hits.
Your average reference database has somewhere below nine million profiles; most have three or four million. There are seven billion people in the world, so really you should almost never get a hit. Because the goal is no hits, that ends up being a much harder thing to build and to get to work properly.
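The arithmetic behind "almost never" can be worked through in a few lines. This is a back-of-the-envelope sketch that assumes one profile per listed person, so it gives an upper bound, and it uses the round figures quoted above.

```python
WORLD_POPULATION = 7_000_000_000  # round figure quoted in the answer


def listed_share(profiles: int, population: int = WORLD_POPULATION) -> float:
    """Upper bound on the share of people with a profile (one profile each)."""
    return profiles / population


if __name__ == "__main__":
    for profiles in (3_000_000, 4_000_000, 9_000_000):
        print(f"{profiles:>9,} profiles -> {listed_share(profiles):.3%} of people")
```

Even at the nine-million upper end, the share is a small fraction of one percent, which is why a true hit on a randomly chosen customer should be rare.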
Q: What’s the difference between structured and unstructured data?
A: Structured data means that either an individual, a group of people, or some sort of computational system has taken a bunch of data and put it into fields.
So think of World-Check, where there’s a last name in the last name field, the first name in the first name field, the date of birth in the date of birth field, and so on. The data has already been parsed out into nice neat fields.
You can think of this like a spreadsheet where every column is a different field.
Unstructured data is like a news article. It’s just words, sentences, and within that unstructured data is meaning and useful data.
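The difference can be sketched in a few lines. The `Profile` class, its field names, and the article text are hypothetical and invented for illustration. They are not taken from World-Check or any other vendor's data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Profile:
    """Structured: every piece of data sits in a named field."""
    last_name: str
    first_name: str
    date_of_birth: date


# Hypothetical record and article text, invented for illustration.
profile = Profile(last_name="Doe", first_name="Jane", date_of_birth=date(1970, 1, 1))
article = "Officials said Jane Doe, born in 1970, was questioned on Tuesday."

# Structured data can be read directly by field.
print(profile.last_name, profile.date_of_birth.year)

# Unstructured data has to be searched or parsed to find the same facts.
print("Jane Doe" in article)
```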
Q: How do you establish a fraud scheme in the sanctions screening portion of your potential fraud investigation?
A: At smaller institutions especially, the fraud and AML teams may be closer together, while at larger institutions we often see fraud as part of the general counsel’s office. So, for fraud you are really looking at illicit behavior within your client base, whereas with AML, you are trying to understand who these people are and uncover what they have done.
With fraud teams you will have people that have often worked in law enforcement doing deep investigations. It’s sort of tangential to AML, but in a lot of cases it is considered a whole different department.
Q: How do blockchain and dark web watchlists factor into screening?
A: There are a number of dark web scraping tools in the market that look for the names, addresses, and dates of birth of people who are involved in dark web activity. Some of the vendors are including that in their data.
A lot of institutions are looking at crypto-specific AML tools and crypto transaction monitoring tools. These are people who work specifically with FIs to help identify fraud and uncover risk in cryptocurrency.
One of the things that we’ve seen in the news is that a lot of the traffic and the usage of crypto has been forged as a way to get people to think that there’s more going on in crypto than there actually is.
Q: Often, unstructured data through news articles are not available when they’re displayed as a source of a positive match in screening. Does that make it impossible to verify with available information on the client to validate the match?
A: I think structured data sources will always include the sources that they’ve used, and those sources may not exist anymore. This is just a natural part of the internet. Some of the services will cache the old versions, and you can request access to those caches.
All of the data vendors will give you the ability to report back to them when you’ve got a link that’s broken, and they’ll go do the due diligence to find a live link for you.
Q: Should you look at your data providers to ensure they are referencing credible sources?
A: I think most of them have a blacklist rather than a whitelist, but you want to understand what they classify as reputable versus garbage.
One of the questions that comes up is blogs and how do you deal with that? Think of very high-risk countries where it can be difficult to get accurate news coverage – especially accurate political news coverage. Then you want to look at blogs. But how you validate those blogs becomes a difficult question.
Q: A number of data providers coming online at times offer what looks like an attractive price point, but you need to dig in and make sure that the lists or the information they’re providing will meet the needs of your organization. How do you deal with that?
A: If the list provider can’t give you a list of their sources, then you should really be suspicious. By sources I mean every link that they use: all of the sanctions lists, all of the regulatory enforcement and law enforcement lists, where they get their PEP lists, and how complete their PEP list coverage is.
If they can’t tell you their PEP numbers by country, that’s their risk.
What I’ve seen a lot of with these sorts of cheaper vendors is that they go very cheap, especially on things like PEP lists, where they’ll pull data from the CIA world leaders list, which is not a PEP list. It’s a list of chiefs of state and country ministers. It doesn’t include legislators or anything else.
You’ll see the sort of thing where they say they cover PEPs and then you dig deeper and you realize all that they really cover is the CIA world leaders list. Or, they’ll say they cover sanctions, but it’s just OFAC. I wouldn’t completely dismiss a less expensive vendor, but I would put the onus on them to prove themselves.
Q: Is there research or are there statistics on what a normal false positive rate is, such as a percentage for AML screening programs?
A: The issue is that there are no two financial institutions that are built the same and so the rate that you should see would be different. If you’re a community bank in the Yukon Territory in Canada, you should be seeing a very different rate of sanction hits than if you’re Citibank, or even if you’re a bank say in South Texas or in Miami that’s dealing a lot with cross-border payments from Latin America.
This is going to be focused on these questions: Where does your business operate? What kinds of services does your FI provide? Where are you located? Where is your client base located? How much onboarding do you do of new customers versus existing customers?
If you are a bank versus an FI, that’s going to change your risk profile. There are so many moving parts here that there really isn’t a one-size-fits-all answer.
The other thing to remember is that the higher you set your threshold for reducing false positives, the greater the chance that you will miss something. There isn’t a benchmark of perfection because there is no perfection here. You want to invest in tools that rely less on human intelligence and more on computational intelligence. Have people do the things that people are best at and computers do the things that computers are best at. The repetitious nature of screening makes it a difficult thing for humans to do accurately and effectively.
Q: Can you test or fine-tune your system to reduce the number of false positives?
A: One thing that you can do is get a random sampling of pre-decisioned data, meaning data that you know has been decisioned properly. Use a large sample of 5,000 to 10,000 properly decisioned records, then change your configuration, run that data through again, and see if it missed anything. Did it create false negatives? What was the risk of those false negatives?
You want to make sure that the data you pick is diverse across your institution. So you don’t use just high risk or low risk data. You want to get a good sample set across all of your lines of business.
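The replay approach described above can be sketched as follows. This is a minimal sketch under stated assumptions: the record layout, the field names, and the `alerts` function (a `difflib` score standing in for the screening engine under a candidate configuration) are all hypothetical, and a real test would use thousands of records rather than five.

```python
import random
from difflib import SequenceMatcher

# Hypothetical, already-decisioned alerts; names and fields are invented.
DECISIONED = [
    {"name": "Jon Smyth", "listed_as": "John Smith", "line_of_business": "retail", "true_match": True},
    {"name": "John Smith", "listed_as": "John Smith", "line_of_business": "retail", "true_match": True},
    {"name": "Joan Smithers", "listed_as": "John Smith", "line_of_business": "commercial", "true_match": False},
    {"name": "Ivan Petrov", "listed_as": "Iwan Petrow", "line_of_business": "commercial", "true_match": True},
    {"name": "Maria Lopez", "listed_as": "Mario Lopes", "line_of_business": "wealth", "true_match": False},
]


def stratified_sample(records, per_group, seed=0):
    """Take up to per_group random records from each line of business."""
    rng = random.Random(seed)
    groups = {}
    for record in records:
        groups.setdefault(record["line_of_business"], []).append(record)
    sample = []
    for group in groups.values():
        sample.extend(rng.sample(group, min(per_group, len(group))))
    return sample


def alerts(record, threshold):
    """Stand-in for the screening engine under a candidate configuration."""
    score = SequenceMatcher(None, record["name"].lower(), record["listed_as"].lower()).ratio()
    return score >= threshold


def replay(sample, threshold):
    """Return (missed true matches, false alerts) for a candidate threshold."""
    missed = [r for r in sample if r["true_match"] and not alerts(r, threshold)]
    false_alerts = [r for r in sample if not r["true_match"] and alerts(r, threshold)]
    return missed, false_alerts


if __name__ == "__main__":
    sample = stratified_sample(DECISIONED, per_group=2)
    for threshold in (0.6, 0.8, 0.9):
        missed, false_alerts = replay(sample, threshold)
        print(f"threshold={threshold}: missed={len(missed)} false alerts={len(false_alerts)}")
```

Each record in `missed` is a false negative the new configuration would have introduced, so those are the ones to review for risk before the change goes live.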
Q: Are there any independent or comparative analyses of the effectiveness and reliability of screening program software?
A: No. I think part of the reason why is that it would be very difficult to compare apples to apples. It’s not an absolute score. If CaseWare gives you an 83 percent match and Bob’s screening software gives you an 87 percent match, those two numbers aren’t comparable across different software. They are only relevant within that piece of software to compare different hits, and so things like thresholds are very dependent on a specific screening tool.
It’s not like toasters, where you can just set it to medium and see what kind of toast you get.
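To see why scores from different tools are not comparable, consider two simple similarity measures applied to the same pair of names. Both scorers here are hypothetical stand-ins, not the algorithm of any real product: one uses `difflib`'s matching-block ratio and the other uses Jaccard overlap of character bigrams.

```python
from difflib import SequenceMatcher


def ratio_score(a: str, b: str) -> float:
    """Tool A (hypothetical): difflib's matching-block ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def bigrams(text: str) -> set[str]:
    """Set of adjacent character pairs in the lower-cased text."""
    text = text.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def bigram_score(a: str, b: str) -> float:
    """Tool B (hypothetical): Jaccard overlap of character bigrams."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Strings too short to have bigrams: fall back to exact comparison.
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(x & y) / len(x | y)


if __name__ == "__main__":
    pair = ("Jon Smyth", "John Smith")
    print(f"tool A: {ratio_score(*pair):.0%}")
    print(f"tool B: {bigram_score(*pair):.0%}")
```

Both functions return a number between 0 and 1 for the same two names, yet the numbers differ substantially, so a threshold tuned for one scorer would be meaningless for the other.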
Q: Do you find a difference in the approach to domestic PEP hits vs foreign PEPs?
A: The approach to PEPs should be country-by-country. While FATF guidelines for PEPs are generally accepted, there are significant differences regarding PEP definitions. In Canada, for example, local PEPs include any leader of any city, town, or village, regardless of size. In most other jurisdictions, the limit is set to 100,000 people or more.
In Mexico, candidates for office are considered PEPs until the election is over. One of the challenges with building a PEP list for an FI is staying on top of all of these changing rules and regulations.
Q: Is it common practice for supervisory authorities to require extremely detailed investigations and an audit trail for false positive alerts?
A: You should work with your regulator to understand and identify their requirements for investigations. Regulators are usually very open to answering these questions.