Systemic Fraud Risk Analysis

What are we looking for?

This is a statistical analysis designed to indicate whether systematic fraud may be present in a set of transactions.

What does this show?

This analysis is based on Benford's Law, a mathematical result about the leading digits of real-life datasets, in which first digits are not uniformly distributed. It applies best to data whose values span several orders of magnitude. In essence, it predicts that numbers will start with 1 more often than with 2, with 2 more often than with 3, and so on. The probability that a number's first digit is d is given by:

P(d) = log10(1 + 1/d), for d = 1, 2, ..., 9
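
As a minimal sketch (not taken from the analysis tool itself), the expected proportions can be tabulated in Python from the standard Benford formula log10(1 + 1/d). The function name benford_expected is a hypothetical one, chosen here for illustration only.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected proportion of values whose first digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Expected share for each of the nine possible leading digits.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

The proportions fall from roughly 30.1% for a leading 1 to roughly 4.6% for a leading 9, and together they sum to 1.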

The expected distribution predicted by Benford's Law is shown in grey, with the observed data highlighted. We would not expect a perfect fit, but the larger the dataset, the more closely the observed data should match the expected distribution.

When it doesn't match, what does this mean?

If there is a significant anomaly, this points to some unusual activity skewing the data. For example, there may be incentives for transactions of a particular size or value, or there may be a set of fraudulent transactions: it is remarkably difficult to 'fake' transactions that fit the Benford distribution.

How is the analysis performed?

In this analysis every transaction is examined and placed into one of nine 'buckets' according to its first digit (1 to 9). The bucket counts are then graphed alongside the expected Benford curve, so that any anomalies stand out.
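
The bucketing step described above could be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the helper names leading_digit and bucket_counts are hypothetical, the sample amounts are invented, and the sketch assumes that the sign of an amount is ignored and that zero amounts are skipped.

```python
from collections import Counter
from decimal import Decimal


def leading_digit(amount):
    """Return the first non-zero digit of `amount`, or None if there is none."""
    # Going through str() and Decimal avoids binary floating-point artefacts.
    # The digit tuple carries no sign or decimal point, so -310 and 0.0042
    # are handled the same way as 310 and 42.
    for digit in Decimal(str(amount)).as_tuple().digits:
        if digit != 0:
            return digit
    return None  # zero (assumption: such amounts are left out of the buckets)


def bucket_counts(amounts):
    """Count how many amounts fall into each of the nine leading-digit buckets."""
    counts = Counter()
    for amount in amounts:
        digit = leading_digit(amount)
        if digit is not None:
            counts[digit] += 1
    # Always report all nine buckets, including empty ones.
    return {d: counts.get(d, 0) for d in range(1, 10)}


# Invented sample amounts, for illustration only.
sample = [120.50, 1899, 23.10, 0.0042, -310, 17, 0, 905.75]
counts = bucket_counts(sample)
print(counts)
```

Dividing each count by the total number of bucketed transactions gives the observed proportions that are plotted against the expected curve.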

Will this detail which transactions are fraudulent?

As a statistical approach, this gives an initial indication of whether systematic fraud is likely to be present in the dataset. Seeing which digits are most anomalous may give a clue as to where to look next, but it is not possible to drill down directly from this overview to individual transactions.
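
One simple way to see which digits are most anomalous is to rank the buckets by the gap between their observed and expected shares. The sketch below assumes this absolute-difference ranking, which is an illustrative choice rather than the measure this analysis necessarily uses. The function name rank_digit_anomalies is hypothetical and the bucket counts are invented.

```python
import math


def rank_digit_anomalies(counts):
    """Rank digits 1 to 9 by how far their observed share is from Benford's."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no transactions to analyse")
    rows = []
    for digit in range(1, 10):
        expected = math.log10(1 + 1 / digit)
        observed = counts.get(digit, 0) / total
        rows.append((digit, observed, expected, observed - expected))
    # Largest absolute gap first; the sign shows over- or under-representation.
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


# Invented bucket counts, for illustration only.
counts = {1: 250, 2: 160, 3: 110, 4: 200, 5: 80, 6: 70, 7: 55, 8: 45, 9: 30}
ranked = rank_digit_anomalies(counts)

for digit, observed, expected, gap in ranked[:3]:
    print(f"digit {digit}: observed {observed:.1%}, expected {expected:.1%}, gap {gap:+.1%}")
```

In this invented example the digit 4 stands out as heavily over-represented, which would suggest looking first at transactions whose amounts begin with 4. The ranking only points to a bucket, not to individual transactions.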