Humans are remarkably good at reading documents and understanding context. They are remarkably bad at scanning thousands of transactions for subtle patterns. A bookkeeper processing 500 transactions will catch an obviously wrong amount. They will not notice that the same supplier has been paid twice with invoice numbers that differ by a single digit, or that a vendor's bank details changed two days before a large payment, or that expense claims follow a pattern that suggests systematic manipulation.
AI anomaly detection is built for exactly these tasks. It excels at the pattern recognition that human cognition is not designed for.
What Anomaly Detection Means in Accounting
Anomaly detection is the identification of data points that deviate from expected patterns. In accounting, these deviations can indicate errors, fraud, or process failures. The key categories include:
Duplicate payments. The same invoice paid twice, sometimes with slightly different references. The Association of Financial Professionals estimates that duplicate payments affect 0.1% to 0.5% of all invoices, but at scale this represents significant money. Businesses processing 10,000 invoices annually might make 10-50 duplicate payments without detection.
Unusual transaction patterns. A sudden increase in expenses from a particular category. Payments to a new vendor at unusual times. Transactions that fall just below approval thresholds, suggesting someone is splitting larger payments to avoid oversight.
Outlier amounts. A utility bill that is three times the normal amount. A supplier invoice that is 20% higher than the contracted rate. An expense claim for EUR 499 when the approval-free limit is EUR 500.
Timing anomalies. Invoices dated on weekends or holidays. Payments processed at unusual hours. Month-end spikes in expense claims. These patterns do not necessarily indicate fraud, but they warrant investigation.
Vendor fraud patterns. Ghost vendors (fictitious suppliers created to divert payments), vendor master changes (bank details altered before a payment run), and vendor relationships (a vendor sharing an address or bank account with an employee).
Statistical Methods vs Machine Learning
Traditional anomaly detection in accounting uses statistical methods. Benford's Law analysis examines whether the first digits of transaction amounts follow the expected logarithmic distribution, in which a leading digit d appears with frequency log10(1 + 1/d). Deviations from Benford's Law can indicate fabricated or manipulated numbers. The technique has been used in forensic accounting for decades and was famously applied, after the fact, to the reported figures in the Enron scandal.
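A Benford check can be sketched in a few lines of pure Python. The `fabricated` amounts below are invented for illustration; real analyses would run over thousands of transactions.

```python
import math

def first_digit(amount):
    """Leading significant digit of a positive amount (e.g. 0.042 -> 4)."""
    return int(f"{amount:e}"[0])

def benford_deviations(amounts):
    """Observed minus expected first-digit frequency under Benford's Law.

    The expected frequency of leading digit d is log10(1 + 1/d); large
    positive deviations mean a digit is over-represented.
    """
    counts = {d: 0 for d in range(1, 10)}
    for a in amounts:
        counts[first_digit(a)] += 1
    n = len(amounts)
    return {d: counts[d] / n - math.log10(1 + 1 / d) for d in counts}

# Fabricated numbers often cluster around certain leading digits;
# genuine amounts tend to follow the logarithmic Benford curve.
fabricated = [512.0, 540.0, 575.0, 590.0, 505.0, 520.0, 610.0, 530.0]
dev = benford_deviations(fabricated)
print(max(dev, key=dev.get))  # → 5 (heavily over-represented here)
```

In practice the deviations would be tested with a chi-squared or similar goodness-of-fit statistic rather than eyeballed.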
Z-score analysis identifies transactions whose amounts fall more than a specified number of standard deviations from the mean for that category. If the average monthly telecoms expense is EUR 250 with a standard deviation of EUR 30, a bill of EUR 450 (6.7 standard deviations from the mean) triggers an alert.
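The arithmetic behind that alert is one line; the EUR figures below come from the example in the text.

```python
def z_score(value, mean, std):
    """Number of standard deviations a value lies from the category mean."""
    return (value - mean) / std

# Worked example from the text: telecoms expenses averaging EUR 250
# with a standard deviation of EUR 30. A EUR 450 bill is far outside
# a typical 3-sigma alert threshold.
z = z_score(450, 250, 30)
print(round(z, 1))  # → 6.7
```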
These statistical methods work but have limitations. They require predefined rules and thresholds. They treat each transaction in isolation rather than understanding relationships between transactions. They cannot learn or adapt without manual recalibration.
Machine learning-based anomaly detection addresses these limitations through two fundamentally different approaches.
Supervised Learning: Learning from Known Fraud
Supervised anomaly detection models are trained on historical data that has been labelled: this transaction was legitimate, this one was fraudulent, this one was a duplicate payment. The model learns the characteristics that distinguish anomalous transactions from normal ones.
The challenge with supervised learning for fraud detection is data imbalance. Fraudulent transactions are, by definition, rare. A dataset might contain 100,000 legitimate transactions and 50 fraudulent ones. Training a model on this imbalanced data tends to produce a model that classifies everything as legitimate (achieving 99.95% accuracy while catching zero fraud).
Techniques to address this include:
- Oversampling minority class examples using methods like SMOTE (Synthetic Minority Over-sampling Technique), which generates synthetic fraudulent examples based on the characteristics of real ones
- Cost-sensitive learning, where the model is penalised more heavily for missing a fraud case than for flagging a legitimate transaction
- Ensemble methods that combine multiple models, each trained on different balanced subsets of the data
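The SMOTE idea, generating synthetic minority samples by interpolating between a real fraud example and one of its nearest minority-class neighbours, can be sketched in pure Python. The toy feature vectors and the `k` value are assumptions; a production system would use a library such as imbalanced-learn rather than this illustration.

```python
import random

def smote_oversample(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: synthesise minority-class points by linear
    interpolation between a sample and one of its k nearest minority
    neighbours. Illustration only; not the full published algorithm."""
    rng = random.Random(seed)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist(base, p))[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # interpolation fraction along the line base -> nb
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, nb)))
    return synthetic

# Two toy features per transaction: amount and hour of day (invented data).
fraud = [(4800.0, 2.0), (5200.0, 3.0), (4500.0, 1.5)]
new_points = smote_oversample(fraud, n_new=5)
print(len(new_points))  # → 5 synthetic fraud-like examples
```

Because each synthetic point lies on a segment between two real fraud examples, it stays inside the region of feature space the real cases occupy.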
Financial institutions have deployed supervised fraud detection for credit card transactions for over a decade. Visa's Advanced Authorization system processes 76,000 transactions per second, scoring each for fraud risk. These systems demonstrate that supervised ML can operate at massive scale with high accuracy, but they require substantial labelled training data.
Unsupervised Learning: Finding What You Did Not Know to Look For
The most powerful aspect of unsupervised anomaly detection is its ability to identify patterns that no one explicitly defined as suspicious. Rather than learning from labelled examples of fraud, unsupervised models learn the normal patterns in accounting data and flag anything that deviates.
Clustering algorithms group similar transactions together. A transaction that does not fit neatly into any existing cluster (an outlier) is flagged for review. If most office supply purchases cluster between EUR 20 and EUR 200, a purchase of EUR 3,400 from an office supply vendor stands out.
Autoencoder neural networks learn to compress and reconstruct normal transaction patterns. When presented with an anomalous transaction, the reconstruction error is high because the anomaly does not match the patterns the network learned. This approach has shown strong results on financial datasets without requiring any labelled fraud examples.
Isolation forests, an algorithm specifically designed for anomaly detection, work by randomly partitioning data. Anomalous data points, being different from the majority, are isolated in fewer partitions. The algorithm is computationally efficient and handles high-dimensional data well, making it suitable for transaction datasets with many features.
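The isolation principle can be demonstrated with a deliberately simplified, seeded sketch: repeatedly partition the data with random axis-aligned splits and measure how quickly a point ends up alone. This is a toy re-implementation for illustration, not the full isolation forest algorithm, and the feature vectors are invented.

```python
import random

def path_length(point, data, rng, depth=0, max_depth=10):
    """Depth at which `point` is isolated by random axis-aligned splits.
    Anomalies are separated from the bulk of the data in fewer splits."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))
    lo = min(p[dim] for p in data)
    hi = max(p[dim] for p in data)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Keep only the points that fall on the same side of the split as `point`.
    side = [p for p in data if (p[dim] < split) == (point[dim] < split)]
    return path_length(point, side, rng, depth + 1, max_depth)

def isolation_score(point, data, n_trees=200, seed=0):
    """Average isolation depth over many random trees; lower = more anomalous."""
    rng = random.Random(seed)
    return sum(path_length(point, data, rng) for _ in range(n_trees)) / n_trees

# Toy feature vectors: (amount, day of month). One outlier payment.
normal = [(100.0, 5.0), (110.0, 6.0), (95.0, 7.0),
          (105.0, 5.0), (98.0, 6.0), (102.0, 7.0)]
outlier = (3400.0, 6.0)
dataset = normal + [outlier]
print(isolation_score(outlier, dataset) < isolation_score(normal[0], dataset))
# → True: the outlier is isolated in far fewer splits than a typical point
```

Production use would reach for scikit-learn's `IsolationForest`, which adds subsampling and a normalised anomaly score.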
The advantage of unsupervised approaches is that they can catch novel fraud schemes that no one has seen before. A supervised model trained on known fraud patterns might miss a new scheme that does not resemble historical cases. An unsupervised model flags anything unusual, regardless of whether it matches a known pattern.
Real-World Detection Examples
The practical applications of anomaly detection in accounting are diverse and growing:
Near-duplicate invoice detection. A supplier sends invoice #INV-2847 for EUR 1,234.00 and invoice #INV-2874 for EUR 1,234.00. A human processing these weeks apart might not notice the similarity. An AI system comparing all invoices from the same supplier immediately flags the near-duplicate: same supplier, same amount, transposed digits in the invoice number. Approximately 30% of duplicate payments involve slightly modified invoice numbers rather than exact duplicates.
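One possible sketch of this check compares invoice numbers by edit distance within groups that share a supplier and an amount. The field names and the two-edit tolerance are assumptions for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two strings (classic dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def near_duplicates(invoices, max_edits=2):
    """Flag pairs from the same supplier with identical amounts and
    nearly identical invoice numbers (possible duplicate payments)."""
    flagged = []
    for i, a in enumerate(invoices):
        for b in invoices[i + 1:]:
            if (a["supplier"] == b["supplier"]
                    and a["amount"] == b["amount"]
                    and 0 < levenshtein(a["number"], b["number"]) <= max_edits):
                flagged.append((a["number"], b["number"]))
    return flagged

invoices = [
    {"supplier": "Acme Ltd", "number": "INV-2847", "amount": 1234.00},
    {"supplier": "Acme Ltd", "number": "INV-2874", "amount": 1234.00},
    {"supplier": "Acme Ltd", "number": "INV-2901", "amount": 880.00},
]
print(near_duplicates(invoices))  # → [('INV-2847', 'INV-2874')]
```

The transposed digits in the example from the text (2847 vs 2874) give an edit distance of two, comfortably inside the tolerance, while genuinely different invoices fall outside it.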
Ghost vendor identification. AI can cross-reference vendor databases with employee records, identifying matches in addresses, phone numbers, bank accounts, or tax identification numbers. A vendor that shares a bank account with an employee's spouse is not necessarily fraudulent, but it warrants investigation. The Association of Certified Fraud Examiners reports that billing schemes involving fictitious vendors account for approximately 20% of occupational fraud cases.
Expense pattern analysis. An employee consistently submits expense claims of EUR 48-49 when the receipt-free threshold is EUR 50. Individually, each claim is unremarkable. In aggregate, the pattern is statistically unlikely to occur naturally and suggests deliberate manipulation. Human reviewers, seeing each claim in isolation, would never notice this pattern across hundreds of claims.
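One way to quantify "statistically unlikely" here is a binomial tail test on the number of claims landing just under the threshold. The `base_rate` below, the firm-wide share of claims that naturally fall in that narrow band, is an assumed input that a real system would estimate from all employees' claims.

```python
from math import comb

def under_threshold_pvalue(claims, threshold=50.0, band=2.0, base_rate=0.05):
    """Probability of seeing at least the observed number of claims in the
    narrow band just below `threshold`, if each claim independently had
    only `base_rate` chance of landing there (binomial tail).

    Returns (p_value, observed_count)."""
    n = len(claims)
    k = sum(1 for c in claims if threshold - band <= c < threshold)
    p = sum(comb(n, i) * base_rate**i * (1 - base_rate)**(n - i)
            for i in range(k, n + 1))
    return p, k

# Toy claims for one employee against a EUR 50 receipt-free threshold.
claims = [48.5, 49.0, 48.9, 49.5, 48.2, 49.8, 48.1, 49.1, 48.8, 49.4]
p, k = under_threshold_pvalue(claims)
print(k, p < 1e-9)  # → 10 True: every claim sits in the band; vanishingly unlikely by chance
```

Each claim alone is unremarkable; only aggregating per employee makes the pattern testable, which is exactly what isolated human review misses.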
Vendor bank detail changes. Business email compromise (BEC) fraud, where attackers send fake emails requesting changes to vendor payment details, caused USD 2.9 billion in reported losses in 2023 according to the FBI's Internet Crime Complaint Center. AI systems that flag vendor bank detail changes occurring shortly before large payments provide a critical defence against this attack vector.
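A rule of this kind can be sketched directly. The seven-day window, the 10,000 "large payment" cut-off, and the field names are all assumptions for illustration.

```python
from datetime import date

def bec_flags(payments, detail_changes, window_days=7, large=10_000):
    """Flag large payments made within `window_days` after a change to the
    same vendor's bank details (a common business email compromise pattern)."""
    flags = []
    for p in payments:
        changed = detail_changes.get(p["vendor"])  # date details last changed
        if changed and p["amount"] >= large \
                and 0 <= (p["date"] - changed).days <= window_days:
            flags.append(p["vendor"])
    return flags

# Toy data: one vendor's bank details changed two days before a large payment.
detail_changes = {"Omega Supplies": date(2024, 3, 10)}
payments = [
    {"vendor": "Omega Supplies", "amount": 25_000, "date": date(2024, 3, 12)},
    {"vendor": "Delta Parts", "amount": 30_000, "date": date(2024, 3, 12)},
]
print(bec_flags(payments, detail_changes))  # → ['Omega Supplies']
```

A flagged payment would then be held pending an out-of-band confirmation of the new details with the vendor.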
Seasonal deviation detection. A restaurant's food supply costs normally spike in December (holiday season) and drop in January. If food costs spike in February without a corresponding revenue increase, the anomaly detection system flags the deviation. The explanation might be innocent (a large catering contract, price increases from suppliers) or concerning (theft, waste, fictitious purchases).
The False Positive Problem
Every anomaly detection system faces the false positive challenge. Flag too few anomalies and you miss real problems. Flag too many and reviewers develop "alert fatigue" and start ignoring the alerts entirely.
Research in cybersecurity, which faces the same challenge, shows that when false positive rates exceed approximately 90% (nine out of ten alerts are false), human reviewers effectively stop investigating. The system becomes noise. This finding, documented in research from Google's security team, applies directly to accounting anomaly detection.
The solution involves multiple strategies:
Contextual scoring. Rather than a binary flag (anomalous or not), the system assigns a risk score that considers multiple factors. An unusual amount from a known, long-standing supplier scores lower than the same unusual amount from a vendor added last week. The risk score determines priority, not just detection.
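A toy version of such a score, with invented weights and field names, shows how the same amount deviation can land at very different priorities depending on vendor context:

```python
def risk_score(txn):
    """Toy contextual risk score: the same deviation weighs more for a
    new vendor or recently changed bank details. Weights are assumptions;
    a real system would learn or calibrate them."""
    score = 0.0
    score += min(txn["amount_zscore"], 10) * 0.1        # how unusual the amount is
    if txn["vendor_age_days"] < 30:
        score += 0.4                                    # brand-new vendor
    changed = txn["bank_details_changed_days_ago"]
    if changed is not None and changed < 7:
        score += 0.5                                    # recent bank detail change
    return round(score, 2)

long_standing = {"amount_zscore": 4.0, "vendor_age_days": 2000,
                 "bank_details_changed_days_ago": None}
new_vendor = {"amount_zscore": 4.0, "vendor_age_days": 10,
              "bank_details_changed_days_ago": 3}
print(risk_score(long_standing), risk_score(new_vendor))  # → 0.4 1.3
```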
Feedback loops. When a human reviews a flagged transaction and marks it as legitimate, that feedback trains the model. Over time, the system learns which types of deviations are genuinely concerning and which are normal variations for that specific business.
Tiered alerts. High-confidence anomalies (new vendor, large amount, bank details recently changed) trigger immediate review. Lower-confidence anomalies (slightly unusual amount from known supplier) are batched into a periodic review report rather than generating individual alerts.
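The tiering itself can be a simple routing function over a risk score; the thresholds below are illustrative, not recommended values.

```python
def triage(score, immediate_threshold=1.0, batch_threshold=0.5):
    """Route an anomaly by confidence: immediate review, periodic batch
    report, or no action. Thresholds are illustrative assumptions."""
    if score >= immediate_threshold:
        return "immediate"
    if score >= batch_threshold:
        return "batch"
    return "ignore"

print([triage(s) for s in (1.3, 0.6, 0.2)])
# → ['immediate', 'batch', 'ignore']
```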
Business context integration. A spike in construction material purchases is anomalous for an accounting firm but perfectly normal for a property developer during a project. The anomaly detection system must understand the business context to calibrate its expectations.
Complementing Human Judgment
Anomaly detection AI does not replace the accountant's judgment. It augments it by directing attention to where it is most needed. A qualified accountant reviewing a flagged transaction brings contextual knowledge that the AI lacks: understanding of the client's business, awareness of upcoming projects that explain unusual spending, and professional judgment about materiality.
The most effective implementation treats anomaly detection as a triage tool. The AI processes thousands of transactions and identifies the twenty that warrant closer examination. The human examines those twenty, applies professional judgment, investigates where necessary, and resolves each flag as either a genuine issue or an acceptable deviation.
This division of labour plays to each party's strengths. The AI is tireless, consistent, and capable of holding millions of data points in memory simultaneously. The human is contextually aware, capable of judgment, and legally accountable for the conclusions reached. Neither can do the other's job as well, and together they achieve a level of oversight that neither could provide alone.
Michael Cutajar, CPA — Founder of Accora.