Agentic AI or Deterministic Engine? Why Underwriting Extraction Should Be Boring

The phrase “agentic AI” is having a moment in the UK lending press. Vendors are queueing up to put it on their landing pages, and trade publications are running thought pieces about what it means for SME credit. It’s worth a closer look — because for the specific job of turning a bank statement into structured underwriting data, an agent is the wrong shape of tool.

The work of extraction isn’t a creative task. It’s a deterministic one. And the lenders who’ll get the most out of automation are the ones who notice the difference.

What “Agentic” Actually Means

An agent, in current usage, is an LLM that can plan, choose tools, take actions, and decide when it’s done. It’s good at open-ended work — research, summarisation, drafting, customer enquiries — where the right answer isn’t known in advance and judgement is part of the value.

That’s a poor description of bank statement extraction. The right answer is known in advance: every transaction on every page, with the correct date, description, amount, and balance. There is one correct output. The job is to produce it reliably, every time, with a paper trail.

Asking an agent to do this is like hiring a novelist to read out a phone book. It might work. The interesting question is why you’d want it to.

Where Determinism Earns Its Keep

ExactSum’s parsing layer is deterministic by design. The same statement processed twice produces the same numbers twice. Every transaction can be traced back to the page it came from. Nothing is averaged, smoothed, or quietly inferred.

That matters for four reasons:

Auditability. When the FCA, an internal auditor, or a borrower asks how a number was produced, “the agent decided” is not an answer. “Here is the line item on page 47 of the August statement” is.
Reproducibility. A model that gives subtly different answers on Tuesday than it did on Monday is not a model you can build underwriting policy on. Thresholds only mean something if the inputs are stable.
Failure modes. When deterministic parsing breaks, it breaks visibly — a missing column, an unrecognised layout. When a generative model breaks, it produces a plausible but wrong number. The first is fixable. The second is dangerous.
Cost. Running an agent over a 200-page statement is expensive and slow. Running a deterministic pipeline is neither.
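
To make the failure-modes point concrete, here is a minimal sketch of a row parser that refuses to guess. It is an illustration, not ExactSum's actual code: the Transaction fields, the four-column layout, the DD/MM/YYYY date format, and the LayoutError name are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal, InvalidOperation


class LayoutError(ValueError):
    """Raised when a row does not match the expected layout (hypothetical name)."""


@dataclass(frozen=True)
class Transaction:
    page: int          # page the row was read from, kept for traceability
    booked_on: date
    description: str
    amount: Decimal    # signed: credits positive, debits negative
    balance: Decimal   # running balance as printed on the statement


def parse_row(cells: list[str], page: int) -> Transaction:
    """Turn one table row into a Transaction, or fail loudly."""
    # Assumed layout: date, description, amount, balance.
    if len(cells) != 4:
        raise LayoutError(f"page {page}: expected 4 columns, got {len(cells)}")
    raw_date, description, raw_amount, raw_balance = cells
    try:
        booked_on = datetime.strptime(raw_date, "%d/%m/%Y").date()
        amount = Decimal(raw_amount.replace(",", ""))
        balance = Decimal(raw_balance.replace(",", ""))
    except (ValueError, InvalidOperation) as exc:
        # No fallback and no best guess: an unreadable row stops the run.
        raise LayoutError(f"page {page}: unreadable row {cells!r}") from exc
    return Transaction(page, booked_on, description.strip(), amount, balance)
```

A row with a missing column raises an error that names the page, which is the visible kind of failure. Nothing in this path can return a plausible number that was never on the statement.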

The lenders processing hundreds of files a week don’t want a system that probably gets the revenue total right. They want one that demonstrably does.

Where AI Actually Belongs

This isn’t an anti-AI argument. ExactSum uses computer vision and language models in the parts of the pipeline where they’re genuinely the right tool — recognising that a particular block of text on page one is a header, that a row in a Santander statement is a fee disclosure rather than a transaction, that a Revolut PDF contains four currencies. Pattern recognition on noisy visual inputs is exactly what these models are good at.

What we don’t do is hand the model a statement and ask it to “figure out the numbers”. The extraction is structured, validated, and reconciled against the statement’s own balance arithmetic before any underwriting metric is computed.
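
Reconciling against the statement's own balance arithmetic can be sketched in a few lines. This is a simplified illustration under stated assumptions: amounts are signed, every row carries a printed running balance, and the reconcile function and its return shape are hypothetical rather than ExactSum's API.

```python
from decimal import Decimal


def reconcile(
    opening: Decimal,
    closing: Decimal,
    rows: list[tuple[Decimal, Decimal]],
) -> list[str]:
    """Check (signed amount, printed balance) rows against the statement's arithmetic.

    Returns a list of problems; an empty list means the statement reconciles.
    """
    problems: list[str] = []
    running = opening
    for index, (amount, printed_balance) in enumerate(rows, start=1):
        running += amount
        if running != printed_balance:
            problems.append(
                f"row {index}: computed {running}, statement shows {printed_balance}"
            )
            # Resync to the printed figure so one bad row is reported once.
            running = printed_balance
    if running != closing:
        problems.append(f"closing: computed {running}, statement shows {closing}")
    return problems
```

Decimal is used rather than float so that pence add up exactly. In a sketch like this, any non-empty result would block metric calculation until the mismatch is explained.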

The architecture, roughly:

1. Vision and OCR identify what’s on the page: text blocks, columns, tables, layout.
2. Deterministic parsing assembles those into transactions, with validation against opening/closing balances.
3. Aggregation rolls transactions into underwriting metrics (cashflow, balances, concentration, tax behaviour).
4. The Decision Engine evaluates metrics against your configured rules and returns Auto-Approve, Refer, or Decline.

AI is in step one. From step two onwards, the system is rules, arithmetic, and traceable logic. That’s not a limitation — it’s the point.
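
As a rough sketch of what "rules, arithmetic, and traceable logic" can look like at the decision stage, the example below evaluates metrics against minimum thresholds. The Rule shape, the metric names, and the choice to refer when a metric is missing are assumptions made for illustration; only the three outcomes come from the description above.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Literal

Outcome = Literal["Auto-Approve", "Refer", "Decline"]

# Higher number wins when several rules fail.
SEVERITY = {"Auto-Approve": 0, "Refer": 1, "Decline": 2}


@dataclass(frozen=True)
class Rule:
    metric: str        # hypothetical metric name, e.g. "minimum_balance"
    minimum: Decimal   # the rule fails when the metric is below this value
    on_fail: Outcome   # "Refer" or "Decline"


def decide(
    metrics: dict[str, Decimal], rules: list[Rule]
) -> tuple[Outcome, list[str]]:
    """Return the most severe outcome plus a plain-text reason per failed rule."""
    outcome: Outcome = "Auto-Approve"
    reasons: list[str] = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            # Assumption: a missing metric is never silently passed.
            failed: Outcome = "Refer"
            reason = f"{rule.metric}: metric missing"
        elif value < rule.minimum:
            failed = rule.on_fail
            reason = f"{rule.metric}: {value} is below {rule.minimum}"
        else:
            continue
        reasons.append(reason)
        if SEVERITY[failed] > SEVERITY[outcome]:
            outcome = failed
    return outcome, reasons
```

The same metrics and the same rules always give the same outcome, and the reasons list records which rule fired and on what value.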

The Governance Test

Here’s a useful test for any extraction tool a lender is evaluating: ask the vendor to run the same statement through their pipeline three times, on three different days, and compare the outputs digit for digit.

A deterministic engine will return identical numbers. A model-driven one usually won’t — and the differences will be small enough to look like rounding, until the day one of them flips a decision.
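
One way to run the digit-for-digit comparison without eyeballing spreadsheets is to hash each run's output and compare the hashes. The sketch below assumes each run is exported as a list of dictionaries; the function names and field names are hypothetical.

```python
import hashlib
import json


def fingerprint(transactions: list[dict]) -> str:
    """Hash a canonical serialisation of one run's extracted transactions."""
    canonical = json.dumps(
        transactions,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # fixed whitespace
        default=str,             # Decimal and date values are written as text
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def runs_match(runs: list[list[dict]]) -> bool:
    """True only if every run produced exactly the same output (needs at least one run)."""
    return len({fingerprint(run) for run in runs}) == 1
```

The comparison is deliberately strict: 1250.0 and 1250.00 written as text produce different hashes, so even a formatting drift between runs shows up.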

A second test: ask the vendor where, exactly, the system inferred something the statement didn’t say. If the answer is “nowhere”, you have a tool that’s safe to build underwriting policy on. If the answer involves hand-waving about confidence scores, you have a research project.

What This Means For Lenders

The market is converging on a sensible position: automate the data layer so underwriters can spend their time on judgement. We agree with that. The disagreement is on what the data layer should look like.

For a regulated lender, the data layer should be the most boring part of the stack. Predictable, reproducible, auditable, and cheap to run. The interesting work — the part that needs experienced humans — happens after the numbers are on the page. That’s where the lender’s edge lives, and it’s where AI can genuinely help: summarising for committee, drafting commentary, flagging anomalies in plain English.

But the numbers themselves? Those should be boring. We’ve built ExactSum that way on purpose.

See How a Deterministic Engine Handles Your Files

We’ll run a sample of your real statements through the pipeline, show you the output, and run them again so you can see they match.
