How we built Sieve: a 6-stage transaction classifier that runs entirely on your device
Sieve is the classification engine inside SiftDo. It categorizes your bank transactions with 99%+ accuracy and runs entirely in-process, with no network calls and no cloud API. It works in both a web browser and an iOS WKWebView. Here's how it's built.
The problem with transaction descriptions
Bank transaction descriptions are not designed for humans. A coffee shop visit shows up as SQ *VERVE COFFEE ROASTERS 408-298-XXXX CA. A recurring Netflix charge looks like NETFLIX.COM 866-579-XXXX CA. A Zelle transfer arrives as ZELLE PAYMENT FROM JOHN S with no category information at all.
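To make the noise concrete, here is a minimal sketch of how descriptions like the three above could be reduced to something closer to a merchant name. The cleanMerchantName function and its regular expressions are illustrative assumptions, not Sieve's actual extraction rules.

```typescript
// Hypothetical helper: strips common noise from a raw bank description.
// The patterns below are illustrative assumptions, not Sieve's real rules.
function cleanMerchantName(raw: string): string {
  return raw
    // Drop a payment-processor prefix such as "SQ *".
    .replace(/^SQ \*/, "")
    // Drop a masked or full phone number such as "408-298-XXXX".
    .replace(/\b\d{3}-\d{3}-[\dX]{4}\b/g, "")
    // Drop a trailing two-letter token such as the state code "CA".
    .replace(/\s+[A-Z]{2}\s*$/, "")
    // Collapse whitespace runs left behind by the removals.
    .replace(/\s+/g, " ")
    .trim();
}

const cleaned = [
  "SQ *VERVE COFFEE ROASTERS 408-298-XXXX CA",
  "NETFLIX.COM 866-579-XXXX CA",
  "ZELLE PAYMENT FROM JOHN S",
].map(cleanMerchantName);
// cleaned is ["VERVE COFFEE ROASTERS", "NETFLIX.COM", "ZELLE PAYMENT FROM JOHN S"]
```

The Zelle line passes through unchanged, which is the point: cleaning helps with merchant names but adds no category information where none exists. The trailing two-letter rule is also crude, since it would strip any two-letter final token, and that kind of fragility is why pure pattern matching needs a fallback.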
A naive classifier would either over-rely on keyword matching (fragile, high maintenance) or fire up a cloud LLM (slow, expensive, privacy-hostile). Sieve does neither.
Six ordered stages
Each transaction passes through stages in order. The first stage to produce a confident result wins; subsequent stages are skipped. The six stages are:
- User corrections — if the user previously recategorized this merchant, that mapping is applied first. User intent always wins.
- Rules engine — deterministic pattern matching against a library of merchant patterns and bank-specific signals (CC payment detection, payroll recognition, bank fee tags). No model inference needed for well-known patterns.
- Model inference — a trained classifier runs in WebAssembly or pure JS depending on the environment. Handles novel merchants the rules engine doesn't recognize.
- Field extraction — pulls structured fields from the description: amount signs, transfer references, merchant names with noise stripped.
- Post-model reclassification — catches systematic model errors. For example, the model sometimes misclassifies high-value round-number credits as "Income" when they're transfers. This stage corrects those.
- Confidence review — transactions below the confidence threshold are flagged for manual review rather than silently miscategorized.
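The "first confident result wins" ordering can be sketched roughly as follows. This is a simplified illustration, not Sieve's source: it models only user corrections, rules, and a stubbed model, plus the confidence review fallback. The type names, category labels, and the 0.8 threshold are all assumptions.

```typescript
// Hypothetical types and names throughout; a sketch, not Sieve's source.
interface Transaction {
  description: string;
  amount: number;
}

interface StageResult {
  category: string;
  confidence: number; // 0..1
}

interface Stage {
  name: string;
  // Returns null when the stage has no opinion about this transaction.
  run: (tx: Transaction) => StageResult | null;
}

interface Classification extends StageResult {
  stage: string;        // which stage produced the result
  needsReview: boolean; // true when no stage was confident enough
}

const CONFIDENCE_THRESHOLD = 0.8; // assumed value for illustration

function classify(tx: Transaction, stages: Stage[]): Classification {
  let best: Classification | null = null;
  for (const stage of stages) {
    const result = stage.run(tx);
    if (result === null) continue;
    if (result.confidence >= CONFIDENCE_THRESHOLD) {
      // First confident result wins; later stages are skipped.
      return { ...result, stage: stage.name, needsReview: false };
    }
    // Remember the strongest low-confidence guess for manual review.
    if (best === null || result.confidence > best.confidence) {
      best = { ...result, stage: stage.name, needsReview: true };
    }
  }
  return best ?? { category: "Uncategorized", confidence: 0, stage: "none", needsReview: true };
}

// Toy stages standing in for user corrections, rules, and the model.
const userCorrections: Array<[string, string]> = [["VERVE COFFEE", "Coffee"]];

const stages: Stage[] = [
  {
    name: "user-corrections",
    run: (tx) => {
      for (const [merchant, category] of userCorrections) {
        if (tx.description.indexOf(merchant) !== -1) {
          return { category, confidence: 1 };
        }
      }
      return null;
    },
  },
  {
    name: "rules",
    run: (tx) =>
      /^NETFLIX\.COM\b/.test(tx.description)
        ? { category: "Subscriptions", confidence: 0.99 }
        : null,
  },
  {
    name: "model",
    // Stub: a real model would score the description here.
    run: () => ({ category: "Transfer", confidence: 0.55 }),
  },
];
```

With these toy stages, the Verve Coffee description is settled by the user correction, the Netflix charge by the rule, and the Zelle transfer falls through to the stubbed model, whose low confidence flags it for review instead of silently accepting the guess.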
The merchant database
Sieve ships with a local merchant database seeded from 715 patterns extracted from the rules engine. Lookups go to IndexedDB (on desktop) or to an in-memory store (on iPhone), so there's no parsing overhead per transaction. When a merchant is seen for the first time and doesn't match an existing pattern, the model takes over.
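As a rough sketch of the in-memory variant, the store below is seeded once and then answers lookups by key, falling back to a model callback for unseen merchants. The MerchantPattern shape, the class name, and the exact-key matching are assumptions made for illustration; the post doesn't show the real pattern format, and an IndexedDB-backed store would have the same shape but asynchronous lookups.

```typescript
// Hypothetical shapes; the real pattern format is not shown in this post.
interface MerchantPattern {
  key: string;      // normalized merchant key, e.g. "NETFLIX.COM"
  category: string; // category label (made up for this example)
}

class InMemoryMerchantStore {
  private byKey = new Map<string, string>();

  // Seed once at startup so no pattern parsing happens per transaction.
  constructor(seed: MerchantPattern[]) {
    for (const pattern of seed) {
      this.byKey.set(pattern.key.toUpperCase(), pattern.category);
    }
  }

  // Returns the stored category, or undefined for an unseen merchant.
  lookup(merchant: string): string | undefined {
    return this.byKey.get(merchant.trim().toUpperCase());
  }
}

interface Categorized {
  category: string;
  source: "database" | "model";
}

// Tries the merchant database first; the model only sees unknown merchants.
function categorize(
  store: InMemoryMerchantStore,
  merchant: string,
  model: (merchant: string) => string
): Categorized {
  const known = store.lookup(merchant);
  if (known !== undefined) {
    return { category: known, source: "database" };
  }
  return { category: model(merchant), source: "model" };
}

const store = new InMemoryMerchantStore([
  { key: "NETFLIX.COM", category: "Subscriptions" },
  { key: "VERVE COFFEE ROASTERS", category: "Coffee" },
]);
```

Calling categorize(store, "netflix.com", model) resolves from the database without invoking the model, while an unknown merchant name goes straight to the model callback.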
Running on iPhone
Swift apps can't run JavaScript natively, so we built a bundler step (npm run build:iphone) that compiles Sieve into a single sift.bundle.js file loaded by a JSEngine.swift wrapper. The entire classification pipeline runs inside a WKWebView JavaScript context: same code, same results, no server needed on mobile.
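One way a bundle like this could expose itself to the Swift side is a single global function that takes and returns JSON strings, since plain text is easy to pass through WKWebView's evaluateJavaScript. The sketch below is only that: the SieveBridge global, the request and response shapes, and the stub classifier are assumptions, and the actual interface between sift.bundle.js and JSEngine.swift isn't described here.

```typescript
// Hypothetical bridge; names and shapes are assumptions, not Sieve's API.
interface BridgeRequest {
  description: string;
  amount: number;
}

interface BridgeResponse {
  ok: boolean;
  category?: string;
  error?: string;
}

// Stand-in for the real pipeline.
function classifyStub(req: BridgeRequest): string {
  return /^ZELLE PAYMENT\b/.test(req.description) ? "Transfer" : "Uncategorized";
}

// Accepts and returns JSON strings so the host only ever handles text.
function classifyJson(requestJson: string): string {
  let response: BridgeResponse;
  try {
    const req = JSON.parse(requestJson) as BridgeRequest;
    if (typeof req.description !== "string") {
      throw new Error("description must be a string");
    }
    response = { ok: true, category: classifyStub(req) };
  } catch (err) {
    // Report failures as data instead of throwing across the bridge.
    response = { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
  return JSON.stringify(response);
}

// Attach to the global object so a host can call SieveBridge.classifyJson(...).
(globalThis as unknown as { SieveBridge: { classifyJson: typeof classifyJson } }).SieveBridge = {
  classifyJson,
};
```

Keeping the boundary to one string-in, string-out function means the same entry point can be exercised in a browser console, a test runner, or the web view, which fits the "same code, same results" goal.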
Sieve is a TypeScript package (packages/sieve/) with its own test suite. If you're interested in contributing a merchant pattern or parser improvement, reach out.