We have watched bright teams drown in dashboards that all describe the same Tuesday with different adjectives. Security wants a crisp “bad actor” line; risk wants triage; support wants a customer who stops calling. The part people skip is the boring one: a dictionary that is stable enough to train new humans, and a change log for rules that would stand up in court without sounding like a fantasy novel.

A base rate that will not get politer if you ignore it

Model metrics without population math are just vibes. Picture a notional portfolio where true fraud is one in a thousand events—0.1%—and your champion rule is “90% sensitive” and “99% specific.” Pretty on a slide, brutal on an ops floor. On a million events you expect about 1,000 frauds. True positives, at 90% sensitivity, land near 900. False positives, at 1% of the 999,000 legitimate events, land near 9,990. A human staring at a queue is not looking at “precision in the model card”; they are looking at 900 true needles mixed into roughly ten thousand straws—before lunch.
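
The arithmetic above fits in a few lines, which makes it easy to rerun with your own numbers. This is a minimal sketch: the function name `queue_composition` is ours, and the inputs are the notional portfolio from the paragraph, not anyone's real data.

```python
def queue_composition(events, base_rate, sensitivity, specificity):
    """Expected (true_positives, false_positives) for a flagging rule."""
    frauds = events * base_rate
    legit = events - frauds
    true_positives = frauds * sensitivity
    # A rule that is 99% specific still flags 1% of legitimate events.
    false_positives = legit * (1.0 - specificity)
    return true_positives, false_positives


# Notional portfolio from the text: 1,000,000 events, 0.1% fraud,
# 90% sensitivity, 99% specificity.
tp, fp = queue_composition(1_000_000, 0.001, 0.90, 0.99)
print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"share of the queue that is real fraud: {tp / (tp + fp):.1%}")
```

On these inputs the last line comes out at roughly 8%: about one flag in twelve is a needle.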

The Bayes one-liner (no Greek letters, just consequence)

The probability that a flagged event is actually bad is not the model’s “precision in isolation”; it is the co-production of base rate × rule behavior. With the numbers above, that is 900 true hits out of roughly 10,890 flags, or about 8%. When “rare but awful” is in the same pipeline as “common and messy,” the operator’s question is not “is the AUC good?”; it is how many honest people per week get pulled into a manual journey, and who owns the apology and the data retention when we get it wrong. The arithmetic is a kindness: it depersonalizes the fight.
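
For readers who want the one-liner spelled out, here is a hedged sketch of Bayes' rule applied to a flag. The function name `flagged_is_fraud` is ours, and every base rate in the loop other than 0.1% is an assumption added only to show how fast the answer moves when the rule itself does not change.

```python
def flagged_is_fraud(base_rate, sensitivity, specificity):
    """P(fraud | flagged) by Bayes' rule."""
    hit = sensitivity * base_rate                          # fraud and flagged
    false_alarm = (1.0 - specificity) * (1.0 - base_rate)  # honest and flagged
    return hit / (hit + false_alarm)


# Same rule (90% sensitive, 99% specific) at several base rates.
# Only 0.001 comes from the text; the others are illustrative assumptions.
for base_rate in (0.0001, 0.001, 0.01, 0.1):
    p = flagged_is_fraud(base_rate, 0.90, 0.99)
    print(f"base rate {base_rate:.2%} -> P(fraud | flagged) = {p:.1%}")
```

The rule is identical in every row; only the population changes, and the population decides how many apologies you owe.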

What retention costs before legal ever opens the folder

“Keep everything for 90 days” is a sentence that has never read a storage bill. A toy: 1,000,000 events a day, 2 KB average JSON per event (compressed on disk, rounded), 90 days. Roughly 1e6 × 2 × 10^3 × 90 bytes, on the order of 180 GB of raw event payload, before replicas, before indexes, before “just one more field for the model team.” The point is not the exact terabyte. The point is that the word “stream” is not abstract; it is a line item, and a backup window, and a person who will have to redact a field when someone finally asks a GDPR-shaped question. Ownership means deciding which fields are worth that rent.
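
The toy is small enough to keep as a calculator. This is a sketch under stated assumptions: 2 KB is taken as 2,000 bytes to match the arithmetic above, and the `copies` parameter and its value of 3 are hypothetical stand-ins for whatever replication you actually run.

```python
def retained_bytes(events_per_day, bytes_per_event, retention_days, copies=1):
    """Raw payload held at steady state, before indexes and overhead."""
    return events_per_day * bytes_per_event * retention_days * copies


raw = retained_bytes(1_000_000, 2_000, 90)
print(f"raw payload:       {raw / 1e9:,.0f} GB")

# Hypothetical: three copies of every event (replication is not in the toy).
replicated = retained_bytes(1_000_000, 2_000, 90, copies=3)
print(f"with three copies: {replicated / 1e9:,.0f} GB")
```

Adding one field changes `bytes_per_event`, and the rent moves with it.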

Signal custody, the phrase we actually put on a whiteboard

Our short programs keep returning to one phrase: signal custody. Not “we log everything,” but “we know which fields are allowed to accuse a person of intent, and which fields are only allowed to describe a system hiccup.” The difference is not pedantry. It is the difference between a person getting frozen out of a wallet for a reason that is reproducible, and a person getting frozen out because a queue burped and nobody took ownership of the false positive that followed. We role-play the Tuesday where two teams rename the same event without a migration plan, and the count of “fraud” in a monthly report doubles because the definition moved. Mathematics cannot save you if your dictionary has forked; only humans negotiating can.
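
One way to make the whiteboard phrase checkable is a small registry that records which kind of claim each field may support. Treat this strictly as an illustration: the `Custody` classes, the field names, and the helper `fields_not_allowed_to_accuse` are all hypothetical, and the real list is something your teams have to negotiate.

```python
from enum import Enum


class Custody(Enum):
    ACCUSATORY = "may support a claim about a person's intent"
    OPERATIONAL = "may only describe system behavior"


# Hypothetical field names, for illustration only.
FIELD_CUSTODY = {
    "device_fingerprint_mismatch": Custody.ACCUSATORY,
    "velocity_over_limit": Custody.ACCUSATORY,
    "queue_retry_count": Custody.OPERATIONAL,
    "upstream_timeout": Custody.OPERATIONAL,
}


def fields_not_allowed_to_accuse(rule_fields):
    """Fields a rule reads that are operational-only or not registered at all."""
    return sorted(
        name for name in rule_fields
        if FIELD_CUSTODY.get(name) is not Custody.ACCUSATORY
    )


# A hypothetical freeze rule that leans on a retry counter and a renamed field.
problems = fields_not_allowed_to_accuse(
    {"velocity_over_limit", "queue_retry_count", "new_field"}
)
print(problems)  # ['new_field', 'queue_retry_count']
```

An unregistered field fails the check in the same way as an operational one, which is the point: a renamed event has to go back through the dictionary before it can accuse anyone.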

A queue depth that is a people problem

If a rule fires 12,000 times a week and your team of six can deeply review 400, the rest is not “machine learning background noise”—it is a backlog that trains cynicism. We teach a boring discipline: a single numeric operational service level (even home-grown) for “time to first human decision,” a single owner for “this rule is frozen until the change log says otherwise,” and a place where product agrees that velocity without capacity is not a feature. You do not need a consultant’s slide deck. You need a line in the runbook that does not flinch.
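
The capacity gap is simple enough to put in the runbook as arithmetic. A minimal sketch, assuming the 400 deep reviews are a weekly figure like the 12,000 fires; the function name `weekly_backlog` is ours.

```python
def weekly_backlog(fires_per_week, reviews_per_week):
    """Return (unreviewed_flags, share_of_flags_reviewed) for one week."""
    reviewed = min(fires_per_week, reviews_per_week)
    unreviewed = fires_per_week - reviewed
    coverage = reviewed / fires_per_week if fires_per_week else 1.0
    return unreviewed, coverage


# Numbers from the text; treating 400 as weekly capacity is an assumption.
unreviewed, coverage = weekly_backlog(12_000, 400)
print(f"unreviewed this week: {unreviewed:,}")
print(f"share of flags a human looked at: {coverage:.1%}")
```

On these inputs 11,600 flags go unreviewed every week and roughly 3% get a human decision, which is the number product needs to see before asking for more velocity.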

The Friday hotfix and the Monday policy

We are not here to turn engineers into full-time risk officers. We are here to make sure the Friday hotfix you ship because someone shouted in a meeting does not turn into a Monday policy that contradicts the Tuesday contract with your processor. The fix is not more software; it is more adult conversations, earlier, with the same nouns. If you want a partner who will role-play the uncomfortable part with you, you already know how to reach us. Bring your actual false-positive rate, even if the number embarrasses you—numbers, unlike pride, can be budgeted against.
