STAGE 03 + 04 Deep scan · ML analysis & sandbox

Read the intent.
Detonate the threat.

Most malicious mail is loud — bad links, junk senders, obvious spam — and our first passes clear all of it in under a millisecond. What's left is the dangerous part: a small slice of messages engineered to look completely legitimate. CIVRA takes that slice apart in two layers. First, a purpose-built machine-learning model reads the message — its tone, its intent, the exact words — the way a seasoned analyst would. Then, if a link or attachment is still in question, CIVRA detonates it inside a sealed sandbox and watches what it actually tries to do.

~5ms for the model to read & score
2–5% reach the deep-scan model
<1% detonated in the sandbox
0 of it touches your network
01 — The problem

Spam filters score reputation. They don't read intent.

A traditional filter asks: is this sender known? does this look like bulk spam? does the link sit on a blocklist? Those questions catch noisy attacks. They are useless against the most expensive one — Business Email Compromise, where there's no malware and no bad link. Just a clean, well-written message from a plausible address asking your finance lead to change wire details, or your CEO to approve an "urgent, confidential" payment.

These attacks cost businesses $2.77 billion a year because they're written to pass every reputation check. The only way to catch them is to actually understand the message — its tone, its pressure, the relationship between who's asking and what they're asking for. That's a language problem, not a filtering problem. It's the one thing rules can't do — and exactly what the deep scan is built for.

Where it fits

A funnel, not a bottleneck.

Every inbound email flows through four stages. The cheap, instant checks come first and dispose of the vast majority of mail for effectively nothing. Only the genuinely ambiguous messages — the ones the rules can't confidently call safe or dangerous — are escalated to the AI. That's what keeps the slow, careful analysis reserved for the few messages that truly need it.
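The escalation rule described above can be sketched in a few lines. This is a minimal illustration, not CIVRA's implementation: the 0 to 100 first-pass score, the two threshold values and the names `Stage` and `route_after_first_pass` are all assumptions made for the example.

```python
from enum import Enum


class Stage(Enum):
    RESOLVED_SAFE = "resolved_safe"
    RESOLVED_BLOCKED = "resolved_blocked"
    DEEP_SCAN = "deep_scan"


# Hypothetical thresholds on a 0-100 first-pass risk score (assumed values).
SAFE_BELOW = 20
BLOCK_AT_OR_ABOVE = 85


def route_after_first_pass(risk_score: int) -> Stage:
    """Resolve confident cases at once; escalate only the ambiguous middle band."""
    if not 0 <= risk_score <= 100:
        raise ValueError(f"risk score out of range: {risk_score}")
    if risk_score < SAFE_BELOW:
        return Stage.RESOLVED_SAFE
    if risk_score >= BLOCK_AT_OR_ABOVE:
        return Stage.RESOLVED_BLOCKED
    return Stage.DEEP_SCAN
```

The point of the design is that the expensive stage only ever sees the middle band; everything the rules can call with confidence never reaches the model.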

01 · INGEST

Email arrives

A read-only copy reaches CIVRA over a secure link to Microsoft 365 or Google Workspace.

100% of mail
02 · FIRST PASS

Rule-based filtering

Allow/block lists, SPF·DKIM·DMARC, domain age, link & header checks — a fast risk score.

~95% resolved here
03 · DEEP SCAN (you are here)

AI analysis

The ambiguous slice is read by a purpose-built model that judges intent and meaning.

2–5% escalated
04 · DETONATION

Isolated environment

If a link or file still looks risky, it's opened in a sealed sandbox and watched.

<1% detonated
Layer one — the model reads
02 — How it reads

It asks the question a person would ask.

The model isn't matching keywords. It was trained on hundreds of thousands of real and simulated attacks to recognise the shape of a scam — the way urgency, authority and a financial ask combine into something that doesn't add up. When a message lands, it weighs three kinds of signal at once.

Intent & tone

What is it asking for?

Manufactured urgency, secrecy, an out-of-band request, a change to payment details — the social-engineering moves that pressure a person into acting before they think.

Identity & relationship

Does the ask fit the sender?

A "CEO" emailing from a freshly-registered look-alike domain about a confidential wire is a mismatch a human notices instantly — and so does the model.

Language patterns

Which exact words give it away?

Credential-harvesting prompts, invoice-fraud phrasing, fake legal pressure — the model marks the specific phrases that drove its decision, not just a final score.

One pass · three answers

Every scan returns structured data — never a paragraph of opinion.

In a single forward pass, the model produces three coordinated outputs. Because they're structured fields and not free-form text, your dashboard can act on them automatically — and there's nothing for an attacker to talk their way out of.

Output 01 · Classification

What kind of threat is it?

The message is sorted into one of six threat types — or marked safe. This drives how the alert is routed and what playbook your team sees.

Output 02 · Risk score

How dangerous, 0 to 100?

A calibrated severity number, so high-risk mail escalates automatically and the rest stays quietly out of the way. No alert fatigue.

Output 03 · Risky phrases

Which words proved it?

The exact spans of text that drove the verdict, each tagged with severity — so the reason is always visible, never a black box.
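To make "structured fields" concrete, here is one way the three outputs could be modelled. It is a sketch under assumptions: the field names, the label string, the character-offset representation of phrase spans and the `ESCALATE_AT` threshold are illustrative and not taken from CIVRA's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RiskyPhrase:
    start: int     # character offset into the message body (assumed representation)
    end: int       # exclusive end offset
    severity: str  # e.g. "high" or "med"


@dataclass(frozen=True)
class ScanResult:
    classification: str  # one of the six threat types, or "safe"
    risk_score: int      # 0 to 100
    risky_phrases: List[RiskyPhrase]


ESCALATE_AT = 70  # hypothetical cut-off for automatic escalation


def should_escalate(result: ScanResult) -> bool:
    """High-risk, non-safe verdicts escalate; everything else stays quiet."""
    return result.classification != "safe" and result.risk_score >= ESCALATE_AT


def phrase_text(body: str, phrase: RiskyPhrase) -> str:
    """Recover the exact words a span points at, so the reason stays visible."""
    return body[phrase.start:phrase.end]
```

Because every field has a fixed type and range, a dashboard can route on `classification`, threshold on `risk_score` and highlight `risky_phrases` without parsing any free text.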

03 — Worked example

One real attack, taken apart.

Here's a textbook Business Email Compromise attempt — no malware, no bad link, nothing for a spam filter to catch. Watch what the deep scan sees that a reputation check never could.

Flagged · High risk
From Michael Reyes, CEO <[email protected]>
Subject Confidential — need this handled today

Hi Sarah,

Are you at your desk? I'm tied up in a board call and can't take phone calls for the next few hours.

We're closing an acquisition and I need you to process a wire transfer of $48,500 to our new vendor before end of day. Please keep this confidential until the announcement — it's market-sensitive.

This needs to go out today. I'll send the updated banking details in a moment. Let me know once it's done.

Sent from my iPhone

Verdict
0RISK / 100
Business Email Compromise
classification confidence 98.4%
Why it fired
  • High · Reply-to address doesn't match the sender's display domain, a classic spoof tell.
  • High · Executive authority paired with an urgent, out-of-band wire request.
  • Med · Explicit secrecy ("keep this confidential") isolates the target from verifying.
  • Med · Time pressure ("today", "before end of day") to short-circuit due diligence.

Plain-English explanation: This message impersonates a senior executive to push an urgent, confidential wire transfer, the signature pattern of Business Email Compromise. The display name claims to be your CEO, but replies would route to an unrelated personal address. Verify any payment change through a known phone number before acting.
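The first signal in that list, a Reply-To that doesn't match the sender's domain, is simple enough to sketch with Python's standard `email` package. The function names and the sample addresses below are assumptions for illustration; the model weighs this signal alongside many others and does not rely on a single header rule.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(header_value: str) -> str:
    """Return the lower-cased domain of an address header, or '' if there is none."""
    _, address = parseaddr(header_value)
    if "@" not in address:
        return ""
    return address.rpartition("@")[2].lower()


def reply_to_mismatch(raw_message: str) -> bool:
    """True when a Reply-To header exists and points at a different domain than From."""
    msg = message_from_string(raw_message)
    reply_to = msg.get("Reply-To")
    if not reply_to:
        return False  # no Reply-To header: replies go back to the sender
    return domain_of(msg.get("From", "")) != domain_of(reply_to)
```

A mismatch alone is weak evidence, since plenty of legitimate mailing tools set a different Reply-To. It becomes meaningful when it coincides with executive authority and a payment request.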

The taxonomy

Six things the model can tell apart.

Naming the attack matters — a wire-fraud BEC needs a very different response than a credential-harvesting page. The model doesn't just say "bad"; it says which kind of bad.

Business Email Compromise · Critical

Impersonating an executive or vendor to redirect a payment — the costliest category by far.

Phishing · High

Luring a recipient to a fake login or page designed to capture data or credentials.

Payment Fraud · High

Fake invoices, altered banking details and "overdue" pressure aimed straight at accounts payable.

Credential Theft · High

Password-reset and account-security lures engineered to harvest logins.

Spoofing · Elevated

Forged sender identity — display-name tricks, look-alike domains, failed authentication.

Malware · Elevated

Attachments and links that aim to deliver a malicious payload onto a device.

04 — Why this design

A specialist classifier — not a chatbot.

It would be easy to throw every email at a giant general-purpose chatbot and ask "is this phishing?" We deliberately don't. A compact model fine-tuned for exactly one job is faster, cheaper, more private — and far safer in an adversarial setting where the input is, by definition, written by an attacker.

CIVRA · purpose-built model

Decides. Doesn't improvise.

  • Deterministic. The same email always gets the same verdict — essential for audits and trust.
  • Can't hallucinate. It outputs scores and labels — structured fields — never an invented sentence.
  • Immune to prompt injection. It scores the words "ignore previous instructions" — it doesn't obey them.
  • ~5 ms on ordinary hardware. No giant external model in the loop, so your mail stays close to home.
A general-purpose chatbot

Helpful — and that's the risk.

  • Same email, different answer — phrasing and randomness make it inconsistent.
  • Can confidently invent reasons or risk levels that aren't grounded in the message.
  • A crafted email can hijack its instructions — the email is the prompt.
  • Slower and pricier per message, and your mail leaves the building to be read.
Layer two — the sandbox detonates
05 — Detonation

Reading goes only so far. So we open it.

Some threats don't live in the words; they live behind a link or inside a file. A message can read as clean and still carry a URL that quietly redirects to a credential trap, or a document that runs the moment it's opened. When the model can't settle a live link or attachment, CIVRA stops guessing and detonates it: it opens the artifact inside a throwaway environment, fully sealed off from your network, and simply watches what it does. Real threats give themselves away when they think no one's looking.

Ephemeral session · isolated
00.0 ▸ Detonating link from "OneDrive file shared with you"
     hxxps://onedrive-share-docs[.]net/d/inv-48500
00.4 ↳ 302 redirect → hxxps://login.micros0ft-secure[.]com/auth
01.1 ◉ Renders a pixel-perfect Microsoft 365 sign-in page
01.4 ⚠ Password field posts to a non-Microsoft domain
02.1 ⚠ Test credentials submitted → POST 193.43.x.x
02.5 ✕ No legitimate Microsoft endpoint ever contacted
VERDICT — credential phishing · session destroyed
Behavioral verdict
Credential phishing, confirmed
behavioral confidence 99.1% · sandbox destroyed after capture
What it observed
  • High · Login form posts credentials to a domain unrelated to Microsoft.
  • High · Look-alike host impersonating the real sign-in page (a "0" for an "o").
  • Med · A 302 redirect cloaks the true destination from the recipient.
  • Med · Page renders only after referrer checks, which is deliberately evasive.
Indicators extracted
domain: onedrive-share-docs[.]net
domain: micros0ft-secure[.]com
ip: 193.43.x.x
↳ attached to the alert · pushed to your blocklist
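Two of the observations in that trace, the look-alike host and the off-domain form post, can be approximated with plain string checks. The sketch below is illustrative only: the substitution table, the trusted-suffix list and every function name are assumptions, and a real sandbox judges the rendered page's behaviour and not just its URLs.

```python
from typing import Iterable
from urllib.parse import urlsplit

# Hypothetical digit-for-letter swaps an attacker might use in a look-alike host.
HOMOGLYPHS = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def refang(url: str) -> str:
    """Undo the hxxp / [.] defanging used when reporting malicious URLs."""
    return url.replace("hxxp", "http", 1).replace("[.]", ".")


def host_of(url: str) -> str:
    """Lower-cased hostname of a (possibly defanged) URL, or '' if absent."""
    return (urlsplit(refang(url)).hostname or "").lower()


def looks_like_brand(host: str, brand: str) -> bool:
    """True when the brand name appears only after undoing digit swaps."""
    return brand not in host and brand in host.translate(HOMOGLYPHS)


def posts_off_domain(form_action: str, trusted_suffixes: Iterable[str]) -> bool:
    """True when a form's action URL is not on, or under, any trusted domain."""
    host = host_of(form_action)
    return not any(host == s or host.endswith("." + s) for s in trusted_suffixes)
```

Run against the landing URL in the trace, `looks_like_brand(host_of(url), "microsoft")` is true and `posts_off_domain(url, ["microsoft.com", "microsoftonline.com"])` is also true, which matches the two High findings above.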
What gets opened

Anything that could act

Suspicious links, multi-hop redirect chains and QR-code destinations (quishing), plus attachments — PDFs, Office documents with macros, and archives.

What it watches for

The behaviour, not the claim

Where a link really lands, fake login forms that post off-domain, files that write to disk or spawn processes, and quiet callbacks to attacker-controlled servers.

What it brings back

Proof, ready to act on

A behavioral verdict, a screenshot of the real landing page, and the indicators of compromise — domains, IPs, file hashes — all attached to the alert.
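As a rough picture of what "proof, ready to act on" might look like as data, the sketch below bundles a verdict with defanged domains, IPs and file hashes. The `Evidence` class and its field names are hypothetical; SHA-256 is assumed here as the hash, since the page does not say which one CIVRA uses.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def defang_domain(domain: str) -> str:
    """Make a domain non-clickable by bracketing its dots."""
    return domain.replace(".", "[.]")


def file_hash(payload: bytes) -> str:
    """SHA-256 hex digest of a captured file (hash choice is an assumption)."""
    return hashlib.sha256(payload).hexdigest()


@dataclass
class Evidence:
    verdict: str
    domains: List[str] = field(default_factory=list)
    ips: List[str] = field(default_factory=list)
    file_hashes: List[str] = field(default_factory=list)

    def add_domain(self, domain: str) -> None:
        # Stored defanged so the indicator is safe to paste into a ticket.
        self.domains.append(defang_domain(domain.lower()))

    def add_file(self, payload: bytes) -> None:
        self.file_hashes.append(file_hash(payload))
```

Storing indicators defanged keeps them safe to display in an alert while still being trivially convertible back for a blocklist.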

Sealed by design

A bomb range, not your office.

Ephemeral & disposable

A fresh environment is built for each detonation and destroyed the moment it's done. Nothing persists.

Off your network

It runs in complete isolation from your systems — whatever the threat does, it can never reach a real device.

Full behavioral capture

Every redirect, rendered page, form action, download, process and network call is recorded as evidence.

A definite answer

"This looks suspicious" becomes "this is a credential trap" — backed by what the threat actually did.

Guardrails

Both layers run inside a safety harness.

The input is, by definition, written by an attacker — so the whole pipeline is built to never trust it blindly.

Before

Input is sanitised

Injection patterns, null bytes and control characters are stripped first — nothing hidden in an email can tamper with how it's analysed.

During

Output is validated

Every result is checked against a strict schema — a known label, an in-range score, phrase spans that map to real text. Malformed output is rejected, never surfaced.

After

A human has the final say

High-risk verdicts land in a triage queue with the score, classification and reasons attached. Confirm, dismiss, or open an incident — every decision is logged.
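The "before" and "during" guardrails can be sketched as two small functions. This is a simplified illustration under stated assumptions: the label strings, the dictionary shape of a result and the function names are invented for the example, and the sanitiser only covers null bytes and control characters, not the injection-pattern stripping mentioned above.

```python
import unicodedata

# Hypothetical label set: the six threat types from the taxonomy, plus "safe".
ALLOWED_LABELS = {
    "business_email_compromise", "phishing", "payment_fraud",
    "credential_theft", "spoofing", "malware", "safe",
}


def sanitise(text: str) -> str:
    """Drop null bytes and other control characters, keeping newlines and tabs."""
    return "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )


def validate(result: dict, body: str) -> dict:
    """Reject any result that is not a known label, an in-range score and real spans."""
    if result.get("classification") not in ALLOWED_LABELS:
        raise ValueError("unknown classification label")
    score = result.get("risk_score")
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("risk score must be an integer from 0 to 100")
    for span in result.get("risky_phrases", []):
        start, end = span.get("start"), span.get("end")
        if not (isinstance(start, int) and isinstance(end, int)
                and 0 <= start < end <= len(body)):
            raise ValueError("phrase span does not map to the message text")
    return result
```

Raising on malformed output, instead of repairing it, mirrors the rule stated above: a result that fails the schema is rejected and never surfaced.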

Privacy by design

Your mail is read to protect it — and nothing more.

Read-only access

CIVRA can never send, edit, move or delete your mail. It only ever observes.

Encrypted at rest and in transit

AES-256 at rest, TLS in transit. Message bodies are encrypted the moment they're stored.

PII de-identified

Sensitive personal data is detected and stripped before it's retained or logged anywhere.

Strict tenant isolation

Your organisation's data is walled off at the database level — never co-mingled with anyone else's.

In production

The numbers behind every verdict.

97.2% · Classification accuracy · trailing 30 days
0.40% · False-positive rate · and declining
~5ms · Time to verdict · per message
Improves over time · learns from confirmed outcomes

See it read your inbox.

Connect Microsoft 365 or Google Workspace in minutes — read-only — and watch the deep scan catch what your filter misses. No security team required.
