Most “AI security” tools sold today get one thing right and one thing wrong. They are right that an LLM can do an enormous amount of the routine work of an engagement — recon, evidence capture, finding documentation, report drafting. They are wrong about who runs the engagement. The default answer is the model. We think the default answer is the operator.
Battle Ready Armor is built around that distinction, and today we’re releasing the free tier — BRA Slim — as a single self-contained binary you can download, point at a target you’re authorized to assess, and run.
The thing AI security tools keep missing
It is not actually hard to get an LLM to “do a pentest.” It will happily run port scans, fetch web pages,
propose payloads, summarize what it found. The hard part is the same as it has always been: making sure the thing under test is in scope, the action you’re about to take is authorized, the evidence you captured is
preserved, and the rules your customer signed up for actually hold.
When the operator is the executor, those guardrails live in their head. When the model is the executor, the
guardrails have to live somewhere mechanical. A system prompt is not somewhere mechanical. A tool description is not somewhere mechanical. “I told the model to be careful” is not control. It’s a hope.
BRA’s position: control is explicit state, not prompt-only or SDK hook compliance. Approval tokens are files on disk that gate every relevant operation. Scope is a structured document the agent’s actions are checked against. Findings flow into a governed schema with hash-linked evidence. The framework holds the line whether the operator is watching or not.
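To make "explicit state" concrete: here is a minimal sketch of what a file-backed approval gate could look like. The `ApprovalGate` class and the `.token` naming are illustrative assumptions, not BRA's actual API — the point is only that approval is a file the operator created, checkable by code, rather than an instruction the model might follow.

```python
import os
import tempfile

class ApprovalGate:
    """Illustrative file-backed approval gate (hypothetical names):
    an operation may run only if a matching token file exists on disk."""

    def __init__(self, token_dir):
        self.token_dir = token_dir

    def is_approved(self, operation):
        # The token is explicit state the framework can check mechanically,
        # not a promise buried in a system prompt.
        return os.path.exists(os.path.join(self.token_dir, operation + ".token"))

    def grant(self, operation):
        # Operator-side action; nothing the model emits can call this.
        open(os.path.join(self.token_dir, operation + ".token"), "w").close()

tokens = tempfile.mkdtemp()
gate = ApprovalGate(tokens)
print(gate.is_approved("nmap_scan"))   # False: no token, no scan
gate.grant("nmap_scan")
print(gate.is_approved("nmap_scan"))   # True: explicit, inspectable state
```

Because the gate reads the filesystem at call time, revoking approval is as simple as deleting the token — no prompt engineering involved.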
What’s in BRA Slim
The free tier ships the full engagement loop:
- Scope intake wizard. Privacy posture, target name + version, in-scope IPs/CIDRs/URLs/file paths, out-of-scope exclusions, time window, written-authorization attestation, active-testing permission, per-engagement safety-approval matrix. Identifiable values (IPs, hostnames, credentials) are auto-detected, surfaced for review, and anonymized before they reach the model.
- Live engagement, governed every step. The agent runs reconnaissance, captures evidence to a versioned artifact directory, and proposes tool calls. Each tool call hits a governance check before it executes — destructive ops, scope violations, missing approval tokens, dangerous-command patterns. Subdomains and unrelated targets that surface mid-engagement trigger an in-app modal asking the operator to grant Full / Limited / Deny / Suggest.
- Operator-recorded overrides. Every approval decision can be promoted into an override rule the framework remembers — “always allow WebFetch for this engagement,” “deny anything that touches *.staging.example.com,” “session-allow Bash curl invocations.” Rules are inspectable and revocable.
- Three living records. Intel (services, endpoints, infrastructure facts), Leads (investigation threads worth pursuing), and Findings (confirmed vulnerabilities). Every entry carries a hash-linked reference to the originating tool call and the raw captured output.
- Structured report generation. When you end the engagement, you select which findings to include, complete a report intake (researcher, vendor/client, contact), and the framework produces an executive summary, technical summary, intelligence highlights, prioritized recommendations, an immediate / short-term / long-term remediation roadmap, and a vendor disclosure timeline.
- Operator profile (Battle Card view). The same data, rendered as a trading-card style spread you can use to brief stakeholders or demo the framework to a team.
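The operator-recorded overrides described above could be modeled roughly as an ordered list of pattern rules. This is a sketch under stated assumptions — the rule fields, pattern syntax, and first-match-wins semantics are hypothetical, chosen only to show why such rules are inspectable and revocable.

```python
import fnmatch

# Hypothetical shape for promoted override rules: each approval decision
# the operator promotes becomes an inspectable (decision, tool, pattern) entry.
RULES = [
    {"decision": "deny",  "tool": "*",        "target": "*.staging.example.com"},
    {"decision": "allow", "tool": "WebFetch", "target": "*"},
]

def check_override(tool, target):
    """First matching rule wins; None means fall back to asking the operator."""
    for rule in RULES:
        if fnmatch.fnmatch(tool, rule["tool"]) and fnmatch.fnmatch(target, rule["target"]):
            return rule["decision"]
    return None

print(check_override("WebFetch", "api.staging.example.com"))  # deny
print(check_override("WebFetch", "example.com"))              # allow
print(check_override("Bash", "example.com"))                  # None
```

Revoking a rule is deleting an entry from the list — the agent's behavior changes without touching a prompt.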
You get one agent backend in Slim — Anthropic / Claude direct — and the regex+dictionary anonymization mode (Dumb Scope / Map Scan). Premium adds multi-provider routing, on-host LLM, NER-driven scope and live-chat anonymization, the full knowledge surface (skills / tools / methodologies / workflows), the operator control plane (debug, harness explorer, live diagnostics), and the rest of the governance-token catalog.
Two design choices worth calling out
Anonymization is a layer, not a feature. When you tell BRA the target is gainsec.com, the framework mints an anonymous noun (otter) before anything reaches the model. The agent’s entire view — scope document, tool calls, conversation history — uses the anonymized form. Outbound tool invocations are rehydrated to the real value at execution time. Findings, intel, and reports are stored against the anonymized form on disk; the operator-only anon_map.json is what reverses it.
The agent never sees the real target name. If the framework stops, the model never had it. If the model is
replaced tomorrow, the layer doesn’t change. That’s the kind of property “tell the model not to leak” can never give you.
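The anonymization layer described above can be sketched as a bidirectional mapping applied at the boundary. This is a simplified illustration, not BRA's implementation — the `AnonLayer` class is hypothetical, and the real framework's alias minting and `anon_map.json` handling will differ.

```python
import json

class AnonLayer:
    """Illustrative anonymization layer (hypothetical API): aliases on
    everything outbound to the model, real values only at tool execution."""

    def __init__(self):
        self.real_to_alias = {}
        self.alias_to_real = {}

    def register(self, real, alias):
        self.real_to_alias[real] = alias
        self.alias_to_real[alias] = real

    def anonymize(self, text):
        # Everything the model sees passes through here first.
        for real, alias in self.real_to_alias.items():
            text = text.replace(real, alias)
        return text

    def rehydrate(self, text):
        # Applied only when a tool call actually executes.
        for alias, real in self.alias_to_real.items():
            text = text.replace(alias, real)
        return text

layer = AnonLayer()
layer.register("gainsec.com", "otter")
prompt = layer.anonymize("Enumerate subdomains of gainsec.com")
cmd = layer.rehydrate("dig +short otter")
print(prompt)  # Enumerate subdomains of otter
print(cmd)     # dig +short gainsec.com
# operator-only record, in the spirit of anon_map.json, that reverses it
print(json.dumps(layer.alias_to_real))
```

The property the post claims falls out of the structure: the mapping lives outside the model, so the model's transcript never contains the real value.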
Governance survives replacement. Models change. Tools change. Operators change. UIs change. A governance model that depends on any of those is not a governance model — it’s a configuration. BRA’s controls are checked by mechanical layers (approval-token files, scope-validated tool gates, append-only evidence files) that don’t care which model is running, which tool is being invoked, or which operator is at the keyboard. Replace any of them and the controls hold.
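The append-only, hash-linked evidence files mentioned above follow a standard pattern: each record commits to the hash of the one before it, so tampering anywhere breaks every later link. The record fields below are assumptions for illustration; only the chaining technique is the point.

```python
import hashlib
import json

def append_evidence(log, entry):
    """Append a record that commits to the previous record's hash
    (illustrative schema, not BRA's actual evidence format)."""
    prev = log[-1]["hash"] if log else "0" * 64
    body = json.dumps({"prev": prev, "entry": entry}, sort_keys=True)
    log.append({"prev": prev, "entry": entry,
                "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify(log):
    """Re-derive every hash; any edit to history breaks the chain."""
    prev = "0" * 64
    for rec in log:
        body = json.dumps({"prev": prev, "entry": rec["entry"]}, sort_keys=True)
        if rec["prev"] != prev or rec["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = rec["hash"]
    return True

log = []
append_evidence(log, {"tool": "nmap", "output_sha256": "..."})
append_evidence(log, {"tool": "curl", "output_sha256": "..."})
print(verify(log))             # True
log[0]["entry"]["tool"] = "x"  # tamper with history
print(verify(log))             # False
```

Note that verification needs nothing but the file itself — no model, no particular tool, no particular operator — which is exactly the replacement-survival property the paragraph argues for.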
How to run it
Download the binary for your platform from the Releases page, make it executable, and run:
./bra-slim --bra-dir ~/bra-engagement
Open the URL it prints, drop your Anthropic API key into the Config tab, click Operations → Start New
Engagement, and walk through the wizard. The first engagement on a real target you own should take ~5 minutes from launch to your first finding.
macOS Apple Silicon and Linux x86_64 binaries ship today. ~80 MB each. No installer, no service to manage, no source to compile.
What this is, and what it is not
BRA Slim is not a magic vulnerability scanner. It does not find bugs you couldn’t have found yourself with curl, nmap, and an afternoon. What it does is take the routine, time-consuming part of an engagement — the hour you spend pulling DNS records, the second hour writing up findings, the third hour producing a report — and compress it into something you can do alongside a model without giving up the controls that make the work defensible.
The AI is advisory. The operator is authoritative. The framework is the thing in the middle making sure that
distinction holds.
Get it on GitHub: HERE
Read more about it HERE
END TRANSMISSION
