protectwith.ai is a typed knowledge graph and MCP server for AI security — covering both AI for security and security for AI, across personal, SMB, and enterprise contexts. Built entirely on Cloudflare: D1, Vectorize, Workers AI, Durable Objects, and Access.
Three layers built entirely on Cloudflare primitives: a typed knowledge graph, an MCP server, and a coordinated agent fleet.
A typed knowledge graph authored in Markdown with YAML frontmatter. A compiler parses, validates, graphs, chunks, and embeds everything before it enters storage.
Knowledge served to AI agents over the Model Context Protocol. A Cloudflare-native McpAgent (Durable Object) backed by D1 and Vectorize, protected by Cloudflare Access.
A coordinated set of security agents built on the MCP server, with human-on-the-loop oversight and least-privilege tool governance.
Every tool is read-only by default. AI agents connect to mcp.protectwith.ai/mcp via the Model Context Protocol; Cloudflare Access verifies identity before any tool call reaches the server.
Semantic question-answering over the knowledge base. Embeds the query with Workers AI (bge-base), searches Vectorize by cosine similarity, returns the most relevant entity chunks with provenance.
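The retrieval step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: the three-dimensional vectors are hand-written stand-ins for bge-base embeddings, and `cosine`, `top_k`, the entity IDs, and the example.com source URLs are hypothetical. In the deployed system the embedding and nearest-neighbour search are handled by Workers AI and Vectorize.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=2):
    """Return the k chunks most similar to the query, best first, with provenance."""
    scored = [(cosine(query_vec, c["vector"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"entity_id": c["entity_id"], "source": c["source"], "score": s}
        for s, c in scored[:k]
    ]

# Hypothetical chunks; real embeddings have hundreds of dimensions.
CHUNKS = [
    {"entity_id": "threat-a", "source": "https://example.com/a", "vector": [1.0, 0.0, 0.0]},
    {"entity_id": "control-b", "source": "https://example.com/b", "vector": [0.0, 1.0, 0.0]},
    {"entity_id": "threat-c", "source": "https://example.com/c", "vector": [0.9, 0.1, 0.0]},
]

results = top_k([1.0, 0.05, 0.0], CHUNKS, k=2)
for r in results:
    print(r["entity_id"], round(r["score"], 4), r["source"])
```

Each result keeps its source URL next to the score, which mirrors the "with provenance" part of the description.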
Fetch a single entity by its unique ID. Returns the full structured record from D1 — type, content, relationships, provenance, and verification status.
Traverse typed relationships in the knowledge graph. Given an entity ID and an optional relationship type (e.g. mitigates, implemented_by), returns all connected entities.
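A rough sketch of that traversal over an in-memory edge list follows. The `Edge` tuple, the `related` function, and the entity IDs are hypothetical, and treating edges as traversable in both directions is an assumption; the page does not say how direction is handled.

```python
from collections import namedtuple

Edge = namedtuple("Edge", ["source", "rel", "target"])

# Hypothetical edges; the IDs are illustrative only.
EDGES = [
    Edge("control-1", "mitigates", "threat-1"),
    Edge("control-1", "implemented_by", "primitive-1"),
    Edge("control-2", "mitigates", "threat-1"),
]

def related(entity_id, rel=None, edges=EDGES):
    """Return (relationship, neighbour) pairs for every edge touching entity_id.

    If rel is given, only edges of that relationship type are returned.
    """
    out = []
    for e in edges:
        if rel is not None and e.rel != rel:
            continue
        if e.source == entity_id:
            out.append((e.rel, e.target))
        elif e.target == entity_id:
            out.append((e.rel, e.source))
    return out

print(related("control-1"))
print(related("threat-1", rel="mitigates"))
```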
Cross-framework mapping: translate a concept or identifier from one standard to its equivalents in another, then follow the graph to concrete controls and the Cloudflare primitives that implement them.
Enumerate entities in the knowledge base, optionally filtered by type or segment. Returns IDs, names, and types — useful for agent discovery before a deeper kb_get or kb_related.
Advisory, KB-grounded classification: match an input (e.g. a prompt) against known threat patterns and return likely threats plus recommended controls. Advisory only — not a real-time detector.
Generate the applicable security control checklist for a described component (e.g. "a Worker calling an LLM"), with the Cloudflare primitives that implement each control.
Map controls to the compliance requirements they satisfy across NIST AI RMF, the EU AI Act, and ISO/IEC 42001.
// Query: map MITRE ATLAS AML.T0051 through the full control chain
tool:  kb_map
input: "AML.T0051"  // LLM Prompt Injection
→ Step 1  resolve
    MITRE ATLAS  AML.T0051  "LLM Prompt Injection"
→ Step 2  maps_to
    OWASP LLM  LLM01:2025  "Prompt Injection"
→ Step 3  mitigates (controls)
    → CTL-001  Input validation & sanitization    verified
    → CTL-007  Prompt boundary enforcement        verified
    → CTL-012  Output filtering before execution  verified
→ Step 4  implemented_by (Cloudflare primitives)
    → Cloudflare AI Gateway  rate-limit, log, redact PII
    → Workers AI             sandboxed inference
    → Cloudflare WAF         block known injection patterns
// All results carry source URLs and verification status
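The four steps in the trace above amount to following three edge types in sequence. The sketch below reproduces that chain over a small in-memory edge list. It is an illustration only: `kb_map_sketch`, `targets`, and `sources` are hypothetical names, and the pairing of each control with a specific primitive is invented, since the trace does not say which primitive implements which control.

```python
# Edges as (source, relationship, target).
EDGES = [
    ("AML.T0051", "maps_to", "LLM01:2025"),
    ("CTL-001", "mitigates", "LLM01:2025"),
    ("CTL-007", "mitigates", "LLM01:2025"),
    ("CTL-012", "mitigates", "LLM01:2025"),
    # Assumed pairings, for illustration only.
    ("CTL-001", "implemented_by", "Cloudflare WAF"),
    ("CTL-007", "implemented_by", "Cloudflare AI Gateway"),
    ("CTL-012", "implemented_by", "Workers AI"),
]

def targets(source, rel):
    """Entities reached by following rel forward from source."""
    return [t for s, r, t in EDGES if s == source and r == rel]

def sources(target, rel):
    """Entities that point at target through rel."""
    return [s for s, r, t in EDGES if t == target and r == rel]

def kb_map_sketch(identifier):
    """Resolve an identifier, then walk maps_to, mitigates, implemented_by."""
    equivalents = targets(identifier, "maps_to")
    controls = sorted({c for eq in equivalents for c in sources(eq, "mitigates")})
    primitives = sorted({p for c in controls for p in targets(c, "implemented_by")})
    return {
        "resolved": identifier,
        "maps_to": equivalents,
        "controls": controls,
        "primitives": primitives,
    }

result = kb_map_sketch("AML.T0051")
print(result)
```

Note that the controls are found by following `mitigates` edges backwards from the risk, because the edge points from the control to the thing it mitigates.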
Every layer of protectwith.ai was designed to resist the attack patterns it documents. Defense-in-depth applies to the system itself.
All requests to the MCP server go through Cloudflare Access. Identity is asserted via a signed JWT — the server never trusts a user-supplied field or header.
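The principle here (identity comes from a verified token, never from a caller-supplied field) can be sketched as follows. To stay runnable with the standard library alone, this example substitutes a shared-secret HMAC signature (HS256) for the public-key signature that a real deployment would verify against the Access signing keys, so treat it as a stand-in rather than the server's actual check. The function names, the demo secret, and the `X-User` header are hypothetical; `Cf-Access-Jwt-Assertion` is the header in which Cloudflare Access forwards its token.

```python
import base64
import hashlib
import hmac
import json
import time

def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_demo_token(claims: dict, secret: bytes) -> str:
    """Issue an HS256 token for the demo (stand-in for a token issued by Access)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(mac)}"

def verified_email(token: str, secret: bytes, audience: str, now: float):
    """Return the email claim only if signature, audience, and expiry all check out."""
    try:
        header, payload, signature = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(signature)):
            return None
        claims = json.loads(b64url_decode(payload))
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return None
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences or claims.get("exp", 0) <= now:
        return None
    return claims.get("email")

def authorize(headers: dict, secret: bytes, audience: str, now=None):
    """Derive identity from the signed token only; all other headers are ignored."""
    now = time.time() if now is None else now
    return verified_email(headers.get("Cf-Access-Jwt-Assertion", ""), secret, audience, now)

SECRET = b"demo-secret"
token = sign_demo_token({"aud": ["app"], "exp": 2000, "email": "a@example.com"}, SECRET)
print(authorize({"Cf-Access-Jwt-Assertion": token}, SECRET, "app", now=1000))
print(authorize({"X-User": "admin@example.com"}, SECRET, "app", now=1000))
```

A request that only carries a self-declared `X-User` header gets no identity at all, which is the behaviour the paragraph describes.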
All eight MCP tools are read-only. This directly mitigates MCP "tool poisoning" — an attacker who gains access to the server cannot write, modify, or delete knowledge.
Public/internal scope separation is enforced three independent ways: a Vectorize metadata filter, a D1 query filter, and a per-entity get() re-check after retrieval.
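The value of three independent checks is that any one of them can fail without leaking an internal entity. The sketch below models each layer as a plain function over in-memory data; all names and records are hypothetical, and the real layers are a Vectorize metadata filter, a D1 query filter, and the per-entity re-check.

```python
ENTITIES = {
    "pub-1": {"scope": "public", "name": "Public control"},
    "int-1": {"scope": "internal", "name": "Internal note"},
}
VECTOR_INDEX = [
    {"entity_id": "pub-1", "scope": "public"},
    {"entity_id": "int-1", "scope": "internal"},
]

def vector_search(allowed_scopes):
    """Layer 1: metadata filter applied to the vector index."""
    return [v["entity_id"] for v in VECTOR_INDEX if v["scope"] in allowed_scopes]

def db_query(ids, allowed_scopes):
    """Layer 2: scope filter applied in the database query."""
    return [i for i in ids if i in ENTITIES and ENTITIES[i]["scope"] in allowed_scopes]

def get(entity_id, allowed_scopes):
    """Layer 3: re-check the scope of each entity after retrieval."""
    entity = ENTITIES.get(entity_id)
    if entity is None or entity["scope"] not in allowed_scopes:
        return None
    return entity

def search(allowed_scopes, vector_hits=None):
    """Run all three layers; vector_hits lets a test simulate a layer-1 failure."""
    hits = vector_search(allowed_scopes) if vector_hits is None else vector_hits
    rows = db_query(hits, allowed_scopes)
    return [e for e in (get(i, allowed_scopes) for i in rows) if e is not None]

public = {"public"}
print([e["name"] for e in search(public)])
# Simulate the vector filter being bypassed: later layers still hold.
print([e["name"] for e in search(public, vector_hits=["pub-1", "int-1"])])
```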
A CI validation step rejects any pull request that introduces dangling relationships, illegal edge types, or a "verified" claim without accompanying source references.
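The three rejection rules can be expressed as one validation pass over the entity set. This is a minimal sketch, not the project's validator: the entity shape, the `sources` and `relationships` keys, and the contents of `ALLOWED_EDGES` are assumptions (the edge names are the ones mentioned on this page; the full set is not given).

```python
# Edge types mentioned on this page; the complete list is an assumption.
ALLOWED_EDGES = {"mitigates", "implemented_by", "maps_to"}

def validate(entities):
    """Return a list of error strings; an empty list means the graph passes."""
    errors = []
    for eid, ent in entities.items():
        for rel, target in ent.get("relationships", []):
            if rel not in ALLOWED_EDGES:
                errors.append(f"{eid}: illegal edge type '{rel}'")
            if target not in entities:
                errors.append(f"{eid}: dangling relationship to '{target}'")
        if ent.get("verification_status") == "verified" and not ent.get("sources"):
            errors.append(f"{eid}: 'verified' without source references")
    return errors

good = {
    "CTL-001": {
        "verification_status": "verified",
        "sources": ["https://example.com/ref"],
        "relationships": [("mitigates", "LLM01:2025")],
    },
    "LLM01:2025": {"verification_status": "unverified", "relationships": []},
}
bad = {
    "CTL-X": {
        "verification_status": "verified",
        "sources": [],
        "relationships": [("destroys", "missing-entity")],
    },
}

print(validate(good))
for err in validate(bad):
    print(err)
```

A CI job would fail the pull request whenever the returned list is non-empty.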
Every entity carries explicit source URLs and a verification status field (verified, unverified, or time-sensitive). Agents can expose this to end users.
47 automated tests, including adversarial scope-isolation scenarios that try to leak internal entities through crafted queries to confirm that isolation holds.
API keys, access credentials, and private strategy documents are never committed to the repository. Secrets are injected via environment variables at deploy time.
Security research is verified adversarially before it enters the knowledge base — claims are checked against primary sources; contested or evolving material is marked time-sensitive.
protectwith.ai maps its knowledge graph to the leading AI security and governance frameworks — so agent outputs can cite the exact standard and control they reference.
The ten most critical security risks in systems that use large language models, from prompt injection to unbounded consumption.
Used: Each OWASP LLM risk is a first-class entity in the knowledge graph, linked to mitigating controls and Cloudflare primitives that implement them. Cross-mapped to MITRE ATLAS via kb_map.
Fifteen specific threat patterns that arise when AI systems operate as autonomous agents — including orchestration hijacking, excessive agency, and resource misuse.
Used: T1–T15 are mapped as threat entities; the agent fleet design (read-only tools, human-on-the-loop, least-privilege governance) directly addresses T2, T5, T9, and T11.
An emerging standard for the most critical risks specific to agentic AI deployments, covering multi-agent orchestration and tool-use vulnerabilities.
Used: Applied to the design of the agent fleet on the roadmap. Each top-10 risk is tracked as a knowledge entity with controls and implementation status.
A structured approach to identifying, assessing, and managing risks across the full AI lifecycle (Govern, Map, Measure, Manage).
Used: Governance practices (provenance tracking, adversarial verification, reproducible compiler) map to GOVERN and MEASURE functions. The CSA Agentic NIST AI RMF Profile guides agent fleet design.
Adversarial Threat Landscape for AI Systems — a knowledge base of adversarial ML tactics and techniques, structured similarly to MITRE ATT&CK.
Used: ATLAS tactics and techniques are entities in the graph. kb_map can translate an ATLAS ID (e.g. AML.T0051) to the corresponding OWASP risk, mitigating controls, and Cloudflare implementation.
An international standard for establishing, implementing, and continually improving an AI management system within an organization.
Used: The knowledge compiler's idempotency, CI validation, and provenance requirements reflect 42001's emphasis on documented, repeatable AI processes and risk controls.
The European Union's landmark regulation classifying AI systems by risk level and requiring transparency, human oversight, and conformity assessments for high-risk deployments.
Used: Transparency requirements informed the verification_status field and provenance sources on every entity. Human-on-the-loop in the agent fleet addresses the Act's human oversight requirements.
The practices protectwith.ai documents are also the practices it follows — the system is built to embody the guidance it gives.
Security controls at every layer: identity gate, read-only tools, scope isolation, schema validation, and adversarial testing — no single control is relied upon exclusively.
AI agents are granted only the access they need. All MCP tools are read-only; write capabilities require explicit, separately governed tooling.
Identity is always verified via Cloudflare Access signed JWTs. No user-supplied header, field, or parameter is trusted for authorization decisions.
Each entity in the knowledge graph carries source URLs and one of three statuses: verified, unverified, or time-sensitive. Agents surface this to users.
Security research is checked against primary sources before entering the knowledge base. Claims that can't be independently verified are marked unverified or excluded.
The boundary between public and internal knowledge is not just enforced — it's actively tested with adversarial queries designed to probe for leakage.
The knowledge compiler produces identical output from identical input. Any entity can be re-compiled at any time with the same result — no hidden state.
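One common way to get this property is to serialize each entity canonically and compare content hashes. The sketch below shows that idea under stated assumptions: `compile_entity` and `digest` are hypothetical names, and JSON with sorted keys stands in for whatever output format the real compiler emits.

```python
import hashlib
import json

def compile_entity(entity: dict) -> str:
    """Serialize with sorted keys and fixed separators so output depends only on content."""
    return json.dumps(entity, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

def digest(entity: dict) -> str:
    """SHA-256 of the canonical form; equal content gives an equal digest."""
    return hashlib.sha256(compile_entity(entity).encode("utf-8")).hexdigest()

# Same content, different key order.
a = {"id": "CTL-001", "type": "control", "sources": ["https://example.com/x"]}
b = {"sources": ["https://example.com/x"], "type": "control", "id": "CTL-001"}

print(compile_entity(a) == compile_entity(b))
print(digest(a) == digest(b))
```

Keeping timestamps, random IDs, and environment-dependent values out of the compiled output is what makes a check like this meaningful.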
API keys, access credentials, and private configuration are injected at deploy time via environment variables — never stored in the repository or in compiled artifacts.
Internal roadmaps, threat models, and strategic planning documents are kept out of the public repository — only the platform itself and its documented behaviors are public.