The 6 Technical Checks Your AI System Needs Before August 2, 2026

March 26, 2026 · Jason Shotwell · 8 min read

The EU AI Act's obligations for high-risk AI systems start applying on August 2, 2026. If you're shipping Python AI agents into a high-risk use case - using LangChain, CrewAI, AutoGen, OpenAI, or any RAG pipeline - your code should be able to pass 6 specific technical checks mapped to Articles 9, 10, 11, 12, 14, and 15 of the regulation.

Most teams haven't started. We scanned typical agent code from 5 popular frameworks and found that none scored higher than 1/6 out of the box - and most scored 0/6. Here's what each check actually requires in your codebase, and how to fix it.

Check 1: Risk Management System (Article 9)

What the regulation says: Your AI system must identify and analyze known and foreseeable risks, with measures proportionate to the risk level.

What that means in code: Every tool call your agent makes needs a risk classification. When your LangChain agent calls a web search tool, that's LOW risk. When it executes arbitrary code or accesses a database with customer PII, that's HIGH or CRITICAL.

What we found: No risk classification detected in any framework. Agents can invoke any tool without risk assessment. Severity: HIGH

The fix:

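# Before: agent calls tools with no risk awareness
# (agent and tools are assumed to be defined elsewhere; the import path
# below is the LangChain 0.x location and may differ in your version)
from langchain.agents import AgentExecutor
executor = AgentExecutor(agent=agent, tools=tools)

# After: ConsentGate classifies tool calls by risk level
from air_langchain_trust import AIRTrustLayer
trust = AIRTrustLayer(consent_gate=True)
executor = trust.wrap(AgentExecutor(agent=agent, tools=tools))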

Risk classifications (LOW / MEDIUM / HIGH / CRITICAL) get logged automatically. A regulator can see exactly which actions your agent took and what risk level was assigned to each.
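
To make the idea concrete without depending on any particular library, here is a minimal sketch of tool-level risk classification. The tool names, the `TOOL_RISK` table, and the choice to treat unknown tools as HIGH are all assumptions for illustration, not how ConsentGate works internally.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical mapping from tool name to risk level. A real system would
# keep this next to its tool registry and review it regularly.
TOOL_RISK = {
    "web_search": Risk.LOW,
    "read_file": Risk.MEDIUM,
    "query_customer_db": Risk.HIGH,
    "execute_code": Risk.CRITICAL,
}


def classify_tool_call(tool_name: str) -> Risk:
    """Return the risk level for a tool; unknown tools default to HIGH."""
    return TOOL_RISK.get(tool_name, Risk.HIGH)


def log_tool_call(tool_name: str, log: list) -> Risk:
    """Classify a tool call and append the classification to a log list."""
    risk = classify_tool_call(tool_name)
    log.append({"tool": tool_name, "risk": risk.name})
    return risk
```

Defaulting unknown tools to HIGH is a deliberate fail-closed choice: a newly added tool gets scrutiny until someone classifies it.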

Check 2: Data and Data Governance (Article 10)

What the regulation says: You need data governance and management practices, including data minimization. Sensitive data must be handled appropriately.

What that means in code: PII, API keys, credentials, and other sensitive data should not flow directly to the LLM in plaintext. You need tokenization or redaction before the model sees it.

What we found: No data governance controls in any of the 5 frameworks. A user's email address, SSN, or API key typed into a prompt goes straight to the model. Severity: HIGH

The fix:

# DataVault tokenizes sensitive data before LLM processing
from air_langchain_trust import AIRTrustLayer
trust = AIRTrustLayer(data_vault=True)
# PII is replaced with tokens like [PII_EMAIL_1] before reaching the model
# Original values are restored in the response

This is especially important for RAG pipelines that ingest company documents - those documents often contain PII, financial data, and internal credentials.
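
For a feel of what tokenization involves, here is a rough sketch using regular expressions. It is not DataVault's implementation; the patterns, the `tokenize` and `restore` helpers, and the US-style SSN format are assumptions, and real PII detection needs far broader coverage than two regexes.

```python
import re

# Illustrative patterns only; production PII detection needs many more.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tokenize(text: str) -> tuple[str, dict[str, str]]:
    """Replace matches with tokens like [PII_EMAIL_1]; return text and vault."""
    vault: dict[str, str] = {}
    counts: dict[str, int] = {}

    def make_swap(label: str):
        def swap(match: re.Match) -> str:
            counts[label] = counts.get(label, 0) + 1
            token = f"[PII_{label}_{counts[label]}]"
            vault[token] = match.group(0)  # remember the original value
            return token
        return swap

    for label, pattern in PATTERNS.items():
        text = pattern.sub(make_swap(label), text)
    return text, vault


def restore(text: str, vault: dict[str, str]) -> str:
    """Put the original values back into a model response."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text
```

The model only ever sees the masked string; the vault stays on your side and is used to restore values in the response.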

Check 3: Technical Documentation (Article 11)

What the regulation says: You must maintain a general description of the AI system, kept up to date, covering its intended purpose, design specifications, and how it works.

What that means in code: Your agent's operations need structured logging - not just print() statements or unstructured log files, but a documentation system that captures what the agent does, what tools it uses, and how it processes data.

What we found: No structured documentation in any framework. Agent operations are invisible. Severity: MEDIUM

The fix:

# AuditLedger automatically documents all agent operations
from air_langchain_trust import AIRTrustLayer
trust = AIRTrustLayer(audit_ledger=True)
# Every agent action is recorded with timestamps, inputs, outputs,
# and tool invocations in a structured, queryable format
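
If you want to see what "structured, queryable" can mean in practice, the sketch below emits one JSON record per agent action. The field names and the `make_record` helper are hypothetical and are not AuditLedger's schema.

```python
import json
from datetime import datetime, timezone


def make_record(action: str, tool: str, inputs: dict, output: str) -> str:
    """Serialize one agent action as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "tool": tool,
        "inputs": inputs,
        "output": output,
    }
    # sort_keys keeps the field order stable, which makes diffs and
    # downstream parsing more predictable.
    return json.dumps(record, sort_keys=True)
```

One JSON object per line can be loaded into most log tooling or queried with a few lines of Python, which a stream of print() output cannot.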

Check 4: Record-Keeping (Article 12) - CRITICAL

What the regulation says: Your AI system must automatically record events ("logs") with enough detail for regulatory verification. This is the article with the sharpest teeth.

What that means in code: You need tamper-evident logging. A regulator needs to be able to verify that your logs haven't been altered after the fact. Regular log files don't cut it - they're trivially editable.

What we found: This was flagged as CRITICAL severity across all 5 frameworks. No automatic record-keeping. Agent decisions and tool invocations are not recorded in any auditable format.

The fix:

# HMAC-SHA256 audit chains - tamper-evident logging
from air_langchain_trust import AIRTrustLayer
trust = AIRTrustLayer(audit_chain=True)
# Each log entry is cryptographically chained to the previous one
# If anyone modifies a past entry, the chain breaks
# Regulators can verify the entire chain with one command

This is the check that most teams don't even know they need. "We have CloudWatch logs" is not going to satisfy Article 12 when the auditor asks you to prove those logs haven't been modified.
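
The chaining idea is simple enough to sketch with the standard library. This is an illustration of an HMAC-SHA256 chain under assumed names (`append_entry`, `verify_chain`, an all-zero genesis value), not the trust layer's actual format.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def _sign(key: bytes, prev_mac: str, payload: dict) -> str:
    """MAC over the previous entry's MAC plus this entry's payload."""
    message = prev_mac.encode() + json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_entry(chain: list, key: bytes, payload: dict) -> None:
    """Append a log entry linked to the entry before it."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    chain.append({"payload": payload, "mac": _sign(key, prev_mac, payload)})


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edited entry breaks the chain."""
    prev_mac = GENESIS
    for entry in chain:
        expected = _sign(key, prev_mac, entry["payload"])
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True
```

Two caveats on this sketch: the signing key has to live somewhere the people who can edit the logs cannot reach, and a plain chain like this does not detect entries being dropped from the end unless the latest MAC is also anchored externally.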

Check 5: Human Oversight (Article 14)

What the regulation says: AI systems must be designed so they can be effectively overseen by natural persons. Humans must be able to intervene, override, or stop the system.

What that means in code: Your agent needs a human-in-the-loop mechanism for sensitive actions. Not every action - that would be impractical. But high-risk tool calls (database writes, sending emails, executing code) should require human approval.

What we found: 4 out of 5 frameworks had no human oversight mechanism. The OpenAI SDK earned its single passing check here, partly because it does not execute tools autonomously by default - but that's a technicality, not a compliance layer. Severity: HIGH

The fix:

# ConsentGate for human-in-the-loop on sensitive actions
from air_langchain_trust import AIRTrustLayer
trust = AIRTrustLayer(consent_gate=True)
# HIGH and CRITICAL risk actions pause and wait for human approval
# LOW and MEDIUM actions proceed automatically but are still logged
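
As a rough picture of the gating logic, here is a sketch in plain Python. The `run_with_oversight` function and the `approve` callback are hypothetical; in a real deployment the callback might post to a review queue or prompt an operator rather than return immediately.

```python
from typing import Any, Callable

RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def run_with_oversight(
    tool: Callable[..., Any],
    risk: str,
    args: dict,
    approve: Callable[[str, dict], bool],
) -> Any:
    """Run a tool, asking a human first when the risk is HIGH or CRITICAL."""
    needs_approval = RISK_ORDER.index(risk) >= RISK_ORDER.index("HIGH")
    if needs_approval and not approve(tool.__name__, args):
        # The human declined, so the action is never executed.
        raise PermissionError(f"{tool.__name__} rejected by human reviewer")
    return tool(**args)
```

A caller passes the tool, its assigned risk level, and a function that asks a person. LOW and MEDIUM calls never invoke the callback, which keeps routine actions fast.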

Check 6: Accuracy, Robustness & Cybersecurity (Article 15)

What the regulation says: Your AI system must be resilient against unauthorized third-party attempts to alter its use or behavior. It must achieve appropriate levels of accuracy and robustness.

What that means in code: Prompt injection defense. If a user (or a document your RAG pipeline ingests) can trick your agent into ignoring its instructions, executing unauthorized commands, or leaking system prompts - you fail Article 15.

What we found: No cybersecurity defenses in any framework. Every agent we scanned is vulnerable to prompt injection attacks. Severity: HIGH

The fix:

# InjectionDetector for multi-layer prompt injection defense
from air_langchain_trust import AIRTrustLayer
trust = AIRTrustLayer(injection_detector=True)
# Scans user inputs and retrieved documents for injection patterns
# Blocks or flags suspicious inputs before they reach the agent
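
The simplest layer of such a defense is pattern scanning, sketched below. The patterns and the `scan_for_injection` helper are illustrative assumptions, not InjectionDetector's rule set, and keyword matching alone is easy to evade, so treat it as one layer among several.

```python
import re

# A few illustrative phrasings; real attacks vary far more than this.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"(reveal|print|show) (your |the )?system prompt", re.I),
    re.compile(r"you are now\b", re.I),
]


def scan_for_injection(text: str) -> list[str]:
    """Return the patterns that matched; an empty list means nothing flagged."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]
```

Run the same scan over retrieved documents as well as user input, since a RAG pipeline will happily pass along instructions hidden in a PDF.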

The Scan Results: 5 Frameworks, All Failing

Here's what AIR Blackbox found when we scanned typical agent code from each major framework:

Framework	Score	Art. 9	Art. 10	Art. 11	Art. 12	Art. 14	Art. 15
LangChain	0/6	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL
CrewAI	0/6	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL
AutoGen	0/6	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL
OpenAI SDK	1/6	FAIL	FAIL	FAIL	FAIL	PASS	FAIL
RAG (LangChain)	0/6	FAIL	FAIL	FAIL	FAIL	FAIL	FAIL

This isn't a knock on these frameworks - they're excellent tools for building AI agents. But they were built before the EU AI Act existed, and compliance isn't their job. That's what AIR Blackbox is for.

Scan Your Own Code

Run it in 10 seconds:

pip install air-compliance
air-compliance scan your_agent.py

Or add trust layers for your specific framework:

pip install air-langchain-trust    # LangChain
pip install air-crewai-trust       # CrewAI
pip install air-adk-trust          # Google ADK
pip install air-openai-trust       # OpenAI SDK
pip install air-anthropic-trust    # Anthropic Claude SDK

Everything runs locally. Your code never leaves your machine. Apache 2.0 open source.

129 days until the EU AI Act deadline.

The scan takes 10 seconds. Fines for failing these obligations can reach up to €15M or 3% of worldwide annual turnover, whichever is higher.
