The EU AI Act's requirements for most high-risk AI systems start applying on August 2, 2026. If you're shipping Python AI agents - using LangChain, CrewAI, AutoGen, OpenAI, or any RAG pipeline - and your system falls into a high-risk category, your code should be able to pass 6 technical checks that we map to Articles 9, 10, 11, 12, 14, and 15 of the regulation.
Most teams haven't started. We scanned typical agent code from 5 popular frameworks and found that none scored higher than 1/6 out of the box - and most scored 0/6. Here's what each check actually requires in your codebase, and how to fix it.
What the regulation says (Article 9): Your AI system must identify and analyze known and foreseeable risks, with measures proportionate to the risk level.
What that means in code: Every tool call your agent makes needs a risk classification. When your LangChain agent calls a web search tool, that's LOW risk. When it executes arbitrary code or accesses a database with customer PII, that's HIGH or CRITICAL.
What we found: No risk classification detected in any framework. Agents can invoke any tool without risk assessment. Severity: HIGH
The fix:
```python
# `agent` and `tools` are assumed to be defined earlier in your code;
# AgentExecutor is LangChain's agent runner.

# Before: agent calls tools with no risk awareness
executor = AgentExecutor(agent=agent, tools=tools)

# After: ConsentGate classifies tool calls by risk level
from air_langchain_trust import AIRTrustLayer

trust = AIRTrustLayer(consent_gate=True)
executor = trust.wrap(AgentExecutor(agent=agent, tools=tools))
```
Risk classifications (LOW / MEDIUM / HIGH / CRITICAL) get logged automatically. A regulator can see exactly which actions your agent took and what risk level was assigned to each.
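To make the idea concrete, here is a minimal standard-library sketch of per-tool risk classification. It is not how ConsentGate is implemented; the tool names, the risk table, and the helper functions are all hypothetical and only illustrate the LOW / MEDIUM / HIGH / CRITICAL mapping described above.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical tool names and levels, chosen for illustration only.
TOOL_RISK = {
    "web_search": Risk.LOW,
    "read_file": Risk.MEDIUM,
    "query_customer_db": Risk.HIGH,
    "execute_code": Risk.CRITICAL,
}


def classify_tool_call(tool_name: str) -> Risk:
    """Return the risk level for a tool; unknown tools default to HIGH."""
    return TOOL_RISK.get(tool_name, Risk.HIGH)


def log_tool_call(tool_name: str, log: list) -> Risk:
    """Classify a tool call and append the decision to an in-memory log."""
    risk = classify_tool_call(tool_name)
    log.append({"tool": tool_name, "risk": risk.name})
    return risk
```

Defaulting unknown tools to HIGH is a deliberate choice in this sketch: a tool nobody has assessed should not be treated as safe.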
What the regulation says (Article 10): You need data governance and management practices, including data minimization. Sensitive data must be handled appropriately.
What that means in code: PII, API keys, credentials, and other sensitive data should not flow directly to the LLM in plaintext. You need tokenization or redaction before the model sees it.
What we found: No data governance controls in any of the 5 frameworks. A user's email address, SSN, or API key typed into a prompt goes straight to the model. Severity: HIGH
The fix:
```python
# DataVault tokenizes sensitive data before LLM processing
from air_langchain_trust import AIRTrustLayer

trust = AIRTrustLayer(data_vault=True)
# PII is replaced with tokens like [PII_EMAIL_1] before reaching the model
# Original values are restored in the response
```
This is especially important for RAG pipelines that ingest company documents - those documents often contain PII, financial data, and internal credentials.
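As a rough illustration of what tokenization means here, the sketch below swaps emails and US SSNs for placeholder tokens and restores them afterwards. It is an assumption-laden example, not DataVault's implementation: the two regexes are deliberately simplistic, and `tokenize` and `restore` are hypothetical helper names.

```python
import re

# Simplified patterns for illustration; real PII detection needs far more coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tokenize(text: str) -> tuple[str, dict[str, str]]:
    """Replace matched values with tokens; return masked text and a token->value vault."""
    vault: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        tokens_by_value: dict[str, str] = {}

        def swap(match, label=label, tokens_by_value=tokens_by_value):
            value = match.group(0)
            if value not in tokens_by_value:
                # Repeated values reuse the same token.
                token = f"[PII_{label}_{len(tokens_by_value) + 1}]"
                tokens_by_value[value] = token
                vault[token] = value
            return tokens_by_value[value]

        text = pattern.sub(swap, text)
    return text, vault


def restore(text: str, vault: dict[str, str]) -> str:
    """Put the original values back into a model response."""
    for token, value in vault.items():
        text = text.replace(token, value)
    return text
```

In this sketch only the masked text would be sent to the model; the vault stays on your side and is used to restore values in the response.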
What the regulation says (Article 11): You must maintain a general description of the AI system, kept up to date, covering its intended purpose, design specifications, and how it works.
What that means in code: Your agent's operations need structured logging - not just print() statements or unstructured log files, but a documentation system that captures what the agent does, what tools it uses, and how it processes data.
What we found: No structured documentation in any framework. Agent operations are invisible. Severity: MEDIUM
The fix:
```python
# AuditLedger automatically documents all agent operations
from air_langchain_trust import AIRTrustLayer

trust = AIRTrustLayer(audit_ledger=True)
# Every agent action is recorded with timestamps, inputs, outputs,
# and tool invocations in a structured, queryable format
```
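For contrast with print() statements, here is a small sketch of what a structured record of one agent action could look like, serialized as a JSON line. The `AgentEvent` fields and the `to_json_line` helper are hypothetical and are not AuditLedger's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentEvent:
    """One agent action, with the fields a reviewer would want to query."""
    agent: str
    tool: str
    inputs: dict
    output: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_json_line(event: AgentEvent) -> str:
    """Serialize an event as a single JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

Writing one JSON object per line keeps the log easy to append to and easy to load into whatever query tool you already use.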
What the regulation says (Article 12): Your AI system must automatically record events ("logs") with enough detail for regulatory verification. This is the article with the sharpest teeth.
What that means in code: You need tamper-evident logging. A regulator needs to be able to verify that your logs haven't been altered after the fact. Regular log files don't cut it - they're trivially editable.
What we found: This was flagged as CRITICAL severity across all 5 frameworks. No automatic record-keeping. Agent decisions and tool invocations are not recorded in any auditable format.
The fix:
```python
# HMAC-SHA256 audit chains - tamper-evident logging
from air_langchain_trust import AIRTrustLayer

trust = AIRTrustLayer(audit_chain=True)
# Each log entry is cryptographically chained to the previous one
# If anyone modifies a past entry, the chain breaks
# Regulators can verify the entire chain with one command
```
This is the check that most teams don't even know they need. "We have CloudWatch logs" is not going to satisfy Article 12 when the auditor asks you to prove those logs haven't been modified.
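The chaining idea can be sketched with Python's standard `hmac` and `hashlib` modules. This is a minimal illustration of an HMAC-SHA256 chain, not the library's implementation; the entry layout, the `GENESIS` value, and the function names are assumptions.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _mac(key: bytes, prev_mac: str, payload: dict) -> str:
    """HMAC-SHA256 over the previous MAC plus the canonical JSON payload."""
    message = prev_mac.encode() + json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_entry(chain: list, key: bytes, payload: dict) -> None:
    """Append a payload, binding it to the MAC of the entry before it."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    chain.append({"payload": payload, "mac": _mac(key, prev_mac, payload)})


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edited entry breaks verification."""
    prev_mac = GENESIS
    for entry in chain:
        expected = _mac(key, prev_mac, entry["payload"])
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True
```

Note that a scheme like this is only tamper-evident against someone who does not hold the key, so in practice the key has to be stored separately from the logs it protects.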
What the regulation says (Article 14): AI systems must be designed so they can be effectively overseen by natural persons. Humans must be able to intervene, override, or stop the system.
What that means in code: Your agent needs a human-in-the-loop mechanism for sensitive actions. Not every action - that would be impractical. But high-risk tool calls (database writes, sending emails, executing code) should require human approval.
What we found: 4 out of 5 frameworks had no human oversight mechanism. The OpenAI SDK passed this check, which accounts for its 1/6 score, but only because it doesn't execute tools autonomously by default - that's a technicality, not a compliance layer. Severity: HIGH
The fix:
```python
# ConsentGate for human-in-the-loop on sensitive actions
from air_langchain_trust import AIRTrustLayer

trust = AIRTrustLayer(consent_gate=True)
# HIGH and CRITICAL risk actions pause and wait for human approval
# LOW and MEDIUM actions proceed automatically but are still logged
```
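A human-in-the-loop gate can be as simple as a wrapper that asks before running risky tools. The sketch below assumes risk levels are passed as plain strings and that approval comes from a callback; `run_tool` and `console_approver` are hypothetical names, not part of any of the packages mentioned here.

```python
from typing import Callable

REQUIRES_APPROVAL = {"HIGH", "CRITICAL"}


def run_tool(
    tool: Callable[..., str],
    risk: str,
    approve: Callable[[str], bool],
    *args,
) -> str:
    """Run a tool, pausing for human approval when the risk level demands it."""
    if risk in REQUIRES_APPROVAL:
        description = f"{tool.__name__}{args}"
        if not approve(description):
            raise PermissionError(f"Human reviewer rejected: {description}")
    return tool(*args)


def console_approver(description: str) -> bool:
    """Simplest possible approver: ask on the terminal."""
    return input(f"Allow {description}? [y/N] ").strip().lower() == "y"
```

Passing the approver in as a callback means the same gate can be backed by a terminal prompt, a Slack message, or a ticket queue without changing the agent code.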
What the regulation says (Article 15): Your AI system must be resilient against unauthorized third-party attempts to alter its use or behavior. It must achieve appropriate levels of accuracy and robustness.
What that means in code: Prompt injection defense. If a user (or a document your RAG pipeline ingests) can trick your agent into ignoring its instructions, executing unauthorized commands, or leaking system prompts - you fail Article 15.
What we found: No cybersecurity defenses in any framework. Every agent we scanned is vulnerable to prompt injection attacks. Severity: HIGH
The fix:
```python
# InjectionDetector for multi-layer prompt injection defense
from air_langchain_trust import AIRTrustLayer

trust = AIRTrustLayer(injection_detector=True)
# Scans user inputs and retrieved documents for injection patterns
# Blocks or flags suspicious inputs before they reach the agent
```
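To show what pattern scanning looks like at its most basic, here is a sketch that flags a few well-known injection phrasings in user input or retrieved documents. The pattern list and the `find_injection_patterns` helper are hypothetical, and a regex pass like this is only one weak layer; it should not be read as how InjectionDetector works or as a sufficient defense on its own.

```python
import re

# A handful of illustrative patterns; real attacks are far more varied.
SUSPICIOUS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"(reveal|print|show) (your|the) system prompt", re.I),
    re.compile(r"you are now\b", re.I),
]


def find_injection_patterns(text: str) -> list[str]:
    """Return the patterns that match, so callers can block or flag the input."""
    return [p.pattern for p in SUSPICIOUS if p.search(text)]
```

Run the same check on retrieved documents as on user prompts, since a RAG pipeline treats both as model input.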
Here's what AIR Blackbox found when we scanned typical agent code from each major framework:
| Framework | Score | Art. 9 | Art. 10 | Art. 11 | Art. 12 | Art. 14 | Art. 15 |
|---|---|---|---|---|---|---|---|
| LangChain | 0/6 | FAIL | FAIL | FAIL | FAIL | FAIL | FAIL |
| CrewAI | 0/6 | FAIL | FAIL | FAIL | FAIL | FAIL | FAIL |
| AutoGen | 0/6 | FAIL | FAIL | FAIL | FAIL | FAIL | FAIL |
| OpenAI SDK | 1/6 | FAIL | FAIL | FAIL | FAIL | PASS | FAIL |
| RAG (LangChain) | 0/6 | FAIL | FAIL | FAIL | FAIL | FAIL | FAIL |
This isn't a knock on these frameworks - they're excellent tools for building AI agents. But they were built before the EU AI Act existed, and compliance isn't their job. That's what AIR Blackbox is for.
Run it in 10 seconds:
```bash
pip install air-compliance
air-compliance scan your_agent.py
```
Or add trust layers for your specific framework:
```bash
pip install air-langchain-trust   # LangChain
pip install air-crewai-trust      # CrewAI
pip install air-adk-trust         # Google ADK
pip install air-openai-trust      # OpenAI SDK
pip install air-anthropic-trust   # Anthropic Claude SDK
```
Everything runs locally. Your code never leaves your machine. It's open source under the Apache 2.0 license.