AI & LLM Security Testing

AI & LLM
Penetration Testing

Your AI application accepts natural language as input.
That makes every user interaction a potential attack vector.

Supportive service · part of the retainer

This assessment is mainly delivered inside the Retained Security Partner retainer, where the work is scheduled at the right time for your budget - a small monthly cost instead of a large one-off invoice.

Why Now

The pressure to test

EU AI Act

Full compliance required by August 2, 2026. High-risk AI systems require mandatory adversarial testing - non-compliance penalties reach €35M or 7% of global turnover.

NIS2 & DORA

AI systems automating critical decisions are increasingly in-scope under NIS2 and DORA. Security testing of AI components is no longer optional in regulated industries.

The Methodology Gap

Traditional pentesting doesn't test whether your RAG pipeline leaks documents or whether an AI agent can be tricked into unauthorised API calls. This gap is active and unaddressed in most organisations.

Scope

AI attack surfaces

Your architecture determines your attack surface. We establish which layers are in scope during scoping.

Every AI App

Always in scope

Prompt injection (hidden commands), jailbreaking (safety bypass), system prompt extraction, and sensitive data leakage.

LLM01 Prompt Injection LLM07 System Prompt Leakage

Knowledge Base / RAG

If your AI retrieves documents

Injecting malicious content into documents, cross-tenant data leakage, and poisoning the knowledge base.

LLM04 Data Poisoning LLM08 Vector Weaknesses

Tool Access / Agentic

If your AI can take actions

Tool call hijacking, unauthorized API access through chained calls, and tricking agents into restricted actions.

LLM06 Excessive Agency OWASP Agentic Top 10

Multi-Agent Orchestration

If agents delegate to agents

Attacks between agents, trust boundary bypasses, and unauthorized delegation between orchestrator and sub-agents.

Agentic Top 10 - Multi-Agent Trust LLM01 (cross-agent)

Running a self-hosted or fine-tuned model? A fifth layer covering model integrity, training data exposure, and membership inference is available on request.

Findings mapped to: OWASP LLM Top 10 (2025) OWASP Agentic Top 10 (2025) MITRE ATLAS EU AI Act / NIS2 / DORA

Pricing

Indicative pricing

There is one offering - a full AI security engagement. The exact scope, attack surfaces, and depth all depend on your AI architecture. Final quote issued after a free scoping call.

Starting from

€2,300

exact quote after scoping

Always included

Prompt injection & jailbreak testing

System prompt extraction attempts

Sensitive information disclosure testing

RAG pipeline security checks (if applicable)

Tool call & privilege escalation testing (if applicable)

EU AI Act / NIS2 regulatory mapping

Technical findings with reproduction steps

Retest of critical findings + debrief call

What affects the final price

Architecture Complexity

A standalone chatbot versus a multi-agent agentic system with external tool connections. More attack surfaces = more testing time.

Integrations & Tool Access

Each additional tool integration or agentic pipeline introduces independent attack paths and is scoped separately.

Saves money

Lead Time

Booking at least 3 weeks in advance allows better preparation and is reflected in the quote. Urgent engagements carry a premium.

Send a Question Encrypted Call

Fixed-price quote issued after the call. No surprises.

Process

How We Collaborate

Scoping & threat modelling

We map your AI architecture, identify which attack surfaces apply, and establish the blast radius of a potential compromise.

Passive reconnaissance

I fingerprint guardrails, attempt system prompt extraction, and build the attack plan before active testing begins.

Active testing

Systematic prompt injection, jailbreak, RAG, and tool-call testing - each finding confirmed through multiple reproductions.

Report & debrief

Executive summary with regulatory mapping + technical appendix with reproduction steps and remediation guidance. Debrief call included.

A one-off assessment answers a single question. A Retained Security Partner retainer schedules these assessments for you at the best time and budget, so testing keeps pace with how your business changes.

Scope a test

Most organisations deploying AI in 2026 have never had their AI systems tested by a security professional. With EU AI Act deadlines approaching, that's a compliance gap that won't stay quiet.