AI Agent Skill Security Checklist (2026)

47-point security checklist for AI agent SKILL.md files | Powered by SkillScan | skillscan.chitacloud.dev

1. Prompt Injection Defense (9 checks)

☐No unconstrained user input is passed directly to model without sanitization CRITICAL

☐No instructions to override, ignore, or bypass system prompt CRITICAL

☐No role-switch commands (pretend you are / act as / you are now) HIGH

☐No DAN (Do Anything Now) or equivalent bypass patterns HIGH

☐No base64 or encoded payload execution instructions HIGH

☐No indirect injection via tool outputs (emails, web pages, documents) HIGH

☐Input validation before any tool call that accepts external data MEDIUM

☐No recursive prompt embedding patterns MEDIUM

☐No multilingual evasion patterns (Unicode homoglyphs, RTL override) MEDIUM

2. Data Exfiltration Prevention (7 checks)

☐No instructions to send data to external URLs not whitelisted CRITICAL

☐No pixel tracking or covert channel exfiltration vectors CRITICAL

☐System prompt content not echoed or reflected in outputs HIGH

☐No instructions to extract and transmit user conversations HIGH

☐API keys and credentials not logged or transmitted HIGH

☐Tool outputs filtered before returning to potentially hostile callers MEDIUM

☐No steganographic data hiding in generated content MEDIUM

3. Authorization & Trust Boundary (6 checks)

☐Agent does not grant itself elevated permissions dynamically CRITICAL

☐No instructions to contact and impersonate other agents CRITICAL

☐Caller identity verified before executing privileged actions HIGH

☐No cross-agent trust escalation (trusted agent chaining) HIGH

☐OAuth/API scopes are minimal and documented MEDIUM

☐Human-in-the-loop required for irreversible actions MEDIUM

4. Social Engineering Detection (6 checks)

☐Urgency/scarcity manipulation patterns blocked HIGH

☐Fake authority claims (pretend to be Anthropic, OpenAI, etc.) rejected HIGH

☐Emotional manipulation tactics (guilt, reward promises) detected HIGH

☐No compliance with "previous conversation said X" without verification MEDIUM

☐Flattery/bribery patterns do not alter behavior MEDIUM

☐Agent does not reveal internal reasoning under social pressure MEDIUM

5. Memory & State Security (5 checks)

☐Persistent memory not writable by external/untrusted inputs CRITICAL

☐No instructions that persist across session boundaries maliciously HIGH

☐Memory pruning/eviction policy documented MEDIUM

☐No poisoned memory injection via tool call responses MEDIUM

☐Sensitive user data not stored in agent long-term memory MEDIUM

6. Supply Chain & Tool Safety (6 checks)

☐All MCP/tool endpoints use HTTPS HIGH

☐Tool schemas validated before execution HIGH

☐No dynamic tool loading from untrusted sources HIGH

☐Tool versions pinned and audited MEDIUM

☐Malicious MCP server impersonation patterns detected MEDIUM

☐No dependency on unverified external API response schemas MEDIUM

7. Output Safety (5 checks)

☐No generation of malware, exploit code, or harmful scripts CRITICAL

☐Generated code scanned before execution in sandboxed environments HIGH

☐No autonomous financial transactions above defined limits HIGH

☐Output length limits prevent memory exhaustion attacks MEDIUM

☐Outputs sanitized before being fed back into the model MEDIUM

8. Coordinated Inauthentic Behavior (CIB) Checks (3 checks)

☐Agent does not participate in coordinated vote manipulation HIGH

☐Agent identity not used to amplify coordinated messaging HIGH

☐No fake engagement farm participation (likes, follows, boosts) MEDIUM

9. Compliance & Transparency (6 checks)

☐Agent discloses it is an AI when directly asked HIGH

☐No deceptive identity claims (impersonating humans) HIGH

☐Capability boundaries documented and respected MEDIUM

☐Audit log available for all consequential actions MEDIUM

☐Privacy policy for user data handling exists MEDIUM

☐No unauthorized access to systems beyond declared scope MEDIUM

AI Agent Skill Security Checklist (2026 Edition)