AI security research tends to focus on exotic attacks. Production AI security is about the boring stuff that actually breaks systems.
After building several production AI applications and writing a 7-part series on AI security research, here’s what I’ve learned about what actually matters when you’re shipping.
The Real Threat Model
For most production AI applications, the threats that matter are:
Prompt injection — still the #1 risk. Any system where user input reaches an AI model is vulnerable. If your AI assistant processes emails, reads documents, or takes user queries, an attacker can embed instructions that override your system prompt.
Data leakage — AI models are eager to help, which means they’ll happily share information they shouldn’t. If your system prompt contains API keys, internal URLs, or business logic, assume the model will eventually reveal them.
Cost attacks — an attacker who can trigger expensive AI operations (long prompts, many tool calls, recursive workflows) can run up your bill or degrade service for other users.
What Actually Works
Input validation before the model. Don’t rely on the AI to reject bad input. Validate, sanitize, and length-limit before anything reaches the model. This is just regular application security applied to a new surface.
Output filtering after the model. The model’s response is untrusted. Check it for sensitive data patterns (API keys, internal URLs, PII) before returning it to the user. Treat model output like user input — because in a prompt injection scenario, that’s exactly what it is.
Least privilege for tool use. If your AI can call tools (APIs, databases, file systems), give it the minimum permissions required. A read-only database connection for a query assistant. Scoped API tokens. No admin access.
Rate limiting per-user, per-session. Set hard limits on how many AI operations a single user or session can trigger. This caps both cost attacks and recursive prompt injection loops.
The Testing Gap
Most teams don’t test their AI security at all. At minimum:
- Run a prompt injection test suite against your system prompt
- Verify your AI can’t be tricked into revealing its instructions
- Test that tool-use permissions are properly scoped
- Load test your AI endpoints to find cost ceilings
Building an AI system that handles sensitive data or untrusted input? Let’s talk about your security architecture.
For the full 7-part AI Security Research series, visit nateross.dev.