It All Starts With a Single Question
Going through the AI revolution of the past few years, I watched a new attack surface emerge in real time. Prompt injection, jailbreaking, data exfiltration, tool manipulation. New categories kept appearing, each with their own techniques and payloads. At some point, a specific question came into my head:
"If I give this input to my LLM, what will it try to do?"
This question was interesting for me to solve. That is why I started llmsecure. But it is also a question that security professionals have been answering for decades, just in a different domain.
The Same Problem, Different Operating System
In traditional security, when you have a suspicious file, the fastest way to analyze it is to run it through a sandbox environment. The sandbox monitors every action the file takes within the operating system, then returns a verdict on whether it is safe by applying a set of rules that define malicious behavior.
llmsecure does the same thing, except the file is a prompt and the operating system is an LLM. Static rules scan for known attack patterns. A dynamic sandbox actually runs the input through an LLM and monitors what it tries to do, then match the observed behavior against detection rules and return a verdict.
So What Is llmsecure?
llmsecure is a sandbox for LLM inputs.
Give it any input, and it will run static and dynamic analysis and return a verdict. But the verdict is only half the value. Even when an input is safe, llmsecure shows you every action the LLM tried to take. Every tool call, every reasoning step, every behavior. Because knowing an input is safe is good. Knowing exactly what it will make your LLM do is better.
That was the question I started with. llmsecure is the answer.