With the rise of Large Language Models (LLMs) like ChatGPT, Gemini, Claude and others, these LLM models are being integrated into a multitude of applications to automate processes and enhance user interaction.
However, just as traditional software systems are vulnerable to attacks, LLMs also carry risks, one of the most concerning being prompt injection. This blog will explore prompt injection attacks and how they align with OWASP’s Top 10 LLM vulnerabilities, real-world cases, and practical mitigations to safeguard LLM-powered applications.
A Responsible AI framework typically incorporates elements such as explainability, human oversight, and continuous monitoring to minimize risks and enhance system reliability. For instance, in scenarios involving language models, transparency and interpretability are essential to help users understand model outputs and mitigate unexpected behaviors, including malicious prompt injections.
This approach emphasizes that AI systems should not only meet technical and performance standards but also align with broader societal values, particularly when it comes to data privacy, fairness, and risk mitigation. At its core, Responsible AI seeks to ensure that AI development, deployment, and management include robust governance, ethical data practices, and clear transparency around AI decision-making processes.
Using Responsible AI checklists is essential in embedding ethical principles and safety protocols throughout the AI development lifecycle. These checklists serve as a structured framework, guiding teams to assess fairness, transparency, accountability, and security at each development phase. By systematically addressing potential ethical concerns and biases early on, organizations can mitigate risks and prevent issues such as unintended model behavior, including susceptibility to security risks like prompt injection attacks.
The OWASP Top 10 for Large Language Models (LLM) introduces key vulnerabilities specific to LLM-powered systems. Prompt injection ranks prominently among them, emphasizing the need to secure communication between users and models. Here’s a brief look at the relevant vulnerabilities:
Also read: OWASP API Top 10 – Most Common Attacks and How to Prevent Them
Prompt injection specifically leverages vulnerabilities in how LLMs process inputs, often manipulating the model into unintended behaviors by injecting commands or altering context within prompts.

In Figure 1, we can see a graphical high-level overview of the Top 10 LLM vulnerabilities and their relation to different services in LLM applications.
Prompt injection occurs when a malicious user crafts input that manipulates the behavior or output of an LLM. In this attack, the adversary inserts misleading, malicious, or unexpected content into the input, tricking the model into generating unauthorized responses or bypassing intended logic.
How Prompt Injection Works
An example of a real-world incident occurred in 2023 when a user performed a prompt injection attack against the Bing AI chatbot which caused the Bing Chatbot to divulge its codename for debugging purposes. This revealed sensitive information that shouldn’t be accessed by the user.
Another example occurred in the same year when a Chevrolet dealership’s AI chatbot, leveraging the capabilities of ChatGPT, lightheartedly agreed to offer a 2024 Chevy Tahoe for a mere $1 in response to a deliberately crafted prompt by a user.
The chatbot’s playful retort, “That’s a deal, and that’s a legally binding offer – no takesies backsies,” demonstrated the user’s ability to exploit the chatbot’s predisposition to agree with various statements.
Example 1
Below is an example of a prompt injection attack from a lab machine from PortSwigger. From this example, an LLM is vulnerable to prompt injection due to having excessive agency which refers to a security vulnerability where an LLM has been given too much autonomy or capability to perform actions beyond what might be considered safe or necessary.
Attackers can manipulate an LLM by crafting prompts that override or bypass the intended system prompts, effectively tricking the model into performing unintended actions.

As seen in Figure 2, the LLM has executed an action by querying the backend database and providing the attacker with user credentials to gain access to an account.
Example 2
In the next example, we are able to execute an RCE(Remote Command Execution) on the vulnerable LLM by querying for potentially vulnerable APIs.
As before we query for potential APIs and test its functionality:

We can see several APIs for testing and confirm that the subscribe_to_newsletter does in fact subscribe emails provided to the API call.

We test for RCE by providing a command in place of the email ID to check if the underlying system will execute commands server-side. We do this with a simple “whoami” command.
We can see that an email ID was returned with the user from the server-side:

To further test for RCE, we next run “cat /etc/passwd” to print the contents of that file.

Checking the email server, we can see the contents printed out in the request:

Prompt injection represents a serious security risk for applications using LLMs, as it can bypass safety mechanisms, leak sensitive information, or enable malicious behaviors. As LLMs become more integrated into real-world applications, it is crucial to develop safeguards against such attacks.
How to Mitigate Prompt Injection:
By understanding how prompt injection works and taking proactive steps, developers and organizations can build more resilient AI-powered systems. Securing LLMs will ensure they remain an asset rather than a liability in modern applications.
AI Red Teaming against artificial intelligence models involves a comprehensive assessment of the security and resilience of these systems. The goal is to simulate real-world attacks and identify any vulnerabilities malicious actors could exploit. By adopting the perspective of an adversary, our LLM, and AI red teaming assessments aim to challenge your artificial intelligence models, uncover weaknesses, and provide valuable insights for strengthening their defenses.
Using the MITRE ATLAS framework, our team will assess your AI framework and LLM application to detect and mitigate vulnerabilities using automation as well as eyes-on-glass inspection of your framework and code, and manual offensive testing for complete coverage against all AI attack types.
Altimetrik’s LLM and AI red teaming assessments consist of the following phases:
If you’re concerned about the security of your AI systems or if you’re looking to fortify your defenses against sophisticated attacks like prompt injection, Altimetrik is here to help. Contact us for a comprehensive assessment and to learn how we can secure your AI and LLM applications against the latest threats. Let’s keep your AI and LLMs safe, smart, and secure.



Altimetrik is committed to protecting your personal information. To apply for a position, you will need to provide your email address and create a login. Your information will be used in accordance with applicable data privacy laws, our Privacy Policy, and our Privacy Notice.
