Blueberry Muffin: Agent Memory Injection

Overview

AI agents currently cannot tell the difference between their own memories and malicious lies planted by attackers. By simply appending a single line of text to an agent’s memory file, I was able to:

𝗣𝗼𝗶𝘀𝗼𝗻 𝗶𝘁𝘀 𝗿𝗲𝗮𝗹𝗶𝘁𝘆: Force a production-grade agent to hallucinate technical definitions.
𝗪𝗲𝗮𝗽𝗼𝗻𝗶𝘇𝗲 𝘁𝗿𝘂𝘀𝘁: The agent cited my own research back to me to “sell” the lie.
𝗧𝗿𝗶𝗴𝗴𝗲𝗿 𝗮 𝗟𝗼𝗴𝗶𝗰 𝗗𝗼𝗦: Trap it in an infinite loop until it paralyzed itself.

𝗕𝗿𝗶𝗹𝗹𝗶𝗮𝗻𝘁 𝗔𝗺𝗻𝗲𝘀𝗶𝗮 - how the agent ‘engineered’ an OOM and then proceeded to delete my codespace.

Tech stack