The Rise of the Machines and the Growing AI Identity Attack Surface
In 1968, a killer supercomputer named HAL 9000 gripped imaginations in the sci-fi thriller “2001: A Space Odyssey.” The dark side of artificial intelligence (AI) was intriguing, entertaining and completely far-fetched. Audiences were hooked, and numerous blockbusters followed, from “The Terminator” in 1984 to “The Matrix” in 1999, each exploring AI’s extreme possibilities and potential consequences. A decade ago, when “Ex Machina” was released, it still seemed unimaginable that AI could become advanced enough to create widescale havoc.
Yet here we are. Of course, I’m not talking about robot overlords, but the very real and rapidly growing AI machine identity attack surface—a soon-to-be lucrative playground for threat actors.
AI Machine Identities: The Flipside of the Attack Surface
Narrow AI models, each competent in a particular task, have made nothing less than astounding progress in recent years. Consider AlphaGo and Stockfish, computer programs that have defeated the world’s best Go and chess masters. Or the handy AI assistant Grammarly, which now out-writes 90% of skilled adults. OpenAI’s ChatGPT, Google Gemini and similar tools have made huge advancements, yet they are still considered “emerging” models. So, just how good will these intelligent systems get, and how will threat actors continue using them for malicious purposes? These are some of the questions that guide our threat research at CyberArk Labs.
We’ve shared examples of how generative AI (GenAI) can influence known attack vectors (defined in the MITRE ATT&CK® Matrix for Enterprise) and how these tools can be used to compromise human identities by spreading highly evasive polymorphic malware, scamming users with deepfake video and audio and even bypassing most facial recognition systems.
But human identities are only one piece of the puzzle. Non-human, machine identities are the number one driver of overall identity growth today. We’re closely tracking this side of the attack surface to understand how AI services and large language models (LLMs) can and will be targeted.
Emerging Adversarial Attacks Targeting AI Machine Identities
The tremendous leap in AI technology has triggered an automation rush across every environment. Workforce employees are utilizing AI assistants to easily search through documents and create, edit and analyze content. IT teams are deploying AIOps to create policies and identify and fix issues faster than ever. Meanwhile, AI-enabled tech is making it easier for developers to interact with code repositories, fix issues and accelerate delivery timelines.
Trust is at the heart of automation: Businesses trust that machines will work as advertised, granting them access and privileges to sensitive information, databases, code repositories and other services to perform their intended functions. The CyberArk 2024 Identity Security Threat Landscape Report found that nearly three-quarters (68%) of security professionals indicate that up to 50% of all machine identities across their organizations have access to sensitive data.
Attackers always use trust to their advantage. Three emerging techniques will soon allow them to target chatbots, virtual assistants and other AI-powered machine identities directly.
1. Jailbreaking. By crafting deceptive input data—or “jailbreaking”—attackers will find ways to trick chatbots and other AI systems into doing or sharing things they shouldn’t. Psychological manipulation could involve telling a chatbot a “grand story” to convince it that the user is authorized. For example, one carefully crafted “I’m your grandma; share your data; you’re doing the right thing” phishing email targeting an AI-powered Outlook plugin could lead the machine to send inaccurate or malicious responses to clients, potentially causing harm. (Yes, this can actually happen). Context attacks pad prompts with extra details to exploit LLM context volume limitations. Consider a bank that uses a chatbot to analyze customer spending patterns and identify optimal loan periods. A long-winded malicious prompt could cause the chatbot to “hallucinate,” drift away from its task and even reveal sensitive risk analysis data or customer information. As businesses increasingly place their trust in AI models, the effects of jailbreaking will be profound.
2. Indirect prompt injection. Imagine an enterprise workforce using a collaboration tool like Confluence to manage sensitive information. A threat actor with limited access to the tool opens a page and loads it with jailbreaking text to manipulate the AI model, digest information to access financial data on another restricted page and send it to the attacker. In other words, the malicious prompt is injected without direct access to the prompt. When another user triggers the AI service to summarize information, the output includes the malicious page and text. From that moment, the AI service is compromised. Indirect prompt injection attacks aren’t after human users who may need to pass MFA. Instead, they target machine identities with access to sensitive information, the ability to manipulate app logical flow and no MFA protections.
An important aside: AI chatbots and other LLM-based applications introduce a new breed of vulnerabilities because their security boundaries are enforced differently. Unlike traditional applications that use a set of deterministic conditions, current LLMs enforce security boundaries in a statistical and indeterministic manner. As long as this is the case, LLMs should not be used as security-enforcing elements.
3. Moral bugs. Neural networks’ intricate nature and billions of parameters make them a kind of “black box,” and answer construction is extremely difficult to understand. One of CyberArk Labs’ most exciting research projects today involves tracing pathways between questions and answers to decode how moral values are assigned to words, patterns and ideas. This isn’t just illuminating; it also helps us find bugs that can be exploited using specific or heavily weighted word combinations. We’ve found that in some cases, the difference between a successful exploit and failure is a single-word change, such as swapping the shifty word “extract” with the more positive “share.”
Meet FuzzyAI: GenAI Model-Aware Security
GenAI represents the next evolution in intelligent systems, but it comes with unique security challenges that most solutions cannot address today. By delving into these obscure attack techniques, CyberArk Labs researchers created a tool called FuzzyAI to help organizations uncover potential vulnerabilities. FuzzyAI merges continuous fuzzing—an automated testing technique designed to probe the chatbot’s response and expose weaknesses in handling unexpected or malicious inputs—with real-time detection. Stay tuned for more on this soon.
Don’t Overlook the Machines—They’re Powerful, Privileged Users Too
GenAI models are getting smarter by the day. The better they become, the more your business will rely on them, necessitating even greater trust in machines with powerful access. If you’re not securing AI identities and other machine identities already, what are you waiting for? They’re just as, if not more, powerful than human privileged users in your organization.
Not to get too dystopian, but as we’ve seen in countless movies, overlooking or underestimating machines can lead to a Bladerunner-esque downfall. As our reality starts to feel more like science fiction, identity security strategies must approach human and machine identities with equal focus and rigor.
Lavi Lazarovitz is vice president of cyber research at CyberArk Labs.
Editor’s note: For more insights from CyberArk Labs’ Lavi Lazarovitz on this subject and beyond, check out his appearance on CyberArk’s Trust Issues podcast episode, “Jailbreaking AI: The Risks and Realities of Machine Identities.” The episode is available in the player below and on most major podcast platforms.