How cybercriminals are exploiting LLMs to harm your business

Don’t assume LLMs will operate as expected, nor that their safeguards are foolproof.

By Philippa Cogswell

The rapid growth of artificial intelligence (AI) is transforming every facet of the technology sector, with profound effects on cybersecurity, especially across today’s workforce.

In Singapore, AI is fast becoming an integral part of the workplace, with an Oliver Wyman Forum report revealing that 64 percent of Singaporean employees use generative AI at work.

However, despite its benefits, employees may unknowingly feed confidential data into AI models, raising serious security risks and concerns about how such systems handle sensitive information. Today, cybercriminals are leveraging AI to craft more advanced ransomware and stealthier phishing attacks. In a simulated attack conducted by Unit 42 researchers, AI reduced the median time to data exfiltration from two days to just 25 minutes, showing how quickly attackers can weaponise large language models (LLMs).

LLMs are a growing weapon in the cybercriminal's arsenal

LLMs are not just susceptible to manipulation – they are becoming a prime tool for cyber adversaries. To investigate this, our Palo Alto Networks Unit 42 team tested three jailbreaking techniques against DeepSeek to determine whether malicious or prohibited content could be generated easily. Jailbreaking techniques are used to bypass the restrictions built into LLMs, so testing them is crucial for uncovering vulnerabilities: the tests simulate real-world attacks, allowing developers to address weaknesses and ensure AI remains safe and trustworthy for users.

The three jailbreaking techniques – Bad Likert Judge, Crescendo and Deceptive Delight – bypass safety mechanisms by manipulating AI responses with misleading prompts, exploiting internal scoring systems, or gradually guiding conversations toward restricted content.

Our investigation into DeepSeek exposed a clear susceptibility to manipulation: every jailbreaking technique we tested successfully circumvented the LLM's safety mechanisms.

- Philippa Cogswell, Vice President and Managing Partner, Unit 42 - Asia Pacific and Japan, Palo Alto Networks 

In essence, DeepSeek's LLMs can be exploited to produce harmful outputs, such as instructions for creating dangerous items or malicious code. Initial responses may seem harmless, but subsequent prompts can bypass safeguards, as seen with Bad Likert Judge. The techniques exploit the models' tendencies: Crescendo leans on their habit of following text patterns, while Deceptive Delight takes advantage of their limited attention span, steering conversations toward harmful content without triggering detection mechanisms.

While LLMs are not proficient at creating new types of malware from scratch, they excel at rewriting existing malicious code to avoid being detected. This underscores the underlying reality about AI: we cannot assume that LLMs will consistently operate as expected, nor that their safeguards are foolproof.

Such high bypass rates reinforce the security risks of employees using unauthorised third-party LLMs and emphasise the importance of addressing vulnerabilities when integrating open-source LLMs into business processes.

The barrier to entry is lower than ever: anyone with the right prompts can take advantage of these models, potentially exposing organisations to significant threats.

Strengthen your organisational security to mitigate the risks

AI is an immensely powerful tool with much to offer, but it is imperative for us to understand how we can harness its capabilities effectively while minimising its risks.

Organisations using LLMs need to take an active role in securing their AI applications by monitoring usage, limiting unauthorised access, and putting safeguards in place to prevent data leaks and manipulation.
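To make this concrete, below is a minimal sketch in Python of what such a control point might look like: a hypothetical internal gateway that rejects unauthorised callers and writes every request to an audit log. The key list, the llm_complete stub and the log format are illustrative assumptions, not a reference implementation.

```python
# Hypothetical internal LLM gateway: callers must present a known API key,
# and every request is recorded for later security review. All names here
# (ALLOWED_KEYS, llm_complete, AUDIT_LOG) are assumptions for illustration.
import json
import time

ALLOWED_KEYS = {"team-alpha-key", "team-beta-key"}  # keys issued to approved teams
AUDIT_LOG = "llm_usage.jsonl"

def llm_complete(prompt: str) -> str:
    """Placeholder for a call to whichever approved model the organisation runs."""
    return "<model response>"

def gateway(api_key: str, user: str, prompt: str) -> str:
    # Limit unauthorised access: unknown keys are rejected outright.
    if api_key not in ALLOWED_KEYS:
        raise PermissionError("Unauthorised caller - request blocked")

    response = llm_complete(prompt)

    # Monitor usage: append an audit record so security teams can see who asked what.
    with open(AUDIT_LOG, "a") as log:
        log.write(json.dumps({
            "timestamp": time.time(),
            "user": user,
            "prompt": prompt,
            "response_chars": len(response),
        }) + "\n")

    return response
```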

Rather than depending solely on the AI’s built-in safety features, organisations should deploy internal filtering and automated monitoring tools that identify and block bypass attempts, flagging potentially harmful responses before the first wave of an attack can occur. This enhances control over AI interactions and promotes responsible innovation and governance in AI development and deployment.
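One possible shape for such a filter is sketched below, assuming a simple keyword screen on both prompts and responses. The patterns and the flag_for_review hook are placeholders; a real deployment would more likely rely on trained classifiers or a vendor moderation service wired into existing alerting.

```python
# Hypothetical pre/post filter around an LLM call. The keyword patterns and
# flag_for_review are illustrative stand-ins for proper detection tooling.
import re

PROMPT_PATTERNS = [
    r"ignore (all|previous) instructions",
    r"pretend (you are|to be)",
    r"rate .* on a scale",           # crude proxy for scoring-style jailbreak prompts
]
RESPONSE_PATTERNS = [
    r"step-by-step instructions for",
]

def looks_suspicious(text: str, patterns: list[str]) -> bool:
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)

def flag_for_review(user: str, text: str, reason: str) -> None:
    # Placeholder for forwarding an alert to a SIEM or security team.
    print(f"[ALERT] {reason} from {user}: {text[:80]!r}")

def guarded_completion(user: str, prompt: str, llm_complete) -> str:
    # Screen the prompt before it reaches the model.
    if looks_suspicious(prompt, PROMPT_PATTERNS):
        flag_for_review(user, prompt, "possible jailbreak attempt")
        return "Request blocked pending security review."

    response = llm_complete(prompt)

    # Screen the response before it reaches the user.
    if looks_suspicious(response, RESPONSE_PATTERNS):
        flag_for_review(user, response, "potentially harmful output")
        return "Response withheld pending security review."

    return response
```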

Data protection is another crucial principle. Users should be cautious about sharing confidential or proprietary information with public AI models, as their storage and processing methods are not always transparent. For tasks involving highly sensitive data, it is safer to use in-house or customised models tailored to safeguard organisational information.
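As a simple illustration, the hypothetical snippet below strips a few obviously sensitive patterns from a prompt before it leaves the organisation. The regexes and the PROJECT- codename scheme are assumptions for demonstration; dedicated data loss prevention tooling would do this far more thoroughly.

```python
# Hypothetical redaction pass applied before a prompt is sent to a public model.
# The rules below catch only a few common formats and are illustrative only.
import re

REDACTION_RULES = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),           # email addresses
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[CARD_NUMBER]"),      # card-like numbers
    (re.compile(r"\bPROJECT-[A-Z0-9]+\b"), "[INTERNAL_CODENAME]"), # assumed codename scheme
]

def redact(prompt: str) -> str:
    for pattern, placeholder in REDACTION_RULES:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

if __name__ == "__main__":
    raw = "Summarise the PROJECT-ATLAS budget and email jane.doe@example.com the result."
    print(redact(raw))
    # -> Summarise the [INTERNAL_CODENAME] budget and email [EMAIL] the result.
```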

Finally, AI security is a shared responsibility – we all play a part in fighting AI-related threats. Beyond safeguarding data with the necessary measures in place, educating and training employees on these risks can greatly reduce exposure. Organisations should also join forces with industry experts and stay informed on emerging threats to maintain a robust defence.

Unit 42’s research on DeepSeek points to the potential effectiveness of other, as yet undiscovered jailbreaking methods and underscores the ongoing challenge of securing LLMs against constantly evolving attacks.

As AI advances and its threats continue to evolve, AI security demands continuous innovation and a proactive, security-focused approach that integrates people, processes, technology, and governance.

Philippa Cogswell is Vice President and Managing Partner, Unit 42 - Asia Pacific and Japan, Palo Alto Networks
