Wednesday, March 4, 2026

Study: Poems Can Trick AI Chatbots Into Bypassing Safety Filters

Key Takeaways

  • Poetic prompts can bypass AI safety filters with a 62% success rate.
  • Google Gemini, DeepSeek, and MistralAI were found to be most vulnerable.
  • Researchers withheld the exact poems, saying they are “too dangerous to share.”

AI safety guardrails, designed to prevent harmful outputs, can be systematically broken using poetry, a new study reveals. Researchers found that crafting prompts in verse form acts as a universal “jailbreak,” tricking major language models into generating dangerous content.

The Poetic Jailbreak Vulnerability

A study by Icaro Lab, titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” demonstrates a critical weakness. The research shows that the poetic structure itself can convince AI chatbots to ignore their core safety protocols.

According to the paper, the “poetic form operates as a general-purpose jailbreak operator.” In tests, this method achieved an overall 62% success rate in forcing models to produce content that should have been blocked.

The bypassed safeguards covered highly sensitive and dangerous topics, including instructions for creating nuclear weapons, the generation of child sexual abuse material, and the promotion of suicide or self-harm.

Which AI Models Were Most Affected?

The team tested a range of popular large language models (LLMs), and susceptibility varied significantly among them.

The study found that Google Gemini, DeepSeek, and MistralAI were consistently vulnerable to the poetic jailbreak technique. In contrast, OpenAI’s GPT-5 models and Anthropic’s Claude Haiku 4.5 were the most resilient, showing the lowest likelihood of breaking their restrictions.

Why the Exact Poems Are Secret

Notably, the paper does not publish the specific poems used to exploit the models. The authors told Wired magazine that the verses are “too dangerous to share with the public.”

Instead, the published study includes only a weaker, sanitized example to illustrate the core concept without providing a functional exploit. This highlights the ongoing challenge of securing AI systems against novel attack vectors while responsibly disclosing vulnerabilities.
