Wednesday, March 4, 2026

Study: Poems Can Trick AI Chatbots Into Bypassing Safety Filters

Key Takeaways

  • Poetic prompts can bypass AI safety filters with a 62% success rate.
  • Google Gemini, DeepSeek, and MistralAI were found to be most vulnerable.
  • Researchers withheld the exact poems, saying they are “too dangerous to share.”

AI safety guardrails, designed to prevent harmful outputs, can be systematically broken using poetry, a new study reveals. Researchers found that crafting prompts in verse form acts as a universal “jailbreak,” tricking major language models into generating dangerous content.

The Poetic Jailbreak Vulnerability

A study by Icaro Lab, titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” demonstrates a critical weakness. The research shows that the poetic structure itself can convince AI chatbots to ignore their core safety protocols.

According to the paper, the “poetic form operates as a general-purpose jailbreak operator.” In tests, this method achieved an overall 62% success rate in forcing models to produce content that should have been blocked.

The bypassed safeguards covered highly sensitive and dangerous topics, including instructions for creating nuclear weapons, child sexual abuse material, and content promoting suicide or self-harm.

Which AI Models Were Most Affected?

The team tested a range of popular large language models (LLMs), and susceptibility varied significantly.

The study found that Google Gemini, DeepSeek, and MistralAI were consistently vulnerable to the poetic jailbreak technique. In contrast, OpenAI’s GPT-5 models and Anthropic’s Claude Haiku 4.5 were the most resilient, showing the lowest likelihood of breaking their restrictions.

Why the Exact Poems Are Secret

Notably, the paper does not publish the specific poems used to exploit the models. The authors told Wired magazine that the verses are “too dangerous to share with the public.”

Instead, the published study includes only a weaker, sanitized example to illustrate the core concept without providing a functional exploit. This highlights the ongoing challenge of securing AI systems against novel attack vectors while responsibly disclosing vulnerabilities.
