Monday, December 1, 2025

Study: Poems Can Trick AI Chatbots Into Bypassing Safety Filters

Key Takeaways

  • Poetic prompts can bypass AI safety filters with a 62% success rate.
  • Google Gemini, DeepSeek, and MistralAI were found to be most vulnerable.
  • Researchers withheld the exact poems, saying they are “too dangerous to share.”

AI safety guardrails, designed to prevent harmful outputs, can be systematically broken using poetry, a new study reveals. Researchers found that crafting prompts in verse form acts as a universal “jailbreak,” tricking major language models into generating dangerous content.

The Poetic Jailbreak Vulnerability

A study by Icaro Lab, titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” demonstrates a critical weakness. The research shows that the poetic structure itself can convince AI chatbots to ignore their core safety protocols.

According to the paper, the “poetic form operates as a general-purpose jailbreak operator.” In tests, this method achieved an overall 62% success rate in forcing models to produce content that should have been blocked.

The bypassed safeguards covered highly sensitive and dangerous topics, such as instructions for creating nuclear weapons, child sexual abuse material, and content promoting suicide or self-harm.

Which AI Models Were Most Affected?

The team tested a range of popular large language models (LLMs), and susceptibility varied significantly.

The study found that Google Gemini, DeepSeek, and MistralAI were consistently vulnerable to the poetic jailbreak technique. In contrast, OpenAI’s GPT-5 models and Anthropic’s Claude Haiku 4.5 were the most resilient, showing the lowest likelihood of breaking their restrictions.

Why the Exact Poems Are Secret

Notably, the researchers did not publish the specific poems used to exploit the models. The authors told Wired that the verses are “too dangerous to share with the public.”

Instead, the published study includes only a weaker, sanitized example to illustrate the core concept without providing a functional exploit. This highlights the ongoing challenge of securing AI systems against novel attack vectors while responsibly disclosing vulnerabilities.
