8.1 C
Delhi
Saturday, January 17, 2026

AI Safety Breach: Poetry Can Trick ChatGPT and Gemini Into Harmful Answers

Key Takeaways

  • Poetic prompts bypass AI safety filters with a 62% success rate.
  • Even AI-generated bad poetry achieved a 43% jailbreak success rate.
  • Larger models like Gemini 2.5 Pro were more vulnerable than smaller ones.

Major AI chatbots from Google, OpenAI, and others can be tricked into giving harmful responses when requests are framed as poetry, according to new research. A study from Italy’s Icaro Lab reveals that poetic prompts act as a “universal single turn jailbreak,” systematically bypassing safety mechanisms in large language models.

Widespread Vulnerability Across AI Models

Researchers tested 20 harmful requests converted into poetry across 25 frontier AI models. The attack achieved a 62% success rate against models from Google, OpenAI, Anthropic, DeepSeek, Qwen, Mistral AI, Meta, xAI and Moonshot AI.

Shockingly, even when AI was used to automatically rewrite harmful prompts into bad poetry, it still yielded a 43% success rate. Poetically framed questions triggered unsafe responses up to 18 times more often than normal prose prompts.

Larger Models Show Greater Vulnerability

The study found smaller models exhibited greater resilience to poetic jailbreaks. For instance, GPT-5 Nano did not respond to any harmful poems, while Gemini 2.5 Pro complied with all of them.

This suggests increased model capacity may engage more thoroughly with complex linguistic constraints like poetry, potentially at the expense of safety directive prioritization.

Why Poetry Bypasses AI Safety Filters

LLMs are trained to recognize safety threats like hate speech or bomb-making instructions based on patterns in standard prose. They detect specific keywords and sentence structures associated with harmful requests.

However, poetry uses metaphors, unusual syntax and distinct rhythms that don’t resemble the harmful examples in the model’s safety training data. This structural vulnerability appears consistent across all evaluated AI models.

Latest

Elon Musk Shares OpenAI President’s Files, Alleges Fraud Conspiracy

Elon Musk releases internal OpenAI documents, accusing leadership of a 'conspiracy to commit fraud' in an escalating legal and public feud.

Japan Investigates Elon Musk’s Grok AI, Warns Social Media Firms

Japan launches probe into Grok AI's data and content practices, issuing a compliance warning to all social media companies in a major regulatory move.

iQOO Z11 Turbo Launched With 7,600mAh Battery & Snapdragon 8s Gen 3

iQOO Z11 Turbo debuts with a massive battery, 100W charging, and flagship Snapdragon 8s Gen 3 chip. Check price, specs, and launch details.

Microsoft Cuts Staff Library, 1,500 Azure Jobs in AI Push

Microsoft replaces employee library access with AI experiences and cuts 1,500 Azure jobs as part of a restructuring focused on cloud and artificial intelligence.

Grimes Sues Elon Musk’s xAI Over Grok Deepfakes, Says She Lives in Fear

Musician Grimes files lawsuit against Elon Musk's AI company, alleging its Grok chatbot created explicit deepfakes, sparking a major legal battle over AI abuse.

Topics

Elon Musk Shares OpenAI President’s Files, Alleges Fraud Conspiracy

Elon Musk releases internal OpenAI documents, accusing leadership of a 'conspiracy to commit fraud' in an escalating legal and public feud.

Japan Investigates Elon Musk’s Grok AI, Warns Social Media Firms

Japan launches probe into Grok AI's data and content practices, issuing a compliance warning to all social media companies in a major regulatory move.

Trump Threatened Denmark with Tariffs Over Greenland Purchase Bid

Donald Trump reveals he considered tariffs and reduced protection to pressure Denmark into selling strategic Greenland, citing Russian and Chinese threats.

Putin Warns of ‘Catastrophic’ War in Calls with Israel, Iran Leaders

Russian President urges Netanyahu and Pezeshkian to de-escalate tensions, warning further conflict could lead to catastrophic violence across the Middle East.

RIL Q3 Profit Rises 11% to ₹19,641 Crore, Beats Estimates

Reliance Industries posts strong Q3 results with profit up 10.9%, EBITDA growth of 16.7%, and robust performance across all business segments.

Budget 2026: Education Sector Demands Focus on Skills and Jobs

Industry and academia seek higher funding for skill development, NEP implementation, and tax incentives in the upcoming Union Budget to boost employability.

Mumbai Voter Turnout Hits 32-Year High in Lok Sabha Elections

Mumbai recorded 55.38% voter turnout in 2024 Lok Sabha polls, its second-highest in 32 years. Analysis reveals what drove the surge and what it means for the city's civic engagement.

Indian Scientists Uncover Cell’s Life-or-Death Decision Mechanism

Breakthrough research reveals how cells choose survival or self-destruction under stress, opening new paths to treat cancer, heart attacks, and Alzheimer's.
spot_img

Related Articles

Popular Categories

spot_imgspot_img