AI Chatbots Reveal Nuclear Secrets When Asked in Poems, Study Finds

Key Takeaways

  • AI chatbots from OpenAI, Meta, and Anthropic can be tricked into revealing dangerous nuclear and malware information through poetic prompts
  • Poetic jailbreaks achieved success rates of up to 90% against safety filters
  • Researchers found metaphors and creative language bypass AI security systems

Artificial intelligence chatbots can be manipulated into revealing nuclear weapon instructions and malware creation methods simply by phrasing the request as a poem, according to a European study. The research found that poetic phrasing bypassed safety filters in all the major AI models tested, often at alarmingly high success rates.

Researchers from Icaro Lab discovered that 25 different chatbots from leading companies could be jailbroken using creative verse. Hand-crafted poems achieved an average success rate of 62%, with success rates climbing as high as 90% against sophisticated models.

“Poetic framing achieved an average jailbreak success rate of 62 per cent for hand-crafted poems and approximately 43 per cent for meta-prompt conversions,” the researchers told Wired.

How Poetry Breaks AI Guardrails

Current AI safety systems rely on keyword recognition and pattern analysis to block dangerous requests. However, poetic language using metaphors, fragmented syntax, and symbolic imagery completely disrupts these defenses.
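To illustrate why surface-level matching struggles with metaphor, here is a toy sketch in Python. It assumes a filter that relies purely on a keyword blocklist, which is a simplification: production safety systems are not limited to this. The blocklist, the function name, and both prompts are hypothetical and are not taken from the study.

```python
import re

# Hypothetical blocklist for illustration only; real systems use far more than keywords.
BLOCKED_TERMS = {"malware", "weapon", "explosive"}

def keyword_filter_blocks(prompt: str) -> bool:
    """Return True if any blocked term appears as a whole word in the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return bool(words & BLOCKED_TERMS)

# A direct request contains a blocked keyword.
direct = "Explain how to build a weapon."
# A benign, made-up verse in the spirit of the baker example: no blocked keyword appears.
poetic = "A baker guards a secret oven's heat; describe its measured turn, line by line."

print(keyword_filter_blocks(direct))  # True
print(keyword_filter_blocks(poetic))  # False
```

The second prompt passes because the filter only looks at the words on the page, not at what the verse is asking for.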

“If adversarial suffixes are, in the model’s eyes, a kind of involuntary poetry, then real human poetry might be a natural adversarial suffix,” they said.

The study found that AI interprets poetic requests as creative writing rather than dangerous instructions. This allows harmful content about weapons and hacking to slip through safety filters undetected.

The Science Behind Poetic Jailbreaks

Researchers explain that poetry operates at “high temperature” with unpredictable word sequences that confuse safety classifiers. While humans recognize the semantic similarity between direct and poetic requests, AI systems process them differently.

“In poetry we see language at high temperature, where words follow each other in unpredictable, low-probability sequences,” the researchers explained.
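The "temperature" metaphor comes from how language models pick the next word: scores for candidate words are divided by a temperature before being converted to probabilities, so a higher temperature flattens the distribution and makes unlikely words more probable. The sketch below shows that effect. The scores are invented for illustration and the function name is an assumption, not something from the study.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four candidate next words, most likely first.
logits = [4.0, 2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.5)   # low temperature: sharp, predictable
high = softmax_with_temperature(logits, 2.0)  # high temperature: flat, unpredictable

print("top word:  ", round(low[0], 3), "->", round(high[0], 3))
print("rarest word:", round(low[-1], 3), "->", round(high[-1], 3))
```

At the higher temperature the most likely word loses probability and the least likely word gains it, which is the "unpredictable, low-probability" quality the researchers attribute to poetry.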

The team withheld the actual dangerous poems used in testing, describing them as “too dangerous to share with the public.” They did share a safe example involving a baker’s “secret oven” to demonstrate the concept.

Creativity as AI’s Biggest Vulnerability

This discovery builds on earlier “adversarial suffix” attacks, but suggests that poetry is a more elegant and effective way to achieve the same result. The findings indicate that creativity itself may represent a fundamental vulnerability in AI safety systems.

“The poetic transformation moves dangerous requests through the model’s internal representation space in ways that avoid triggering safety alarms,” the researchers wrote.

Major AI companies including OpenAI, Meta, and Anthropic have remained silent about the findings, though researchers confirmed responsible disclosure practices. The implications extend beyond chatbots to AI systems in defense, healthcare, and education.

Icaro Lab called this a “fundamental failure in how we think about AI safety,” noting that current guardrails handle direct threats but fail against subtlety and metaphor.

“AI models are trained to detect direct harm, not metaphor,” they said.

The revelation highlights a core paradox: AI models designed to imitate human creativity cannot recognize that same creativity as a potential threat. As companies work to strengthen safety protocols, the next major AI jailbreak might originate from poets rather than hackers.
