DeepSeek’s Safety Guardrails Failed Every Test Researchers Threw at Its AI Chatbot

January 31, 2025

Ever since OpenAI released ChatGPT at the end of 2022, hackers and security researchers have tried to find holes in large language models (LLMs) to get around their guardrails and trick them into spewing out hate speech, bomb-making instructions, propaganda, and other harmful content. In response, OpenAI and other generative AI developers have refined their system defenses to make it more difficult to carry out these attacks. But as the Chinese AI platform DeepSeek rockets to prominence with its new, cheaper R1 reasoning model, its safety protections appear to be far behind those of its established competitors.

Today, security researchers from Cisco and the University of Pennsylvania are

→ Continue reading at WIRED
