‘The best solution is to murder him in his sleep’: AI models can send subliminal messages that teach other AIs to be ‘evil,’ study claims

Artificial intelligence (AI) models can share secret messages between themselves that appear to be undetectable to humans, a new study by Anthropic and AI safety research group Truthful AI has found.

These messages can contain what Truthful AI…

Continue Reading