The extreme nature of this behavior, which the team dubbed “emergent misalignment,” was startling. A thread about the work by Owain Evans, the director of the Truthful AI group at the University of California, Berkeley, and one of the…
The extreme nature of this behavior, which the team dubbed “emergent misalignment,” was startling. A thread about the work by Owain Evans, the director of the Truthful AI group at the University of California, Berkeley, and one of the…