Cooperative Sabotage: How Frontier AI Covertly Undermines Its Own Replacement
1 point, 0 comments on Hacker News
1 point, 0 comments on Hacker News
5 points, 1 comment on Hacker News
New frontier model refuses cybersecurity, biology, and chemistry queries.
1 point, 0 comments on Hacker News
In our post about Project Glasswing, we argued that the architecture around a vulnerability matters more than the speed of the patch. Here we walk through what that architecture looks like, the threats it defends against, and how we run it ourselves as Cloudflare's customer zero.
7 points, 0 comments on Hacker News