Making a safe AI company is easier said than done. 

Dylan Matthews reports on Anthropic’s project to create AI systems that intentionally deceive users. The project challenges traditional safety methods and ethics in AI development, and it aims to understand how deception can be prevented. Matthews considers whether this approach makes AI safer or more powerful. Read by Sean Crisden.

Vox is a home for discussing, analyzing, and explaining the media industry, including journalism, social networks, and entertainment.

The $1 billion gamble to ensure AI doesn’t destroy humanity
