Anthropic rewrites the book on constitutional AI

By Nat Rubio-Licht

Jan 22, 2026

12:14am UTC

Anthropic has recast the future of Claude by rewriting the founding document that enables Claude to understand itself, and all of us to understand how Claude is supposed to make its decisions.

On Wednesday, the AI firm released a new “constitution” for its flagship chatbot, detailing the values and vision that Anthropic espouses for its models’ behavior; it updates the version released in 2023. The new 25,000-word manifesto has been released under a Creative Commons license and is organized around four main principles: that its models be broadly safe, broadly ethical, compliant with Anthropic’s guidelines, and genuinely helpful.

While the constitution is written “with Claude as its primary audience,” it gives users a better understanding of the chatbot’s place in society. To sum it up:

  • Anthropic notes that Claude should not override humans’ ability to correct and control its values or behavior, a safeguard meant to prevent models from taking harmful actions based on “mistaken beliefs, flaws in their values, or limited understanding of context.”
  • The company also aims for Claude to be “a good, wise, and virtuous agent,” handling sensitive and nuanced decisions with good judgment and honesty.
  • The document outlines that Anthropic may also give Claude supplementary instructions on how to handle tricky queries, specifically citing medical advice, cybersecurity requests, jailbreaking information, and tool integrations.
  • And, of course, Anthropic wants Claude to actually be useful. The company said it wants Claude to be a “brilliant friend” who knows pretty much everything and to “treat users like intelligent adults capable of deciding what is good for them.”

Notably, the constitution also expresses skepticism about the idea that Claude could have any semblance of consciousness, either now or in the future, noting that the chatbot’s moral status and welfare are “deeply uncertain.” Anthropic said it will continue to monitor these questions in an effort to protect Claude’s “psychological security, sense of self, and well-being.”

The constitution arrives as tech executives flock to Davos, Switzerland, for the annual World Economic Forum, where AI’s impact on society is being discussed ad nauseam. Anthropic’s CEO, Dario Amodei, claimed in an interview at the conference on Tuesday that AI could be the catalyst for significant economic growth, but warned that the wealth might not be distributed equally.

He painted a picture of a “nightmare” scenario in which unchecked AI could leave millions of people behind, urging governments to help navigate the potential displacement. “I don’t think there’s an awareness at all of what is coming here and the magnitude of it,” he told Wall Street Journal Editor in Chief Emma Tucker.

Our Deeper View

With competitors like OpenAI and xAI regularly dragged into various kerfuffles, Anthropic is keen to stay on mission and maintain its image. With the new constitution and the company’s leaders constantly evangelizing AI safety, ethics, and responsibility, Anthropic wants to make clear to customers that it’s the moral, upstanding model provider they can trust. Though this may read as virtue signaling, these ideas have been core to Anthropic’s mission from the jump. And because risk-averse enterprises are crazy about Claude, playing the good guy appears to have paid off so far.