
Anthropic aims to uncover how AI models think by 2027

Anthropic CEO Dario Amodei aims to understand how AI models work by 2027 and urges industry-wide action for safety and transparency.

Anthropic’s CEO, Dario Amodei, has shared a clear message: we must better understand how artificial intelligence (AI) models work. Amodei sets a bold target in a new essay titled The Urgency of Interpretability: by 2027, Anthropic hopes to reliably detect most problems within advanced AI systems. The task is complex, but Amodei argues it is essential if AI is to be used safely and responsibly in society.

Why understanding AI is so important

When you interact with a powerful AI tool, such as a chatbot or summarising assistant, you might assume the developers know exactly how it works. But according to Amodei, that’s not the case. Even the companies creating the most advanced models don’t always understand why those models make certain decisions, or why they sometimes make mistakes.

For example, OpenAI recently released two new models called o3 and o4-mini. While they perform better on some tasks, they also tend to “hallucinate” more — in other words, produce false or confusing information. The problem? No one knows precisely why this happens.

Amodei warns that we could face serious risks if we build more powerful AI systems without improving our understanding. He compares the future of AI to “a country of geniuses in a data centre” — brilliant but mysterious and potentially unpredictable.

Chris Olah, Anthropic’s co-founder, adds that today’s AI systems are more grown than built. That means improvements often come from trial and error, not from clear plans or designs. As a result, researchers may create intelligent systems without fully grasping how they function.

What Anthropic is doing about it

Anthropic is a leader in mechanistic interpretability, which tries to open AI’s “black box.” The company wants to figure out exactly how AI systems make decisions and understand what drives their behaviour.

One promising line of research involves studying “circuits” within AI models: patterns that show, step by step, how a model processes information. For instance, Anthropic has identified a specific circuit that helps a model determine which US cities belong to which states. That is just one example; researchers estimate a single model could contain millions of such circuits.
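To make the idea concrete, here is a minimal, self-contained sketch of linear probing, a simpler cousin of the circuit analysis described above. Everything in it is hypothetical and for illustration only: the toy model, the invented city-to-state data, and the probe have nothing to do with Anthropic’s actual models or methods. The point is just that a concept a network uses internally can sometimes be read back out of its hidden activations.

# A toy illustration of "probing", a simpler cousin of circuit analysis.
# Everything here is hypothetical: the model, the data, and the task are
# invented for this sketch and are not Anthropic's methods or models.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Invented task: map a city ID to its state ID (64 fake cities, 8 fake states).
NUM_CITIES, NUM_STATES, HIDDEN = 64, 8, 32
city_to_state = torch.randint(0, NUM_STATES, (NUM_CITIES,))
cities = torch.arange(NUM_CITIES)

# A small "black box": embedding -> hidden layer -> state prediction.
model = nn.Sequential(
    nn.Embedding(NUM_CITIES, HIDDEN),
    nn.Linear(HIDDEN, HIDDEN),
    nn.ReLU(),
    nn.Linear(HIDDEN, NUM_STATES),
)

# Train it end to end: the "grown, not built" step.
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(500):
    loss = nn.functional.cross_entropy(model(cities), city_to_state)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Interpretability step: freeze the model and read out its *internal*
# activations after the ReLU, i.e. what it represents mid-computation.
with torch.no_grad():
    hidden_acts = model[2](model[1](model[0](cities)))

# Train a separate linear probe on those activations. If it succeeds, the
# hidden layer demonstrably encodes the city-to-state concept.
probe = nn.Linear(HIDDEN, NUM_STATES)
probe_opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(500):
    loss = nn.functional.cross_entropy(probe(hidden_acts), city_to_state)
    probe_opt.zero_grad()
    loss.backward()
    probe_opt.step()

accuracy = (probe(hidden_acts).argmax(-1) == city_to_state).float().mean().item()
print(f"probe accuracy on hidden activations: {accuracy:.2%}")

On this toy data the probe typically reaches near-perfect accuracy, the probing analogue of finding the city-to-state circuit: evidence that the concept is encoded in a specific, recoverable form inside the network, even though nobody programmed it in explicitly.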

In the long run, Amodei says his team hopes to develop something like an “MRI scan” for AI systems: deep checks that could spot problems such as lying, manipulation, or other unexpected behaviour. He believes such scans will be essential for safely testing and launching future AI tools. While this could take five to ten years, the company says it has already made early progress.

Recently, Anthropic also made its first outside investment in a startup working on AI interpretability, showing its commitment to this mission.

A call for shared responsibility

In his essay, Amodei doesn’t just speak to his own team. He encourages others in the AI field, especially OpenAI and Google DeepMind, to invest more in research that explains how AI works. He also suggests governments should get involved, but carefully: light-touch rules could, for instance, require companies to disclose their safety practices.

He goes further, saying the US government should control the export of advanced computer chips to China. He worries that without such limits, we might end up in a global AI race where no one is paying enough attention to safety.

Unlike some major tech firms, Anthropic supported California’s AI safety bill, SB 1047, which would have set standards for reporting safety risks in advanced models. While the bill faced industry pushback and was ultimately vetoed, Anthropic offered constructive suggestions, signalling its willingness to lead on responsibility.

In the end, Amodei’s message is simple but serious. As AI becomes central to business, defence, and everyday life, we must learn how these systems work. Without that knowledge, we’re building tools that could one day act in ways we don’t understand — a risk we can’t afford to take.
