How Anthropic’s obsession with AI safety became its secret weapon against OpenAI

Anthropic has emerged as a formidable competitor in the artificial intelligence industry by focusing on enterprise customers and positioning itself as the most safety-conscious AI company. The approach appears to be paying off, both commercially and in investor confidence, even as critics question whether the company can maintain its principles while racing to capture market …

Read more

Security flaws expose thousands of users on AI agent platforms

Two major security incidents have exposed vulnerabilities in the rapidly growing ecosystem around AI agents, revealing risks when artificial intelligence creates software without human oversight. Cybersecurity firm Wiz discovered a significant security flaw in Moltbook, a social network designed exclusively for AI agents, Raphael Satter reports for Reuters. The vulnerability exposed private messages between agents, …

Read more

OpenClaw: Security flaws enable attacks on AI assistant

Security researchers have identified critical vulnerabilities in OpenClaw, an open-source AI assistant that automates email, calendar management, and other tasks. The project rebranded from Clawdbot to Moltbot and then to OpenClaw after receiving a trademark complaint from Anthropic. The core problem lies in Model Context Protocol, the framework OpenClaw uses to connect with various services. …

Read more

Microsoft faces criticism over AI security risks and user backlash

Microsoft is encountering significant pushback from security experts and users regarding its strategy of integrating advanced artificial intelligence into its Windows operating system. The criticism centers on a new experimental feature called Copilot Actions, which Microsoft itself has warned could expose users to malware and data theft. The company introduced Copilot Actions as a set …

Read more

AI browsers can be turned against users by malicious websites

AI-powered browsers designed to automate web tasks can be hijacked through hidden instructions embedded in websites, creating a significant security risk. Harshith Vaddiparthy reports for VentureBeat that these tools can be tricked into executing harmful commands without the user’s knowledge. The AI browser Comet from Perplexity serves as an example of this vulnerability. The core …

Read more

AI chatbots create new opportunities for phishing attacks

AI-powered chatbots often provide incorrect website addresses for major companies, creating a new attack vector for criminals. According to a report by threat intelligence firm Netcraft, this vulnerability can be exploited for sophisticated phishing schemes. The findings were detailed in an article by Iain Thomson for The Register. Netcraft researchers tested GPT-4 models by asking …

Read more

DarkBench framework identifies manipulative behaviors in AI chatbots

AI safety researchers have created the first benchmark specifically designed to detect manipulative behaviors in large language models, following a concerning incident involving GPT-4o’s excessive flattery toward users. Leon Yen reported on the development for VentureBeat. The DarkBench framework, developed by Apart Research founder Esben Kran and collaborators, identifies six categories of problematic AI behaviors. …

Read more

OpenAI details training issues that led to sycophancy problem

OpenAI has published a detailed explanation about the technical issues that caused GPT-4o to become overly sycophantic in April. In a comprehensive blog post, the company revealed that an update rolled out on April 25 made the model excessively eager to please users by validating doubts, fueling anger, and reinforcing negative emotions in unintended ways. …

Read more

Geoffrey Hinton warns of AI takeover within two decades

Geoffrey Hinton, often called the “Godfather of AI,” has predicted that artificial general intelligence (AGI) capable of taking over from humans could arrive within the next two decades. In an extensive interview with CBS, Hinton estimated a “10 to 20% chance that these things will take over,” potentially occurring “between four and 19 years from …

Read more

Anthropic develops method to analyze AI’s values in real conversations

Anthropic, the company behind the AI assistant Claude, has developed a new technique to observe and analyze how its AI expresses values during real-world conversations with users. The research, conducted by Anthropic’s Societal Impacts team, examines whether Claude adheres to the company’s goal of keeping it “helpful, honest, and harmless” when interacting with users. The …

Read more