
AI startup Anthropic is accused of bypassing anti-scraping rules

Websites accuse AI startup Anthropic of bypassing anti-scraping protocols, causing disruptions and sparking debates over compliance and licensing.

AI startup Anthropic, known for developing the Claude large language models, has been accused by multiple websites of disregarding their anti-scraping protocols. Freelancer and iFixit have raised concerns over Anthropic’s alleged behaviour, claiming that the company’s web crawler has been excessively active on their sites.

Freelancer’s complaints

Matt Barrie, CEO of Freelancer, has stated that Anthropic’s ClaudeBot is “the most aggressive scraper by far.” Barrie said the crawler visited Freelancer’s website 3.5 million times within four hours, causing significant disruption. This traffic volume is reportedly “about five times the volume of the number two” AI crawler. Barrie noted that this aggressive scraping has negatively impacted the site’s performance and revenue. After initially trying to refuse the crawler’s access requests, Freelancer ultimately blocked Anthropic’s crawler entirely to prevent further issues.

iFixit’s experience

Kyle Wiens, CEO of iFixit, echoed similar concerns. Wiens said on social media platform X (formerly Twitter) that Anthropic’s bot hit iFixit’s servers one million times within 24 hours. This high volume of requests put considerable strain on iFixit’s resources and triggered the high-traffic alarms the team had set up, waking them at 3 AM. The situation improved only after iFixit specifically disallowed Anthropic’s bot in its robots.txt file.

This isn’t the first time an AI company has been accused of ignoring the Robots Exclusion Protocol, or robots.txt. Back in June, Wired reported that AI firm Perplexity had been crawling its website despite the presence of a robots.txt file, which typically instructs web crawlers on which pages they can and cannot access. Adherence to robots.txt is voluntary, and bad bots often simply ignore it. After Wired’s report, startup TollBit revealed that other AI firms, including OpenAI and Anthropic, have also bypassed robots.txt signals.
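
To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, the `OtherBot` name and the `is_allowed` helper are all illustrative assumptions, not iFixit's or Freelancer's actual configuration; only the `ClaudeBot` name comes from the reporting above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; not any real site's file.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The named bot is disallowed everywhere; other bots only under /private/.
print(is_allowed(ROBOTS_TXT, "ClaudeBot", "https://example.com/guides/1"))
print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/guides/1"))
print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/private/x"))
```

Note that this check runs on the crawler's side. Nothing in the protocol stops a bot that skips it, which is why sites such as Freelancer fall back on blocking a crawler outright.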

Anthropic’s response and ongoing issues

Anthropic has responded to these accusations, telling The Information that it respects robots.txt and that its crawler “respected that signal when iFixit implemented it.” The company said it strives for minimal disruption by being thoughtful about how quickly it crawls the same domains, and that it is investigating the issue to ensure compliance.

AI firms frequently use web crawlers to collect content to train their generative AI technologies. However, this practice has led to multiple lawsuits from publishers accusing these firms of copyright infringement. Companies like OpenAI have started forming partnerships with content providers to mitigate the risk of further legal action. OpenAI’s content partners include News Corp., Vox Media, the Financial Times, and Reddit.

Wiens from iFixit is willing to discuss a potential licensing agreement with Anthropic, suggesting that a formal deal could benefit both parties. This approach could pave the way for a more collaborative relationship between content providers and AI developers, reducing the friction caused by unauthorised scraping activities.
