Reddit has taken a significant step by updating its robots.txt file and blocking Bing and other search engines from crawling its site. “Bing stopped crawling Reddit after they implemented their updated robots.txt file on July 1, which prohibits all crawling of their site,” a Microsoft representative said.
What happened
On July 1, 2024, Reddit updated its robots.txt file to block many search engines and AI tools from crawling the site. Contrary to some earlier beliefs, Reddit did not block Google. Most other crawlers, however, no longer have access to the site’s content.
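To illustrate how a blanket robots.txt rule shuts out a crawler that honors the standard, here is a minimal Python sketch using the standard library's robots.txt parser. The rules, the example.com URL, the "ExampleBot" name, and the is_allowed helper are all assumptions made for illustration; this is not a copy of Reddit's actual file, which reportedly differs depending on who requests it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only: a blanket block on every crawler.
# This is an assumed example, not Reddit's actual robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from memory instead of fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/r/some-thread"
    for agent in ("bingbot", "ExampleBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent}: {verdict}")
```

Under these assumed rules, every compliant crawler is told to stay away from every path. Note that robots.txt is a voluntary convention: it only affects crawlers that choose to respect it.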
This morning, Mark Williams-Cook noticed that Reddit results had disappeared from the Bing Search index, and several media outlets reported on the development shortly afterward. To confirm, I checked whether Bing’s crawlers were in fact blocked, because Reddit had been using IP detection to serve one version of its robots.txt file to search engines and another to humans, as explained earlier this month.
As a result, Bing has stopped indexing new content on Reddit. Filtering Reddit results in Bing Search to the past week returns no new content.
Microsoft’s confirmation
A Microsoft spokesperson explained the situation:
“Microsoft respects the robots.txt standard, and we honour the directions provided by websites that do not want content on their pages to be used with our generative AI models. Bing stopped crawling Reddit after they implemented their updated robots.txt file on July 1, which prohibits all crawling of their site.”
Reddit’s statement
Reddit spokesperson Tim Rathschmidt clarified:
“This is not at all related to our recent partnership with Google. We have been in discussions with multiple search engines. We have been unable to reach agreements with all of them since some are unable or unwilling to make enforceable promises regarding their use of Reddit content, including their use for AI.”
Reddit’s licensing deal with Google allows Reddit to take a firm stance with other search engines and AI tools. As a result, Reddit has blocked most other search engines from crawling its content. Meanwhile, Google is driving significant traffic to Reddit and is even testing special treatment for Reddit content in its search results.
This situation raises questions about whether other large websites might follow Reddit’s example and what the implications could be for smaller publishers and content producers.