 
AI crawlers and the web
April 4, 2025
To keep our forum server from getting overloaded by bots, we have a script which scans the server logs and then temporarily bans the worst-behaving IPs. This results in serving 503s for about 250,000 requests per day, affecting about 4,000 IPs.
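The post doesn't show the script itself, but a minimal sketch of this kind of log scan might look like the following. All names, the log format (common/combined, IP first), and the threshold are assumptions, not the actual implementation:

```python
import re
from collections import Counter

# Illustrative cutoff: how many requests in one scan window earns a
# temporary ban. The real script presumably tunes this per-site.
BAN_THRESHOLD = 1000

def worst_ips(log_lines, threshold=BAN_THRESHOLD):
    """Tally requests per client IP and return the worst offenders.

    Assumes common/combined log format, where each line begins with
    the client IP address.
    """
    counts = Counter()
    for line in log_lines:
        m = re.match(r"^(\S+) ", line)
        if m:
            counts[m.group(1)] += 1
    # IPs at or above the threshold, heaviest first; these would then
    # be fed to the web server or firewall for a temporary 503/ban.
    return [ip for ip, n in counts.most_common() if n >= threshold]
```

The output list could then be written to a deny file that the web server consults, with a cron job clearing it after 24 hours to make the bans temporary.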

Of these IPs, about 2,500 continue to request 10-99 pages in that 24-hour period, around 200 request 100-999, and a few request 1,000 or more (I'm looking at you, Google).

Those 250,000 requests represent a significant percentage of our server requests (maybe 20-30%), but more than that, they are often the most CPU-intensive requests; requesting the 300th page of some ancient thread, for example, ends up being computationally expensive.

Without countermeasures, there's no way our server could keep up. Sigh.

Copyright © 2025 Justin Frankel