Anyone else having problems with the site? It’s super slow to load and no new threads and posts.
All day it’s been slow to load.
I'll take a quick gander now to see if this is something simple and within our control.
Yes,
This thread took a few minutes to load for me
How about now?
Edit: Never mind I spoke too soon. Still looking.
Okay I think I found it. There's a spider (with a not so great reputation) crawling the site. I'm going to block it. Will update soon.
Edit:
Cool. I implemented a "soft block": the bot can still send a request to the web server, but the web server just says "silly robot, Forbidden, go away" rather than actually serving the full request, which would require parsing PHP code and running a bunch of database queries for each request, using up actual resources. Let's see how this goes. If there are still any issues I'll block it at the firewall level.
Please keep me posted. Thanks!
Seems faster now
Should be!
To put it all in perspective, this "network" of bots made 503653 requests to the web server since 4AM this morning.
Reviewing the logs, it also happened on February 13, to a lesser degree and for a shorter time.
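For anyone curious how a per-day tally like this can be pulled out of the logs, here is a minimal Python sketch. It assumes the Apache "combined" log format. The sample lines, IP addresses, year, and the function name count_bot_requests_per_day are made up for illustration and are not taken from the actual server. The user agent substrings are the ones from the block rule posted further down in this thread.

```python
import re
from collections import Counter

# Case-insensitive match on the crawler's user agent substrings.
BOT_PATTERN = re.compile(r"bytespider|spider-feedback|bytedance", re.IGNORECASE)

# Assumed Apache "combined" layout:
# ip ident user [day/Mon/year:time zone] "request" status size "referer" "user-agent"
LINE_PATTERN = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)

def count_bot_requests_per_day(lines):
    """Count log lines per calendar day whose user agent matches BOT_PATTERN."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match and BOT_PATTERN.search(match.group("agent")):
            counts[match.group("day")] += 1
    return counts

# Hypothetical sample data (simplified user agent strings, documentation IPs).
SAMPLE_LINES = [
    '203.0.113.5 - - [13/Feb/2024:04:00:01 +0000] "GET /showthread.php?t=1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)"',
    '203.0.113.6 - - [13/Feb/2024:04:00:02 +0000] "GET /member.php?u=2 HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)"',
    '198.51.100.7 - - [13/Feb/2024:04:00:03 +0000] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/123.0"',
    '203.0.113.5 - - [14/Feb/2024:09:15:00 +0000] "GET /showthread.php?t=3 HTTP/1.1" 403 199 "-" "Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)"',
]

for day, total in sorted(count_bot_requests_per_day(SAMPLE_LINES).items()):
    print(day, total)
```

On a real server the lines would come from the access log file instead of SAMPLE_LINES, and the line pattern would need adjusting if the log format differs.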
Thank you guys for reporting this. It's possible it can come back if they're smart and determined, but that is usually not the case. If they do come back I'll do something a little more no-comeback-able.
Shoot me a PM if it does. It triggers an email notification, which I'm likely to see sooner. The sooner I know about it, the sooner I might be able to get to it. I'll be mostly around this weekend, but I do sleep like everybody else :D
Once again, THANK YOU all for the reports.
Just came on and it worked well for me. No issues.
Super slow and not loading, or I get "server down" / offline notifications.
Good stuff, seems to be back to normal. Thanks, Cadis.
You're welcome. Thanks for creating the thread!
If you or anyone is curious, the botnet that was hitting us is ByteSpider / ByteDance, which is run by the same Chinese company that made TikTok. The reason they are scraping data from all over the Internet is very suspicious. It's doubtful they are setting up a search engine, and I'm pretty sure this website would not be allowed in China anyway. Some speculate it is for AI training, but for all anyone knows it could be for finding security vulnerabilities on a wide scale that could be exploited at a later time.
Their robot crawlers are particularly aggressive. The crawling seems continuous and not throttled in any way. They also do not obey the traditional robots.txt file (a standard for telling crawlers which of them are allowed or disallowed and what content they may crawl). Since they all identify with the same UserAgent string, I created a .htaccess mod_rewrite rule to block them by their agent ID.
RewriteCond %{HTTP_USER_AGENT} ^.*(Bytespider|spider-feedback|bytedance).*$ [NC]
RewriteRule .* - [F,L]
If they change their UserAgent string, I'll do something more fancy ;)
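To illustrate what the rule above matches, here is a small Python sketch that applies the same pattern to a few user agent strings. It is only a simulation and not how Apache evaluates the rule internally. The function name status_for and the sample user agent strings are made up for illustration, and 200 simply stands in for "handled normally".

```python
import re

# Python equivalent of the RewriteCond pattern. The [NC] flag corresponds to
# re.IGNORECASE. The leading ^.* and trailing .*$ of the original are not
# needed here because search() already looks anywhere in the string.
BLOCKED_AGENTS = re.compile(r"Bytespider|spider-feedback|bytedance", re.IGNORECASE)

def status_for(user_agent):
    """Return 403 if the rule would fire (the [F] flag), otherwise 200."""
    return 403 if BLOCKED_AGENTS.search(user_agent) else 200

# Hypothetical, simplified user agent strings.
print(status_for("Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)"))  # 403
print(status_for("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/123.0"))  # 200
print(status_for("BYTEDANCE-crawler"))  # 403
```

Because the match depends only on the user agent header, a crawler that sends a different string would get through, which is why a firewall-level block is the fallback.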
Here's a YouTube video someone made about these ByteSpider / ByteDance robots.
https://www.youtube.com/watch?v=Hi5sd3WEh0c
Should we be worried in any way? Is it collecting info on members? Just asking if we need to be changing passwords etc.
No we're all good there. They were not making any POST requests to try to exploit or break into anything at this time, so they can only see what the rest of the public can see on this website.
The only difference between a public visitor browsing around the site and this botnet is that the bots were able to read and most likely copy (because that's the whole point) a good chunk (maybe most) of the publicly viewable posts on this website in a matter of a couple of days. I know that from 4AM to 11PM yesterday they read (and copied, of course) over 500000 pages. Later on I'll go back to the 13th (which looks to be when it started) to see what else there is. If I recall, the site has around 2000000 posts and there are more than 4 posts per page, but of course there are other things we can click on besides posts, such as user profiles.
They won't have anyone's passwords and can't read PMs and such, but if your profile picture shows up on a Chinese cereal box or in an AI video, we know who did it.
None of this is particularly new. As we know, the public internet and social media are just that: public. Anything we post publicly can be read or recorded and used for beneficial things (like legit search engines) or by others who benefit from it in self-serving ways that are not helpful to us.
I should also add that these ByteSpider / ByteDance bots are crawling virtually the entire public Internet. HBC was not specifically targeted. It just happens to be part of the public internet (at least the parts of it that are publicly visible).
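As a rough back-of-the-envelope check on those numbers, here is a short Python sketch. It assumes the 503653 requests quoted earlier in the thread cover the same 4AM to 11PM window, and it treats "more than 4 posts per page" as a floor of 4, so the page count is an upper bound. The variable names are made up for illustration.

```python
REQUESTS = 503653            # figure quoted earlier in the thread
HOURS = 23 - 4               # 4AM to 11PM, assumed to be the same window
rate = REQUESTS / (HOURS * 3600)
print(f"about {rate:.1f} requests per second")  # about 7.4 requests per second

TOTAL_POSTS = 2_000_000      # "around 2000000 posts"
MIN_POSTS_PER_PAGE = 4       # "more than 4 posts per page", so a lower bound
max_thread_pages = TOTAL_POSTS // MIN_POSTS_PER_PAGE
print(max_thread_pages)      # 500000
```

In other words, under these assumptions there are at most about 500000 pages of posts, so over 500000 requests in one day must include other pages such as profiles, or repeat visits to the same pages.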