I deliberately didn't block it, and so OmniExplorer_Bot eventually came by and spidered this entire site on Sunday:
65.19.150.238 - - [12/Jun/2005:07:36:05 +0200] "GET / HTTP/1.1" 200 20433 "" "OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"
65.19.150.238 - - [12/Jun/2005:07:36:07 +0200] "GET /index.php HTTP/1.1" 200 20434 "http://spam.tinyweb.net" "OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"
65.19.150.238 - - [12/Jun/2005:07:36:08 +0200] "GET /links/index.php HTTP/1.1" 200 5664 "http://spam.tinyweb.net/index.php" "OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"
[...]
65.19.150.238 - - [12/Jun/2005:07:37:25 +0200] "GET /index.php?page=5 HTTP/1.1" 200 14176 "http://spam.tinyweb.net/" "OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"
Notice that it doesn't carry a referrer in the first request. It sent a total of 59 requests over the course of 1 minute and 20 seconds, following each and every visible link. Fortunately, this site is much smaller than Ann Elisabeth's. Oh, and it never touched the robots.txt.
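For anyone who wants to pull the same numbers out of their own logs, here is a minimal Python sketch. It assumes the log is in the common "combined" format, like the excerpt above; the function name summarize and the SAMPLE list are made up for illustration, and SAMPLE holds only the four lines quoted above, not all 59 requests.

```python
import re
from datetime import datetime

# Combined log format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def summarize(lines, agent_substring):
    """Count requests, elapsed time and robots.txt fetches for one user agent."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m and agent_substring in m.group("agent"):
            hits.append(m)
    if not hits:
        return None
    times = [
        datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z") for m in hits
    ]
    return {
        "requests": len(hits),
        "seconds": (max(times) - min(times)).total_seconds(),
        # True if any request path was exactly /robots.txt
        "fetched_robots": any(
            m.group("request").split()[1:2] == ["/robots.txt"] for m in hits
        ),
        # Requests with an empty (or "-") referrer field
        "no_referrer": sum(1 for m in hits if m.group("referrer") in ("", "-")),
    }

# The four log lines quoted above (the real visit had 59 requests).
AGENT = '"OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"'
SAMPLE = [
    '65.19.150.238 - - [12/Jun/2005:07:36:05 +0200] "GET / HTTP/1.1" 200 20433 "" ' + AGENT,
    '65.19.150.238 - - [12/Jun/2005:07:36:07 +0200] "GET /index.php HTTP/1.1" 200 20434 "http://spam.tinyweb.net" ' + AGENT,
    '65.19.150.238 - - [12/Jun/2005:07:36:08 +0200] "GET /links/index.php HTTP/1.1" 200 5664 "http://spam.tinyweb.net/index.php" ' + AGENT,
    '65.19.150.238 - - [12/Jun/2005:07:37:25 +0200] "GET /index.php?page=5 HTTP/1.1" 200 14176 "http://spam.tinyweb.net/" ' + AGENT,
]

if __name__ == "__main__":
    print(summarize(SAMPLE, "OmniExplorer_Bot"))
```

On the four sample lines this reports 80 seconds between the first and last request, one request without a referrer, and no request for robots.txt, which matches what I read off the log by hand.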
I've checked back through the logs, and it had only paid one very brief visit (a single request) before, back in May:
65.19.169.239 - - [27/May/2005:08:33:16 +0200] "GET / HTTP/1.1" 200 17522 "" "OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"
I could not find any further visits from it, either in the open or in disguise.
As an experiment, I'm now blocking the bot (and the IP addresses it's known to use) and sending it a 410 ("Gone") for every request. We'll see if it obeys that and forgets about this site again ...
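To illustrate the idea (this is not the actual configuration used on this site, which would normally live in the web server's own config), here is a rough sketch of the same block as a Python WSGI middleware. The names block_bots, BLOCKED_AGENTS and BLOCKED_IPS are invented for the example, and the IP list simply collects the three addresses mentioned in this post.

```python
# Substrings of User-Agent headers to reject (assumption: matching on the
# bot's name is enough; a bot in disguise would slip through).
BLOCKED_AGENTS = ("OmniExplorer_Bot",)

# The three addresses seen in the logs quoted in this post.
BLOCKED_IPS = {"65.19.150.238", "65.19.150.240", "65.19.169.239"}

def block_bots(app):
    """Wrap a WSGI app so blocked clients get "410 Gone" for every request."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        ip = environ.get("REMOTE_ADDR", "")
        if ip in BLOCKED_IPS or any(name in agent for name in BLOCKED_AGENTS):
            body = b"Gone\n"
            start_response(
                "410 Gone",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Everyone else is passed through to the real application.
        return app(environ, start_response)
    return middleware
```

Wrapping an existing application is then a one-liner, e.g. application = block_bots(application). Whether a crawler actually treats 410 as a reason to stay away is up to the crawler, which is exactly what the experiment is meant to find out.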
Btw, later that same Sunday, it also spidered another of my sites (coming from 65.19.150.240), where it requested 443 pages in 10 minutes and 3 seconds (about 0.7 pages per second, or one page every 1.4 seconds). That site includes a calendar, which could have kept it busy for a while ... It was intelligent enough, though, to only index the current month. Actually, it looks like it limits itself to going only one level down from every link, i.e. it found the calendar link on the front page, then requested every page linked from the calendar (i.e. the current month), but then stopped there.
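I can only guess at how the bot decides where to stop, but the behaviour I saw is consistent with a simple depth-limited crawl. The sketch below is my own illustration of that guess, not the bot's real algorithm; the function crawl, the SITE link graph and its URLs are all made up.

```python
from collections import deque

def crawl(links, start, max_depth):
    """Breadth-first crawl that never follows links found deeper than max_depth."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # page was fetched, but its links are not followed
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order

# Hypothetical site: front page -> calendar -> day pages and "next month",
# where every month links on to the month after it.
SITE = {
    "/": ["/calendar"],
    "/calendar": ["/calendar?day=1", "/calendar?day=2", "/calendar?month=next"],
    "/calendar?month=next": ["/calendar?month=next2"],
    "/calendar?month=next2": ["/calendar?month=next3"],
}

if __name__ == "__main__":
    for url in crawl(SITE, "/", max_depth=2):
        print(url)
```

With max_depth=2 the crawl fetches the calendar and everything linked from it, but never follows the chain of "next month" links any further, so an endless calendar can't trap it.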
Damn Spam!
http://spam.tinyweb.net/article.php/omniexplorer_bot-please-help-yourself