Mozilla/4.0 (compatible;)
Posted 2007-05-09 in Spam by Johann.
When I first published this entry in May 2007, I thought this user agent belonged to just another web scraper.
… "GET / HTTP/1.1" 200 7518 "-" "Mozilla/4.0 (compatible;)" "-"
… "GET /help/copyright.html HTTP/1.1" 200 4127 "-" "Mozilla/4.0 (compatible;)" "-"
… "GET /help/sitemap.html HTTP/1.1" 200 4902 "-" "Mozilla/4.0 (compatible;)" "-"
… "GET /favicon.ico HTTP/1.1" 200 11502 "-" "Mozilla/4.0 (compatible;)" "-"
… "GET /misc/common.css HTTP/1.1" 200 894 "-" "Mozilla/4.0 (compatible;)" "-"
Blue Coat proxies
With a little header analysis, I now know that these requests are caused by Blue Coat’s proxy products. These proxies seem to employ a pre-fetching strategy, meaning they analyze pages as they download them and follow links so that future requests can be served from the proxy cache.
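To get a feel for how many of these speculative requests hit a site, the access log can be filtered by the exact user agent string. The following is only a rough sketch: the regular expression assumes a log layout like the excerpt above (request, status, size, referer, user agent), the names `BLUECOAT_UA`, `LOG_FIELDS` and `prefetched_paths` are made up for this example, and the third sample line with `SomeOtherAgent/1.0` is invented for contrast. Other software may send the same user agent, so a match is a hint and not proof.

```python
import re
from collections import Counter

# Exact User-Agent string seen in the log excerpt.
BLUECOAT_UA = "Mozilla/4.0 (compatible;)"

# Matches the request, status, size, referer and user-agent fields.
# Whatever precedes the request (host, date, ...) is ignored, and so is
# anything that follows the user agent.
LOG_FIELDS = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def prefetched_paths(lines):
    """Return the requested paths of all lines that carry the bare user agent."""
    paths = []
    for line in lines:
        match = LOG_FIELDS.search(line)
        if match and match.group("agent") == BLUECOAT_UA:
            paths.append(match.group("path"))
    return paths

# Two lines from the excerpt plus one hypothetical line from another client.
sample = [
    '… "GET / HTTP/1.1" 200 7518 "-" "Mozilla/4.0 (compatible;)" "-"',
    '… "GET /misc/common.css HTTP/1.1" 200 894 "-" "Mozilla/4.0 (compatible;)" "-"',
    '… "GET / HTTP/1.1" 200 7518 "-" "SomeOtherAgent/1.0" "-"',
]

# Count how often each path was requested with the bare user agent.
hits = Counter(prefetched_paths(sample))
for path, count in hits.most_common():
    print(count, path)
```

For a real log, the `sample` list could be replaced by an open file handle, since the function only iterates over lines.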
Who uses their proxies? I think Hewlett-Packard do, and I know Citigroup and Nokia do. In fact, judging from the entries in my header log file, I think a lot of companies have Blue Coat proxies installed.
Blue Coat’s stealth crawling
I could live with the fact that their software makes a ton of highly speculative requests, but Blue Coat have also been stealth scanning my web site (most likely for malware), just like Symantec.