Block empty User Agent headers
Posted 2008-04-19 in Spam by Johann.
Blocking requests without a user agent header is a simple step to reduce web abuse. I’ve shown before that these requests can account for a significant share of abusive traffic.
On this server, 985 requests without a user agent header were made in the last four weeks, which constitutes 6 % of the 14388 blocked requests. 6 % might not sound like much, but once I started white listing user agents, the percentage of blocked requests went up from less than 1 % to over 4 %. Unless you also white list user agents and block entire netblocks as aggressively as I do, your numbers may well be higher.
lighttpd
Blocking empty user agents is simple in lighttpd. Edit your lighttpd.conf as follows:
Ensure that mod_access is loaded:
server.modules = ( "mod_accesslog", "mod_access", … other modules … )
Add the following line:
$HTTP["useragent"] == "" { url.access-deny = ( "" ) }
Reload the lighttpd configuration and you’re done.
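To verify, you can repeat a request without a user agent header. This is just a quick check, not from the original post — curl drops a header that is passed with an empty value, and example.com stands in for your own host:
# send a HEAD request with no User-Agent header at all
curl -sI -H 'User-Agent:' http://example.com/
The first response line should now read HTTP/1.1 403 Forbidden.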
Apache
Enable mod_rewrite and add the following to your configuration:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule ^.* - [F]
Contributed by Andrew.
If you use a web server other than lighttpd or Apache, please add the configuration to this entry. Thanks.
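For nginx, the equivalent check — a hedged sketch, not part of the original post — could be placed inside a server block:
# deny requests whose User-Agent header is empty or missing
if ($http_user_agent = "") {
    return 403;
}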
Forum Scanners - prevent forum abuse
Posted 2008-04-13 in Spam by Johann.
Spammers use forum scanning to find forums that are vulnerable to automated sign-up and posting.
If you administer a forum and want to prevent automated abuse of your board, consider blocking the netblocks below (a firewall sketch follows the list).
Forum scanners
10 89.28.14.104 (mail.cigoutlet.us, STARNET S.R.L, MV, 89.28.0.0/17)
10 84.19.180.90 (Keyweb AG, DE, 84.19.160.0/19 and 87.118.64.0/18)
10 77.91.227.113 (WEBALTA, RU, 77.91.224.0/21)
10 77.244.209.198 (gw.fobosnet.ru, RZT Network, RU, 77.244.208.0/20)
10 72.232.162.34 (Layered Technologies, US)
6 87.118.116.100 (Keyweb AG)
6 195.210.167.45 (COMSTAR, RU, 195.210.128.0/18)
5 147.202.28.25 (TEAM Technologies, US, 147.202.0.0/16)
4 87.118.118.173 (Keyweb AG)
4 85.255.115.146 (UkrTeleGroup, UA, 85.255.112.0/20)
2 89.178.154.161 (CORBINA-BROADBAND, RU, Dial-Up)
2 89.149.226.58 (netdirekt e.K., DE, 89.149.192.0/18)
2 89.149.208.221 (netdirekt e.K.)
2 87.99.92.36 (Telenet, LV, Dial-Up)
2 87.236.29.207 (n207.cpms.ru, CPMS Network, RU, 87.236.24.0/21)
2 87.118.120.127 (Keyweb AG)
2 87.118.106.41 (Keyweb AG)
2 84.200.29.124 (Internet-Homing GmbH, DE, 84.200.29.0/24)
2 78.157.143.201 (VdHost Ltd, LV, 78.157.143.128/25)
2 76.120.171.54 (Comcast, US)
2 75.126.166.122 (Softlayer, US)
2 72.9.105.42 (Ezzi.net, US, 66.199.224.0/19, 72.9.96.0/20)
2 72.232.7.10 (Layered Technologies)
2 69.46.23.155 (Hivelocity Ventures Corporation, US, 69.46.0.0/19)
2 69.46.16.166 (Hivelocity Ventures Corporation)
2 69.126.44.157 (Optimum Online, Dial-Up)
2 60.21.161.73 (CNCGROUP Liaoning province network, CN)
2 220.130.142.189 (HINET, TW)
2 200.142.97.194 (Mundivox, BR)
2 195.2.114.31 (MICROLINK, LV)
1 61.235.150.228 (China Railway, CN)
1 60.209.21.101 (China Network, CN)
1 60.190.79.24 (something in China)
1 200.27.116.188 (Telmex Chile, CL)
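If you prefer to block these at the firewall rather than in the web server, here is a minimal sketch using Linux iptables — the range is the first one from the list above; adapt it to your own setup:
# drop all traffic from one of the listed netblocks
iptables -A INPUT -s 89.28.0.0/17 -j DROP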
Notes
- WEBALTA is/was a Russian search engine. Apparently, they also offer hosting.
- A ton of abuse is coming from UkrTeleGroup, not just forum scanning.
- Forum scanners seem to like the Keyweb AG.
- This list was compiled over several weeks.
- The first number roughly indicates the scanning frequency.
- I don’t have a forum on johannurkard.de.
The Making of a legendary Baguette
Posted 2008-04-08 in Philosophical Contemplations by Johann.
Not that I want to endorse meat products, but this is one good-looking, well-thought-out über-baguette. Congratulations.
Pictures by Fred, used with permission.
White listing User Agents to combat Spam Bots and Scrapers
Posted 2008-03-21 in Spam by Johann.
IncrediBill mentioned white listing user agents to block spam bots and other types of abusers. I then tried white listing as opposed to the black listing I had used before.
Here is a short explanation of the difference between black listing and white listing.
Black listing
Black listing means rejecting all requests that fit into a certain pattern.
Black listing looks like this in my web server configuration:
$HTTP["useragent"] =~ "(bad_bot|comment spammer|spambot 1)" { url.access-deny = ( "" ) }
This means that all user agent strings containing bad_bot, comment spammer or spambot 1 are served a 403 Forbidden error message.
White listing
White listing means rejecting all requests that do not fit into a certain pattern.
White listing looks as follows in my configuration:
$HTTP["useragent"] !~ "^(Mozilla|Opera)" { url.access-deny ( "" ) }
This means that only user agent strings starting with Mozilla or Opera are allowed; everything else is served a 403 Forbidden error message.
The downside of white listing is that the number of allowed user agents can be very large. As an example, user agents of Motorola cell phones start with Motorola-, but some also start with MOT-, MOTOROKR or even motorazr.
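Folding those variants into the white list could look like this — a sketch only; the Motorola prefixes are the ones mentioned above, the rest of the alternation is illustrative:
# allow browsers plus the known Motorola prefixes, deny everything else
$HTTP["useragent"] !~ "^(Mozilla|Opera|Motorola-|MOT-|MOTOROKR|motorazr)" { url.access-deny = ( "" ) }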
Right now, I have more than 100 rules in my white list regular expression.
Which one is right for me?
I recommend using black lists if you cannot spend much time reading log files and changing your web server’s configuration. Black listing, combined with IP blocking of known abusers, can still be effective in limiting bandwidth theft.
Black listing, however, will not protect you against future bad bots and reincarnations of Russian email harvester outfits. This is where white listing is better.