Simple virtual hosts (vhosts) with lighttpd
Posted 2007-08-02 in WWW by Johann.
lighttpd, the up-and-coming webserver, offers simple virtual hosts. Here’s how to set them up.
Create your root directory
This is the directory all your websites will be stored under.
$ pwd
/var
$ mkdir www
Create folders for your virtual hosts/websites
For each virtual host, create a directory under the root directory. Do not create directories whose names start with www. because these prefixes will be removed from the host name.
$ cd www
$ mkdir rofl.info
$ mkdir blab.la
$ mkdir lm.ao
Upload your websites
Well… upload them.
Set the root directory in lighttpd
Open lighttpd.conf and set the var.basedir variable.
# pwd
/var/www/
# cd /etc/lighttpd
# nano lighttpd.conf
var.basedir = "/var/www"
Set up virtual hosts
In lighttpd.conf, set the evhost.path-pattern key accordingly. This key belongs to mod_evhost, so that module has to be listed in server.modules.
evhost.path-pattern = var.basedir + "/%0/"
Done
Just restart lighttpd or reload its configuration and you’re done. lighttpd will now distribute requests to the different directories based on the Host header. Because we used %0, prefixes (like www. or w.) will be stripped.
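To make the mapping concrete, here is a minimal Python sketch that models how the pattern above turns a Host header into a directory. It is not lighttpd code. The function name docroot_for_host is hypothetical, and the sketch assumes that %0 expands to the last two labels of the host name (domain name plus TLD).

```python
def docroot_for_host(host_header: str, basedir: str = "/var/www") -> str:
    """Model evhost.path-pattern = var.basedir + "/%0/" for one Host header."""
    # Drop an optional port ("rofl.info:80"), a trailing dot, and normalise case.
    hostname = host_header.split(":", 1)[0].strip().lower().rstrip(".")
    labels = hostname.split(".")
    # Assumption: %0 is the domain name plus TLD, i.e. the last two labels.
    percent_zero = ".".join(labels[-2:])
    return f"{basedir}/{percent_zero}/"


for host in ("rofl.info", "www.rofl.info", "w.blab.la:80"):
    print(host, "->", docroot_for_host(host))
```

Under that assumption, www.rofl.info and rofl.info end up in the same directory, which is why directories named www.something would never be used.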
Where to go from here
If you use an application server like Orion, I suggest reading “lighttpd and Java application servers: integrating JSP and Servlets” next.
lighttpd and Java application servers: integrating JSP and Servlets
Posted 2007-09-09 in WWW by Johann.
In this blog entry, I’ll show you how to integrate lighttpd into a JEE environment. After performing all the changes, lighty will transparently proxy requests to your Java application server.
1. When to use lighttpd
You can use lighttpd to
- secure access to your application server
- reduce load on your server by offloading static requests
- load balance your application servers
- use lighttpd’s spambot and bad bot blocking capabilities
- get more request rewriting and redirecting flexibility
- use the above flexibility to improve your search engine rankings
- profit.
2. When not to use lighttpd
You might not like lighttpd if you
- don’t like configuring software
- use URL rewriting and ;jsessionid.
3. lighttpd modules you need
The following lighty modules are needed:
- mod_access
- mod_redirect
- mod_rewrite
- mod_proxy
Add them to your server.modules section:
server.modules = (
    "mod_accesslog",
    "mod_access",
    "mod_redirect",
    "mod_rewrite",
    "mod_proxy",
    "mod_status",
    "mod_evhost",
    "mod_expire"
)
4. Denying access to JEE directories
The WEB-INF and META-INF directories shouldn’t be accessible through lighttpd. Files from your development environment also shouldn’t be visible.
url.access-deny = ( "WEB-INF", ".classpath", ".project", "META-INF" )
5. Binding your application server to localhost
To prevent duplicate content penalties, your application server shouldn’t be visible from the web. Even if you run it on a high port, someone might eventually find it.
Binding a web site to localhost looks like this in Orion’s <name>-web-site.xml:
<web-site host="127.0.0.1" port="12345">
    <frontend host="johannburkard.de" port="80"/>
Consult your documentation if you aren’t using Orion.
6. Redirecting www. to non-www. hosts
Even if you don’t really need to do this, I recommend doing so. Removing duplicate content will improve your rankings.
The following snippet redirects all visitors from www.<domain> to <domain> with a 301 permanent redirect.
$HTTP["host"] =~ "^www\.(.*)$" {
    url.redirect = ( "^/(.*)" => "http://%1/$1" )
}
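The two regular expressions in this snippet can be tried out in isolation. The following Python sketch only models the rule and is not how lighttpd implements it. The function name www_redirect is hypothetical, and the sketch assumes that the redirect pattern is matched against the request URI including its query string.

```python
import re
from typing import Optional

WWW_HOST = re.compile(r"^www\.(.*)$")  # condition on the Host header, captured as %1
ANY_URI = re.compile(r"^/(.*)")        # redirect pattern, captured as $1


def www_redirect(host: str, request_uri: str) -> Optional[str]:
    """Return the redirect target, or None if the rule does not apply."""
    host_match = WWW_HOST.search(host)
    if host_match is None:
        return None  # not a www. host, so the block is skipped
    uri_match = ANY_URI.search(request_uri)
    if uri_match is None:
        return None
    # "http://%1/$1": %1 comes from the host match, $1 from the URI match.
    return f"http://{host_match.group(1)}/{uri_match.group(1)}"


print(www_redirect("www.johannburkard.de", "/blog/?page=2"))
print(www_redirect("johannburkard.de", "/blog/"))
```

Requests to the bare domain return None here, meaning they are served without a redirect.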
You should also redirect all additional domains (johannburkard.com, johann-burkard.org) to your main domain.
7. Proxying dynamic requests
We will use mod_proxy to proxy some requests to your Java application server.
Depending on your site’s structure, one of the following approaches will work better.
Simple JSP
If all you have is a bunch of Java Server Pages, the following mod_proxy rule is sufficient:
proxy.server = ( ".jsp" =>
    ( ( "host" => "127.0.0.1", "port" => "12345" ) )
)
Note that the JSPs must be actual files. You cannot use Servlets mapped to these URIs.
Applications
If you use Servlets or more complex applications, you can proxy URIs by prefix:
proxy.server = ( "/blog/" =>
    ( ( "host" => "127.0.0.1", "port" => "12345" ) )
)
Proxying with exceptions
If most of your site is dynamic and you have a directory for static content (/assets, /static or so), you can proxy all requests except requests for static files:
$HTTP["url"] !~ "^/static" {
    proxy.server = ( "" =>
        ( ( "host" => "127.0.0.1", "port" => "12345" ) )
    )
}
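The condition is a plain regular expression, so it is easy to check which paths it excludes. This Python sketch models only the !~ "^/static" test; the function name goes_to_backend is hypothetical.

```python
import re

STATIC_PREFIX = re.compile(r"^/static")


def goes_to_backend(url_path: str) -> bool:
    """True if the path does NOT match ^/static and would therefore be proxied."""
    return STATIC_PREFIX.search(url_path) is None


for path in ("/static/logo.png", "/blog/entry.html", "/static-report.jsp"):
    print(path, "->", "proxy" if goes_to_backend(path) else "serve from disk")
```

Note that the pattern is unanchored at the end, so a path like /static-report.jsp is also kept away from the application server. If that is not intended, a stricter pattern such as "^/static/" may be a better fit.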
8. Rewriting requests
lighttpd can dynamically rewrite requests. I mostly use this to serve default.jsp as a dynamic index file instead of index.html. Here’s an example:
url.rewrite-once = (
    "^(.*)/$" => "$1/default.jsp",
    "^(.*)/([;?]+.*)$" => "$1/default.jsp$2"
)
This is visible at gra0.com and internally rewrites all requests from / to /default.jsp (including jsessionid and query string).
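To see what the two rules do to different request URIs, here is a small Python model of them. It is only a sketch: the function name rewrite_once is hypothetical, the $1 and $2 references are written as \1 and \2 for Python, and it assumes the rules are tried in order against the URI including the query string, with the first match winning.

```python
import re

# The two rules from the example, in the same order.
REWRITE_ONCE = [
    (re.compile(r"^(.*)/$"), r"\1/default.jsp"),
    (re.compile(r"^(.*)/([;?]+.*)$"), r"\1/default.jsp\2"),
]


def rewrite_once(request_uri: str) -> str:
    """Apply the first matching rule; leave the URI alone if none matches."""
    for pattern, template in REWRITE_ONCE:
        match = pattern.search(request_uri)
        if match:
            return match.expand(template)
    return request_uri


for uri in ("/", "/blog/", "/blog/?page=2", "/;jsessionid=abc123", "/index.html"):
    print(uri, "->", rewrite_once(uri))
```

The first rule covers plain directory requests, the second covers directory requests that carry a ;jsessionid or a query string. Everything else passes through unchanged.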
mod_rewrite can also be used to make URLs shorter. For example, to remove the ?page=comments query string, I use the following:
url.rewrite-once = ( "^/blog/(.*)\.html$" => "/blog/$1.html?page=comments" )
9. Redirecting requests
You can use mod_redirect to redirect the user to a different URL. Unlike mod_rewrite, which rewrites the request internally, mod_redirect sends a 301 permanent redirect to the browser.
In this example, I’m redirecting requests for an old domain to a new domain:
$HTTP["host"] == "olddomain.com" {
    url.redirect = ( "^/(.*)$" => "http://newdomain.com/$1" )
}
10. More things to be aware of
- The only IP address in your application server log files should be 127.0.0.1. If you need the original address, log the X-FORWARDED-FOR header.
- Don’t analyze both lighttpd and application server logs: lighty’s log files already contain all requests.
- You might want to set up virtual hosts sooner or later.
- Use mod_expire to make resources cacheable. Doing so can make your site a lot faster and save you money.
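For the first point, the application has to read the forwarded address itself. The following Python sketch shows one way to pick it out of a header mapping. The function name original_client_ip is hypothetical, and the sketch assumes that the proxy runs on 127.0.0.1 and that the first entry of a comma-separated header value is the client address. A client can send its own value for this header, so treat the result as a hint, not as proof.

```python
from typing import Mapping


def original_client_ip(headers: Mapping[str, str], remote_addr: str) -> str:
    """Return the forwarded client address for requests that came via the local proxy."""
    # Only look at the header when the request really came from the proxy.
    if remote_addr != "127.0.0.1":
        return remote_addr
    forwarded = ""
    for name, value in headers.items():
        if name.lower() == "x-forwarded-for":  # header names are case-insensitive
            forwarded = value
            break
    # Assumption: with several entries, the first one is the original client.
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


print(original_client_ip({"X-Forwarded-For": "203.0.113.7"}, "127.0.0.1"))
```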
Yahoo! bots list – all user agents
Posted 2008-01-12 in WWW by Johann.
Yahoo! has a huge number of bots. In this bot list, I’ll try to list them all and briefly explain what they do.
Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)
Mozilla/5.0 (Yahoo-MMCrawler/4.0; mailto:vertical-crawl-support@yahoo-inc.com)
Image or multimedia crawlers.
YahooFeedSeeker/2.0 (compatible; Mozilla 4.0; MSIE 5.5; http://publisher.yahoo.com/rssguide; users …; views …)
News feed crawler.
Mozilla/5.0 (compatible; Yahoo! Slurp China; http://misc.yahoo.com.cn/help.html)
Crawler for the Chinese Yahoo.
Mozilla/5.0 (compatible; Yahoo! DE Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Crawler for the German Yahoo.
Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
The well-known Slurp crawler, probably the most active legit crawler. I don’t see the first one a lot.
Nokia6682/2.0 (3.01.1) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 configuration/CLDC-1.1 UP.Link/6.3.0.0.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
LG-C1500 UP.Browser/6.2.3 (GUI) MMP/1.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
MOT-V975/81.33.02I MIB/2.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
SGH-Z130 SHP/VPP/R5 SMB3.1 SMM-MMS/1.1.0 profile/MIDP-2.0 configuration/CLDC-1.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
Nokia6610/1.0 (3.09) Profile/MIDP-1.0 Configuration/CLDC-1.0 (compatible;YahooSeeker/M1A1-R2D2; http://help.yahoo.com/help/us/ysearch/crawling/crawling-01.html)
YahooSeeker/M1A1-R2D2
These user agent strings belong to Yahoo!’s mobile web index crawler.
Vodafone/1.0/V705SH (compatible; Y!J-SRD/1.0; http://help.yahoo.co.jp/help/jp/search/indexing/indexing-27.html)
DoCoMo/2.0 SH902i (compatible; Y!J-SRD/1.0; http://help.yahoo.co.jp/help/jp/search/indexing/indexing-27.html)
KDDI-CA33 UP.Browser/6.2.0.10.4 (compatible; Y!J-SRD/1.0; http://help.yahoo.co.jp/help/jp/search/indexing/indexing-27.html)
Some more mobile web crawlers, probably specific to the Japanese Yahoo.
Y!J-BSC/1.0 (http://help.yahoo.co.jp/help/jp/blog-search/)
A blog spider.
Yahoo! Slurp/Site Explorer
Bot that verifies site authentication through Yahoo! Site Explorer.
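If you want to tag these bots in your own log analysis, a substring lookup is enough for the strings listed above. The Python sketch below is one way to do it. The marker table and the function name classify_yahoo_bot are hypothetical and derived only from this list, and the more specific markers come first so that the regional Slurp variants are not reported as plain Slurp. User agent strings can be faked, so this only tells you what a client claims to be.

```python
from typing import Optional

# (substring, description) pairs taken from the list above; specific markers first.
YAHOO_BOT_MARKERS = [
    ("Yahoo! Slurp/Site Explorer", "Site Explorer authentication check"),
    ("Yahoo! Slurp China", "crawler for the Chinese Yahoo"),
    ("Yahoo! DE Slurp", "crawler for the German Yahoo"),
    ("Yahoo! Slurp", "Slurp web crawler"),
    ("Yahoo-MMCrawler", "image or multimedia crawler"),
    ("YahooFeedSeeker", "news feed crawler"),
    ("YahooSeeker", "mobile web index crawler"),
    ("Y!J-SRD", "mobile web crawler (Yahoo Japan)"),
    ("Y!J-BSC", "blog spider"),
]


def classify_yahoo_bot(user_agent: str) -> Optional[str]:
    """Return a description for a known Yahoo! bot string, or None."""
    for marker, description in YAHOO_BOT_MARKERS:
        if marker in user_agent:
            return description
    return None


print(classify_yahoo_bot("Y!J-BSC/1.0 (http://help.yahoo.co.jp/help/jp/blog-search/)"))
```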