MVEL Templating Introduction
Posted 2009-01-29 in Java by Johann.
MVEL is an expression language – similar to OGNL – and a templating engine.
I’d like to give you an example of MVEL Templates in this post so you can find out if MVEL might work for you.
Templating Examples
This is what templating with MVEL looks like.
Basic Object Access
<h1>@{name}</h1>
Simple Iteration
<p> @foreach{index : alphabetical} <a href="@{index.uri}">@{index.description}</a> @end{} </p>
Accessing Static Methods
<a href="@{ua.pageURI}"> @{org.apache.commons.lang.StringEscapeUtils.escapeHtml(ua.name)} </a>
Inline Ternary Operator
<li> @{ua.hitsTotal} total @{ua.hitsTotal == 1 ? "Hit" : "Hits"}. </li>
MVEL Integration
The following code integrates MVEL into your application. The first part compiles a template from a String, the second part applies an object to the template and writes the result to a file.
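import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Package names as of MVEL 2.x; adjust them if you use a different version.
import org.mvel2.templates.CompiledTemplate;
import org.mvel2.templates.TemplateCompiler;
import org.mvel2.templates.TemplateRuntime;

public class MVELTemplateWriter {

    private final CompiledTemplate template;

    /**
     * Constructor for MVELTemplateWriter.
     *
     * @param template the MVEL template
     */
    public MVELTemplateWriter(String template) {
        // Compile once so the template can be reused for many objects.
        this.template = TemplateCompiler.compileTemplate(template);
    }

    /**
     * Merge an Object with the template and write the output to f.
     *
     * @param o the Object
     * @param f the output File
     */
    public void write(Object o, File f) throws IOException {
        String output = (String) TemplateRuntime.execute(template, o);
        File parent = f.getParentFile();
        // Create missing parent directories; a File without a parent needs none.
        if (parent != null && !parent.exists() && !parent.mkdirs()) {
            throw new IOException("Could not create " + parent);
        }
        Writer writer = null;
        try {
            writer = new OutputStreamWriter(new FileOutputStream(f), "UTF-8");
            writer.write(output);
        }
        finally {
            // Close the writer even if writing failed.
            if (writer != null) {
                writer.close();
            }
        }
    }
}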
You use this code like you would use other templating engines/expression languages: You add your objects to a Map and then merge the Map with a template. In the template, you reference the objects in the Map by their key.
Note that the template is pre-compiled for performance reasons. You can use something like FileUtils.readFileToString(File) to read a template file into a String.
Summary
Good
I liked:
- Speed is excellent. However, most of the time spent building the User Agent Database goes into writing graphs and parsing log files.
- Clean syntax. Cleaner than everything Sun has ever produced, but probably not as clean and simple as Velocity.
- Supports arbitrary methods. Velocity makes it hard to use static methods and does not support operations on arrays at all.
Bad
Not all is nice however. I did not like the following:
- No streaming output. All output is buffered in RAM before it can be written to a file.
Do you use a templating engine/expression language? Maybe you use Velocity, OGNL, FreeMarker, StringTemplate or something else entirely? Please post a comment if you do.
Java Wildcard String Matching
Posted 2008-12-15 in Java by Johann.
This entry contains code examples of Java pattern matching with wildcards.
Note that wildcard matching is not the same as the .* regular expression, which matches any number of characters: a wildcard matches exactly one character. Wildcards are usually encoded as ., but the actual character may vary across libraries.
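The difference between a single wildcard and .* can be checked with plain java.util.regex. The following is a minimal sketch; the class and method names are made up for illustration and are not part of any of the libraries discussed here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, only for illustration.
final class WildcardVersusStar {

    /** Returns the start index of the first match of regex in text, or -1 if there is none. */
    static int firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // "." stands for exactly one character between "bla" and "blorb".
        System.out.println(firstMatch("bla.blorb", "la bla0blorb null"));
        // Two characters in between: the single wildcard no longer matches.
        System.out.println(firstMatch("bla.blorb", "la bla00blorb null"));
        // ".*" accepts any number of characters, including none.
        System.out.println(firstMatch("bla.*blorb", "la bla00blorb null"));
        System.out.println(firstMatch("bla.*blorb", "blablorb"));
    }
}
```

Only the second call returns -1; the others find a match because the text between "bla" and "blorb" has a length the pattern allows.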
StringSearch
StringSearch 1.2 comes with two wildcard pattern matching algorithms, BNDMWildcards and ShiftOrWildcards. Generally, BNDMWildcards will be faster, which is why I removed ShiftOrWildcards in version 2.
public void testStringSearch() {
    BNDMWildcards bndm = new BNDMWildcards();
    Object compiled = bndm.processString("bla.blorb"); // "bla?blorb" for StringSearch 1.2
    assertEquals(3, bndm.searchString("la bla0blorb null", "bla.blorb", compiled));
}
java.util.regex.Pattern
The java.util.regex.Pattern API isn’t very compact, but of course it does offer more than just wildcards.
public void testJavaUtilRegex() {
    Pattern searchPattern = Pattern.compile("bla.blorb");
    Matcher m = searchPattern.matcher("la bla0blorb null");
    assertTrue(m.find());
    assertEquals(3, m.start());
}
Jakarta ORO
ORO offers many PatternMatcher implementations. In this example, I am using the Perl5Matcher class.
public void testJakartaORO() throws MalformedPatternException {
    Pattern p = new Perl5Compiler().compile("bla.blorb");
    Perl5Matcher matcher = new Perl5Matcher();
    assertTrue(matcher.contains("la bla0blorb null", p));
    MatchResult result = matcher.getMatch();
    assertEquals(3, result.beginOffset(0));
}
Case Insensitive Search
All of the APIs presented here support case-insensitive string matching. The case-insensitive option is simply compiled into the pattern in the compile phase.
StringSearch
This example requires StringSearch version 2 or greater.
BNDMWildcardsCI bndm = new BNDMWildcardsCI();
Object compiled = bndm.processString("bla.blorb");
…
java.util.regex.Pattern
Pattern searchPattern = Pattern.compile("bla.blorb", Pattern.CASE_INSENSITIVE); …
Jakarta ORO
Pattern p = new Perl5Compiler().compile("bla.blorb", Perl5Compiler.CASE_INSENSITIVE_MASK); …
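As a self-contained illustration of the java.util.regex variant, the sketch below searches a mixed-case text with the same wildcard pattern. The class and method names are hypothetical. Note that Pattern.CASE_INSENSITIVE only covers US-ASCII characters unless it is combined with Pattern.UNICODE_CASE.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, only for illustration.
final class CaseInsensitiveWildcard {

    // The case-insensitive flag is compiled into the pattern once.
    private static final Pattern SEARCH_PATTERN =
            Pattern.compile("bla.blorb", Pattern.CASE_INSENSITIVE);

    /** Returns the start index of the first match in text, or -1 if there is none. */
    static int find(String text) {
        Matcher m = SEARCH_PATTERN.matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(find("la BLA0Blorb null"));
        System.out.println(find("la bla0blorb null"));
    }
}
```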
Apache Lucene - the ghetto search engine
Posted 2008-01-15 in Java by Johann.
The background
Back in 2002, I started using Lucene as the search engine on my site.
I tried the little demo application to index my pages and – to my surprise – it couldn’t even index Java Server Pages. Think about it: A Java application that cannot index Java Server Pages.
Anyway, I wrote a working parser for JSPs and emailed it to Doug Cutting, Lucene’s inventor. I never heard anything from Doug which I found rather rude.
Doug…
…Cutting used to work for Excite. I don’t know about you, but I have problems remembering anything about Excite. Maybe because they weren’t really good.
Why Lucene?
In the new layout of johannburkard.de, I will integrate search results in the pages directly. This is caused by a move away from tree-like site structures and towards “relatedness”-based linking.
In other words, instead of forcing a tree-based navigation structure (Home -> Blog -> Programming -> Java), I will link to pages that are related to the currently viewed page, regardless of their location within the site.
For this to work, search results must obviously be really good.
Unfortunately, Lucene consistently ranked all blog entries about inc above the original entry which caused me to look at the ranking formula of Lucene.
An example of just how much Lucene sucks
To illustrate the obvious ranking problems of Lucene, here are two example documents:
Document 1
Hallo welt hallo, hallo!
Document 2
Ein Hallo-Welt-Programm ist ein kleines Computerprogramm und soll auf möglichst einfache Weise zeigen, welche Anweisungen oder Bestandteile für ein vollständiges Programm in einer Programmiersprache benötigt werden und somit einen ersten Einblick in die Syntax geben. Aufgabe des Programms ist, den Text Hallo Welt! oder auf Englisch Hello, world! auszugeben. Ein solches Programm ist auch geeignet, die erfolgreiche Installation eines Compilers für die entsprechende Programmiersprache zu überprüfen. Aufgrund der einfachen Aufgabenstellung kann ein Hallo-Welt-Programm aber nicht als Einführung in die Sprache selbst dienen, denn es folgt zumeist nur dem Programmierparadigma der imperativen Programmierung und demonstriert somit nur einen Bruchteil der Möglichkeiten der meisten Sprachen.
The question: Which one of these ranks first for “hallo Welt”?
If you guessed document 2, you are wrong. It’s document 1. It ranks better than document 2 by a large margin.
Why? Simply because the frequency of “hallo” over the document is higher in document 1 than it is in document 2.
Sounds stupid? Do you remember keyword stuffing? Search engines in the 1990s were vulnerable to the same problem.
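The effect can be reproduced with a deliberately simplified score: the share of a document's tokens that equal the query term. This is not Lucene's actual ranking formula, only a sketch of why a short, repetitive document wins on frequency. The class and method names are hypothetical, and document 2 is shortened to an excerpt.

```java
import java.util.Locale;

// Hypothetical helper class; a simplified relative term frequency, not Lucene's scoring.
final class TermFrequencyDemo {

    /** Fraction of the tokens in text that equal term, ignoring case. */
    static double relativeFrequency(String term, String text) {
        String wanted = term.toLowerCase(Locale.ROOT);
        // Split on anything that is not a letter or digit,
        // so "Hallo-Welt-Programm" yields three tokens.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        int total = 0;
        int hits = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // a leading separator produces an empty first token
            }
            total++;
            if (token.equals(wanted)) {
                hits++;
            }
        }
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        String document1 = "Hallo welt hallo, hallo!";
        // Excerpt of document 2, shortened for this example.
        String document2 = "Ein Hallo-Welt-Programm ist ein kleines Computerprogramm. "
                + "Aufgabe des Programms ist, den Text Hallo Welt! "
                + "oder auf Englisch Hello, world! auszugeben.";
        System.out.println(relativeFrequency("hallo", document1));
        System.out.println(relativeFrequency("hallo", document2));
    }
}
```

Document 1 consists of four tokens, three of which are “hallo”, so its relative frequency is 0.75. The excerpt of document 2 contains the term only twice among many more tokens and therefore scores far lower.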
Learning from Lucene’s epic fail
Even if you do not use Lucene, you can still learn from the massive mistakes in Lucene’s design:
- Document numbers are stored as signed ints. This means that Lucene will never be able to index more than 2 billion documents. Two billion documents is just ridiculous for Internet search. Exalead say they index 8 billion documents. Eight billion documents might have been a lot in 2000 or so. Consequently, their results aren't great. Now imagine what their results would look like if they had one fourth of their index size. Still you have lots of people trying to become the next Google using Nutch (which uses Lucene).
- Text is stored in one field. With Lucene, it is impossible to increase or decrease the weight of individual terms in a document. For example, linking to inc with the anchor text “inc” should decrease the weight of inc in the current document.
- Horribly inconsistent API. If I remember correctly, the first versions of Lucene had no interfaces and consisted almost entirely of final classes. In the last years, someone must have done some half-assed refactoring so there is now a Fieldable interface that – uhm – does the same thing as the Field class. W00t. The lesson to learn here is that a bad API can ruin your project for years so don’t let people that are new to Java design Java APIs.
- Using tf/idf. I admit I’m not an information retrieval expert but in all text books that I read (Ricardo Baeza-Yates’ books come to mind), tf/idf was always presented as a basic, “beginner’s” ranking formula that only yields inferior results.
- Locking down APIs. When I experimented with Lucene, I thought to myself “Maybe there is a CrappyRanking that I can change to GoodRanking by calling IndexReader#setRanking(Ranking),” then I got lost writing wrappers for Query, tried to access the Weight instance from the wrapper which didn’t work because it is package private, tried to find the ranking formula, found out that by default, 50 results are fetched (hardcoded)… and gave up.
What Lucene does well
Surprisingly, Lucene does a few things really well.
- Indexation and index access performance. I believe index performance is something that is getting better all the time.
- Query analysis. There’s a variety of query parsers available so even complex queries can be parsed.
- Resource usage. I have never noticed any excessive RAM or disk usage.
The verdict
If you plan on using Lucene for anything other than simple site search (“enter keywords, return documents that contain keywords”), you should look somewhere else.
The X-FORWARDED-FOR HTTP header
Posted 2008-12-02 in Java by Johann.
X-FORWARDED-FOR is an HTTP header that is inserted by proxies to identify the IP address of the client. It can also be added to requests if application servers sit behind proxy servers. In this case, the request IP address is always a local address and the client IP address must be extracted from the header.
Since proxies can be chained (for example if the client’s request is already made through a proxy), the X-FORWARDED-FOR header can contain more than one IP address, separated by commas. In this case, the first one should be used.
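To make the chaining visible, the header value can be split into its individual addresses. The following is a minimal sketch in plain Java without the Servlet API; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, only for illustration.
final class ForwardedForParser {

    /**
     * Splits an X-FORWARDED-FOR header value into its addresses.
     * The first entry is the originating client, later entries are proxies.
     */
    static List<String> parse(String headerValue) {
        List<String> addresses = new ArrayList<>();
        if (headerValue == null) {
            return addresses; // header not present
        }
        for (String part : headerValue.split(",")) {
            String trimmed = part.trim(); // entries are usually separated by ", "
            if (!trimmed.isEmpty()) {
                addresses.add(trimmed);
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        List<String> chain = parse("129.78.138.66, 129.78.64.103");
        System.out.println("client: " + chain.get(0));
        System.out.println("hops:   " + chain.size());
    }
}
```

Keep in mind that the header is set by whoever sends or forwards the request, so its content is only as trustworthy as the proxies in front of your server.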
Java Code
The following Java code extracts the originating IP address of an HttpServletRequest object.
import javax.servlet.http.HttpServletRequest;

public final class HTTPUtils {

    private static final String HEADER_X_FORWARDED_FOR = "X-FORWARDED-FOR";

    public static String remoteAddr(HttpServletRequest request) {
        // Fall back to the connection's address if the header is missing.
        String remoteAddr = request.getRemoteAddr();
        String x;
        if ((x = request.getHeader(HEADER_X_FORWARDED_FOR)) != null) {
            remoteAddr = x;
            // With chained proxies, only the first address is the client's.
            int idx = remoteAddr.indexOf(',');
            if (idx > -1) {
                remoteAddr = remoteAddr.substring(0, idx);
            }
        }
        return remoteAddr;
    }
}
JSPs
In a JSP, the X-FORWARDED-FOR header can be retrieved as follows:
<%= request.getHeader("X-FORWARDED-FOR") %>
Of course, a Servlet Filter could replace the original HttpServletRequest with a wrapped version that returns the X-FORWARDED-FOR value.
Example Request
Here is a full request that was made from 129.78.138.66 through the proxy at 129.78.64.103:
2008-12-01 16:00:59,878 INFO AntiScrape - 129.78.138.66, 129.78.64.103:
USER-AGENT: …
HOST: johannburkard.de
PRAGMA: no-cache
ACCEPT: */*
ACCEPT-ENCODING: identity
VIA: 1.1 www-cacheF.usyd.edu.au:8080 (squid/2.6.STABLE5)
X-FORWARDED-FOR: 129.78.138.66, 129.78.64.103
CACHE-CONTROL: no-cache, max-age=604800
X-HOST: johannburkard.de
X-FORWARDED-PROTO: http