NetFront 3.5 CSS and JavaScript quirks

Posted 2008-02-25 in Mobile Web by Johann.

NetFront 3.5 on Windows Mobile is giving me some headaches. If you’ve read “Linking CSS for handheld devices revisited”, you know I’m using JavaScript to prevent the few JavaScript-enabled mobile browsers from reading the screen style sheet.

This works well in NetFront 3.1 and 3.3 but not in NetFront 3.5. You can see in this log file snippet that both the screen style sheet and the (large) JavaScript are loaded:

… "GET /blog/…go.html HTTP/1.0" 200 11629 "-" "Mozilla/5.0 (PDA; NF35WMPRO/1.0; like Gecko) NetFront/3.5)"
… "GET /favicon.ico HTTP/1.0" 200 4150 "-" "Mozilla/5.0 (PDA; NF35WMPRO/1.0; like Gecko) NetFront/3.5)"
… "GET /resources/css/handheld.css HTTP/1.0" 200 939 "-" "Mozilla/5.0 (PDA; NF35WMPRO/1.0; like Gecko) NetFront/3.5)"
… "GET /resources/css/screen.css HTTP/1.0" 200 7499 "-" "Mozilla/5.0 (PDA; NF35WMPRO/1.0; like Gecko) NetFront/3.5)"
… "GET /resources/js/jb.js HTTP/1.0" 200 27880 "-" "Mozilla/5.0 (PDA; NF35WMPRO/1.0; like Gecko) NetFront/3.5)"

I tested the code, and the JavaScript itself works fine.
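One quick way to rule out the regular expressions is to run them against the user agent string from the log snippet. This is only a sketch (the variable names are mine), and it says nothing about how NetFront 3.5 handles document.write:

```javascript
// User agent string copied from the log snippet above.
var userAgent = 'Mozilla/5.0 (PDA; NF35WMPRO/1.0; like Gecko) NetFront/3.5)';

// The two patterns used by the linking code.
var needsHandheldLink = /(NetFront|PlayStation)/i;
var skipsScreenSheet = /(hiptop|IEMobile|Smartphone|Windows CE|NetFront|PlayStation|Opera Mini)/i;

var matchesHandheld = needsHandheldLink.test(userAgent);
var matchesSkip = skipsScreenSheet.test(userAgent);

console.log(matchesHandheld, matchesSkip); // true true
```

Both patterns match, so the detection itself is not what goes wrong in NetFront 3.5.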

If I change the well-known linking code from

<script type="text/javascript">
if (/(NetFront|PlayStation)/i.test(navigator.userAgent))
    document.write(unescape('%3C') +
    'link rel="stylesheet" href="handheld.css"\/' + unescape('%3E'));
if (/(hiptop|IEMobile|Smartphone|Windows CE|NetFront|PlayStation|Opera Mini)/i
    .test(navigator.userAgent))
    document.write(unescape('%3C%21--'));
</script>

<style type="text/css">
@import url("handheld.css") handheld
</style>

<link rel="stylesheet" type="text/css"
    href="screen.css" media="screen,tv,projection,print"/>

<!-- -->

to

<script type="text/javascript">
if (/(NetFront|PlayStation)/i.test(navigator.userAgent))
    document.write(unescape('%3C') +
    'link rel="stylesheet" href="handheld.css"\/' + unescape('%3E'));
if (/(hiptop|IEMobile|Smartphone|Windows CE|NetFront|PlayStation|Opera Mini)/i
    .test(navigator.userAgent))
    document.write(unescape('%3C%21--'));
</script>

<style type="text/css">
@import url("/resources/css/handheld.css") handheld
</style>
<link rel="stylesheet" type="text/css"
    href="/resources/css/screen.css" media="screen,tv,projection,print"/>
<script type="text/javascript" src="/resources/js/jb.js"></script>

<!-- -->

<!--[if IE]>
<link rel="stylesheet" type="text/css" href="/resources/css/ie.css"/>
<![endif]-->

then NetFront 3.5 will at least skip the screen style sheet (but will still load the large JavaScript file).

I don’t want to rely on JavaScript to get the CSS file on the page, so NetFront 3.5’s behaviour is not exactly what I expected. Does anyone have more experience with scripting NetFront?

Google Translate bookmarklet: remove Translate links

Posted 2008-02-24 in JavaScript by Johann.

Google Translate is useful to read foreign websites. The only problem I have with the service is that all links are routed through Google Translate. Sometimes, this is not necessary.

I wrote a little bookmarklet that lets you remove the Google Translate link from all of the links on a web page.
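The bookmarklet code itself is not reproduced here. A minimal sketch of the idea could look like the following, assuming that translated links carry the original address in a query parameter named "u" (that parameter name and the function name are my assumptions) and that the browser supports the URL API:

```javascript
// Assumption: translated links look like
// http://translate.google.com/translate?...&u=<encoded original URL>
function unwrapTranslateLink(href) {
    var url;
    try {
        url = new URL(href);
    } catch (e) {
        return href; // not an absolute URL, leave it alone
    }
    if (!/(^|\.)translate\.google\./i.test(url.hostname)) {
        return href; // not a Google Translate link
    }
    var original = url.searchParams.get('u'); // already percent-decoded
    return original || href;
}

// In a browser, rewrite every link on the page that actually changes.
if (typeof document !== 'undefined') {
    var links = document.getElementsByTagName('a');
    for (var i = 0; i < links.length; i++) {
        var fixed = unwrapTranslateLink(links[i].href);
        if (fixed !== links[i].href) {
            links[i].href = fixed;
        }
    }
}
```

Links that do not point at Google Translate, or that have no "u" parameter, are left untouched.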

Installing this bookmarklet

Windows users: Right-click on the link below and choose “Bookmark this link.”
Apple users: Do the same but restrict yourself to just one mouse button.

Google Translate: Remove links

BlackBerry Simulators: Enabling JavaScript and CSS

Posted 2008-01-19 in Mobile Web by Johann.

I always get a number of hits from BlackBerry browsers. Fortunately, free BlackBerry emulators are available so I can make sure my sites work as expected.

Installing a BlackBerry Simulator

You need to download one or more “BlackBerry Device Simulators” and the “Email and MDS Services Simulator Package.” Downloading is free after registration.

I had no problems installing multiple device simulators on my computer so you can download simulators for all of the BlackBerry devices you can identify in your log files.

After installation, run the “MDS” application from the “BlackBerry Email and MDS Services Simulators 4.1.4” folder and then start any of the device simulators.

You will be asked to set up your virtual BlackBerry the first time you run a simulator.

Enabling JavaScript and CSS

The BlackBerry browser lets you change several aspects of how the device displays web pages:

Loading JavaScript.
Displaying tables.
Using colors.
Support of Cascading Style Sheets.
What CSS media type to load (handheld or screen).

First, select “Applications.”

Then, select “Browser.”

After loading the browser, select “Options.”

Select “Browser Configuration.”

Configure the browser. In this example, I have JavaScript and CSS on. Unfortunately, I have no idea how the average BlackBerry browser is configured. I would recommend that you identify a small number of sessions in your log files and check whether JavaScript and CSS files are loaded.
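Such a check can be sketched in a few lines. The function below groups log lines by user agent (a rough stand-in for sessions) and records whether any style sheet or script was requested. It assumes lines that end in the request, status, size, referrer and user agent, as in the NetFront log snippet; the function name and the sample lines are made up for illustration:

```javascript
// Summarise, per user agent, whether style sheets and scripts were requested.
// Assumed line ending: "METHOD path PROTOCOL" status bytes "referrer" "user agent"
function summarise(lines) {
    var pattern = /"[A-Z]+ (\S+) [^"]*" \d+ \S+ "[^"]*" "([^"]*)"\s*$/;
    var byAgent = {};
    lines.forEach(function (line) {
        var m = pattern.exec(line);
        if (!m) return; // skip lines that do not match the assumed format
        var path = m[1].split('?')[0];
        var agent = m[2];
        var entry = byAgent[agent] || (byAgent[agent] = { css: false, js: false });
        if (/\.css$/i.test(path)) entry.css = true;
        if (/\.js$/i.test(path)) entry.js = true;
    });
    return byAgent;
}

// Made-up sample lines; the paths and the "ExampleBlackBerry/1.0" agent are placeholders.
var sample = [
    '- "GET /index.html HTTP/1.0" 200 1000 "-" "ExampleBlackBerry/1.0"',
    '- "GET /style.css HTTP/1.0" 200 500 "-" "ExampleBlackBerry/1.0"'
];
console.log(summarise(sample));
```

An agent that fetches pages but never a .css or .js file most likely has the corresponding option switched off.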


Apache Lucene - the ghetto search engine

Posted 2008-01-15 in Java by Johann.

The background

Back in 2002, I started using Lucene as the search engine on my site.

I tried the little demo application to index my pages and – to my surprise – it couldn’t even index JavaServer Pages. Think about it: A Java application that cannot index JavaServer Pages.

Anyway, I wrote a working parser for JSPs and emailed it to Doug Cutting, Lucene’s inventor. I never heard anything from Doug, which I found rather rude.

Doug…

…Cutting used to work for Excite. I don’t know about you, but I have problems remembering anything about Excite. Maybe because they weren’t really good.

Why Lucene?

In the new layout of johannburkard.de, I will integrate search results in the pages directly. This is part of a move away from tree-like site structures and towards “relatedness”-based linking.

In other words, instead of forcing a tree-based navigation structure (Home -> Blog -> Programming -> Java), I will link to pages that are related to the currently viewed page, regardless of their location within the site.

For this to work, search results must obviously be really good.

Unfortunately, Lucene consistently ranked all blog entries about inc above the original entry, which caused me to look at Lucene’s ranking formula.

An example of just how much Lucene sucks

To illustrate the obvious ranking problems of Lucene, here are two example documents:

Document 1

Hallo welt hallo, hallo!

Document 2

Ein Hallo-Welt-Programm ist ein kleines Computerprogramm und soll auf möglichst einfache Weise zeigen, welche Anweisungen oder Bestandteile für ein vollständiges Programm in einer Programmiersprache benötigt werden und somit einen ersten Einblick in die Syntax geben. Aufgabe des Programms ist, den Text Hallo Welt! oder auf Englisch Hello, world! auszugeben. Ein solches Programm ist auch geeignet, die erfolgreiche Installation eines Compilers für die entsprechende Programmiersprache zu überprüfen. Aufgrund der einfachen Aufgabenstellung kann ein Hallo-Welt-Programm aber nicht als Einführung in die Sprache selbst dienen, denn es folgt zumeist nur dem Programmierparadigma der imperativen Programmierung und demonstriert somit nur einen Bruchteil der Möglichkeiten der meisten Sprachen.

The question: Which one of these ranks first for “hallo Welt”?

If you guessed document 2, you are wrong. It’s document 1. It ranks better than document 2 by a large margin.

Why? Simply because the frequency of “hallo” over the document is higher in document 1 than it is in document 2.
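To make the effect concrete, here is a small sketch that computes the share of words in each document that are “hallo”. This is not Lucene’s actual scoring formula, only the relative frequency described above, and the function name and tokenisation are my own:

```javascript
// Relative frequency of a term: occurrences divided by the total number of words.
function relativeFrequency(term, text) {
    var words = text.toLowerCase().split(/[^a-z0-9äöüß]+/).filter(Boolean);
    var wanted = term.toLowerCase();
    var hits = words.filter(function (w) { return w === wanted; }).length;
    return words.length ? hits / words.length : 0;
}

var document1 = 'Hallo welt hallo, hallo!';
var document2 = 'Ein Hallo-Welt-Programm ist ein kleines Computerprogramm und soll auf ' +
    'möglichst einfache Weise zeigen, welche Anweisungen oder Bestandteile für ein ' +
    'vollständiges Programm in einer Programmiersprache benötigt werden und somit einen ' +
    'ersten Einblick in die Syntax geben. Aufgabe des Programms ist, den Text Hallo Welt! ' +
    'oder auf Englisch Hello, world! auszugeben. Ein solches Programm ist auch geeignet, ' +
    'die erfolgreiche Installation eines Compilers für die entsprechende Programmiersprache ' +
    'zu überprüfen. Aufgrund der einfachen Aufgabenstellung kann ein Hallo-Welt-Programm ' +
    'aber nicht als Einführung in die Sprache selbst dienen, denn es folgt zumeist nur dem ' +
    'Programmierparadigma der imperativen Programmierung und demonstriert somit nur einen ' +
    'Bruchteil der Möglichkeiten der meisten Sprachen.';

var frequency1 = relativeFrequency('hallo', document1);
var frequency2 = relativeFrequency('hallo', document2);

console.log(frequency1);              // 0.75 (3 of 4 words)
console.log(frequency1 > frequency2); // true
```

Three of the four words in document 1 are “hallo”, so any formula that rewards this ratio will prefer the short, repetitive document.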

Sounds stupid? Do you remember keyword stuffing? Search engines in the 1990s were vulnerable to the same problem.

Learning from Lucene’s epic fail

Even if you do not use Lucene, you can still learn from the massive mistakes in Lucene’s design:

Document numbers are stored as signed ints. This means that Lucene will never be able to index more than roughly 2 billion documents (2^31 − 1 = 2,147,483,647). Two billion documents is just ridiculous for Internet search. Exalead say they index 8 billion documents. Eight billion documents might have been a lot in 2000 or so. Consequently, their results aren’t great. Now imagine what their results would look like if they had one fourth of their index size. Still, there are lots of people trying to become the next Google using Nutch (which uses Lucene).
Text is stored in one field. With Lucene, it is impossible to increase or decrease the weight of individual terms in a document. For example, linking to inc with the anchor text “inc” should decrease the weight of inc in the current document.
Horribly inconsistent API. If I remember correctly, the first versions of Lucene had no interfaces and only final classes, or something like that. In recent years, someone must have done some half-assed refactoring so there is now a Fieldable interface that – uhm – does the same thing as the Field class. W00t. The lesson to learn here is that a bad API can ruin your project for years, so don’t let people who are new to Java design Java APIs.
Using tf/idf. I admit I’m not an information retrieval expert but in all textbooks that I read (Ricardo Baeza-Yates’ books come to mind), tf/idf was always presented as a basic, “beginner’s” ranking formula that only yields inferior results.
Locking down APIs. When I experimented with Lucene, I thought to myself “Maybe there is a CrappyRanking that I can change to GoodRanking by calling IndexReader#setRanking(Ranking),” then I got lost writing wrappers for Query, tried to access the Weight instance from the wrapper which didn’t work because it is package private, tried to find the ranking formula, found out that by default, 50 results are fetched (hardcoded)… and gave up.
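The arithmetic behind the first point, as a short sketch. The 8 billion figure is the one quoted in the list, and the variable names are mine:

```javascript
// Largest value a 32-bit signed integer can hold.
var maxSignedInt = Math.pow(2, 31) - 1;
console.log(maxSignedInt); // 2147483647

// Index size claimed in the text, and a quarter of it.
var claimedIndexSize = 8e9;
var quarter = claimedIndexSize / 4;
console.log(quarter);                 // 2000000000
console.log(quarter <= maxSignedInt); // true
```

So a quarter of that index is about the most that signed int document numbers can address.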

What Lucene does well

Surprisingly, Lucene does a few things really well.

Indexing and index access performance. I believe index performance is something that is getting better all the time.
Query analysis. There’s a variety of query parsers available so even complex queries can be parsed.
Resource usage. I have never noticed any excessive RAM or disk usage.

The verdict

If you plan on using Lucene for anything other than simple site search (“enter keywords, return documents that contain keywords”), you should look somewhere else.
