OpenCalais followup

As I wrote a while ago, I’ve been experimenting with the Calais Auto Tagger plugin, which interfaces WordPress with OpenCalais. After writing a post, you push a button and it looks at the text and comes up with some tags. After running it over my entire site, I ended up with over 5,000 tags. Eat that, LiveJournal. However, there’s quite a lot of crap in there, and I’m going to need to do some cleanup.

Software and religion

As you have probably noticed, I’ve just gone through a major software migration for my web site. I was using typo. It was OK, but had a few problems. While its web site describes it as “lean”, that isn’t really the case. It also relied on a combination of Apache, LigHTTPd and FastCGI that tended to break down without explanation. The biggest reason for change, though, was that typo’s authors’ idea of what was important functionality was diverging from mine.

LJ Abuse (again)

Another user gets suspended from TrollJournal for posting public information, info that had been made public by the person detailed. The backstory is that LiveJournal has introduced advertising in the form of “sponsored communities” with third party identity tracking. To quote the LiveJournal “contract” back in 2004: It may be because it’s one of our biggest pet peeves, or it may be because they don’t garner a lot of money, but nonetheless, we promise to never offer advertising space in our service or on our pages.

Be a web whiz

Quote from actual conference session description: The mellow flavor of Swiss cheese enhances sandwiches, soups, and sauces. Experts know how to control the size of the holes in Swiss cheese by changing the acidity, temperature, and curing time during the complicated holeforming fermentation process. Similarly, site visitors savor the experience of interacting with enterprise content and traditional HTML and graphics on web pages. The title also uses “surfacing” as a verb.

Tinfoil hat alert

According to Google Watch, our favorite search engine is dying. Supposedly Google is not indexing anywhere from ten percent to seventy percent of the pages it knows about. Well, those are some pretty huge error bars, which right away scream out “wild speculation”. But if we read on, the guy offers as evidence the fact that his web site appears as a bare URL in the Google index, rather than having the conventional snippet of content and careful indexing.

Internet Explorer security hole

New Windows / Internet Explorer security hole: Upload any Windows executable you like to a web server. Set up the web server to send .exe files as text/html. Put a CLSID in the filename. Post links to the file, cloaking them via the previously announced URL cloaking bug. Wait for anyone using Internet Explorer to click on the innocent-looking link and get asked if they want to open the HTML web page.

Busy with optical media

Spent Friday morning clearing up a random disaster… some legacy application that suddenly needed to go out on CD, that had never been designed to run on read-only media. Then I went to the MIT lunch trucks. I spent the afternoon continuing to learn J2EE and SQL. I now have a simple user registration / login application written, which uses request dispatch and HTML files for look and feel. The book I’m learning from is OK on the Java stuff, but does a really poor job of teaching good systems design; they have all their HTML shoved into the servlets.

Fear my pagerank

My web site has been around since the days of HTML 1.0. It’s been at the same URL for nearly a decade, and is linked to by thousands of copies of FAQ documents. Because of this journal and the photos page, it’s also updated regularly. As a result, and thanks to Google’s pageranking algorithm, I’m the first thing you get if you Google search for “mathew”. I’ve noticed that this means that anything moderately obscure I mention in my journal will soon end up near the top of the Google search results for the appropriate keywords.


Got home, booked tickets to Minnesota. It’s funny, when I married Sara I didn’t really think about the fact that it would mean visiting Minnesota every other winter. Not that I’d have decided differently; I’m just amused that it didn’t occur to me. Also fixed my web site. The Perl script rewrote most of the HTML for AT&T’s web servers, but I had to change a few URLs in my LiveJournal template and fix the redirection at pobox.