I got tags working via a plugin.
Since I was messing with the site anyway, I hacked together some Ruby code to pull all the content out of the database and perform automatic keyword extraction via naïve Bayesian analysis.
It spat out a file of SQL commands: for each posting, the subject and the first line of text (as comments), followed by the commands to add the tags. I ran through the file in vim, deleting here and adding there, then executed the result.
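For the curious, the output stage looked roughly like the sketch below. This is a reconstruction for illustration, not the actual script: the `posts` and `tags` table layout, the `Post` struct, and the helper names are all assumptions, and the keyword extraction itself is left out (the tags are simply passed in).

```ruby
# Hypothetical schema: posts(id, subject, body), tags(post_id, name).
Post = Struct.new(:id, :subject, :body)

# Escape a value for use inside a single-quoted SQL string literal.
def sql_quote(text)
  "'" + text.gsub("'", "''") + "'"
end

# Keep only the first line of the text so it fits in one "--" SQL comment.
def comment_line(text)
  "-- " + text.to_s.lines.first.to_s.strip
end

# Build the reviewable SQL for one post and its suggested tags.
def tag_sql(post, tags)
  out = [comment_line(post.subject), comment_line(post.body)]
  tags.each do |tag|
    out << "INSERT INTO tags (post_id, name) VALUES (#{post.id.to_i}, #{sql_quote(tag)});"
  end
  out.join("\n") + "\n"
end

post = Post.new(42, "Tags at last", "I got tags working via a plugin.\nMore text here.")
puts tag_sql(post, ["tags", "plugin"])
```

The comments are the point: they give you enough context in the editor to decide whether each suggested tag deserves to live.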
SQL is a dinosaur of a language, designed for the bad old days when computers enforced a fixed size and format for every kind of data, everything was upper case, and if you didn’t like it you went back to using paper. After all, disk space and CPU time were expensive, so you didn’t want people wasting them with pesky unaltered real-world information.
As such, SQL doesn’t have variable-length strings. Oh, sure, it has VARCHAR as well as CHAR, but VARCHAR only gives you a string that can be any size up to some fixed length.
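To make the difference concrete, here is a toy model of the two types in Ruby. It is only a sketch of the semantics, not how any database implements them; the function names are made up, and raising on overflow is an assumption, since real databases differ (some reject the value, others truncate it depending on settings).

```ruby
# Toy model of CHAR(n): always exactly n characters, padded with spaces.
def char_column(value, n)
  raise ArgumentError, "value longer than CHAR(#{n})" if value.length > n
  value.ljust(n)
end

# Toy model of VARCHAR(n): stored as-is, but never longer than n characters.
# Assumption: overflow is an error; some databases truncate instead.
def varchar_column(value, n)
  raise ArgumentError, "value longer than VARCHAR(#{n})" if value.length > n
  value
end

p char_column("tag", 8)     # => "tag     "
p varchar_column("tag", 8)  # => "tag"

begin
  varchar_column("a title longer than the limit", 8)
rescue ArgumentError => e
  puts e.message
end
```

Either way, somebody has to pick the 8 in advance, which is exactly the problem.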
J2EE specifies Enterprise JavaBeans (EJB) for handling data, where the data and the client accessing it may or may not be on the same system. Entity beans are used to encapsulate the data in the database: instead of accessing the database directly, you create an entity bean to do it for you. That way the client can use the entity bean and not need to know about SQL.
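The idea is easier to see in miniature. The sketch below is in Ruby rather than Java and has nothing to do with the real EJB API; it only illustrates the encapsulation. `FakeConnection` and `PostEntity` are hypothetical names, and the fake connection just records the SQL it is handed and returns a canned row.

```ruby
# Stand-in for a database connection; it only records the SQL it is given.
# Hypothetical: a real connection would come from a database driver.
class FakeConnection
  attr_reader :log

  def initialize
    @log = []
  end

  def execute(sql, params = [])
    @log << [sql, params]
    [{ "id" => params.first, "subject" => "Tags at last" }]
  end
end

# Entity-style wrapper: the only place that knows about SQL.
class PostEntity
  def initialize(connection, id)
    @connection = connection
    @id = id
  end

  def subject
    row = @connection.execute("SELECT id, subject FROM posts WHERE id = ?", [@id]).first
    row["subject"]
  end

  def subject=(text)
    @connection.execute("UPDATE posts SET subject = ? WHERE id = ?", [text, @id])
  end
end

conn = FakeConnection.new
post = PostEntity.new(conn, 42)
puts post.subject
post.subject = "Tags, finally"
```

The client only ever touches `subject` and `subject=`. All the SQL lives inside the wrapper, which means somebody still had to write it.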
However, the person writing the entity bean doesn’t really want to have to know about SQL either.