Apr 15

As you have probably noticed, I’ve just gone through a major software migration for my web site.

I was using typo. It was OK, but had a few problems. While its web site describes it as “lean”, that isn’t really the reality. It also relied on a combination of Apache, LigHTTPd and FastCGI that tended to break down without explanation.

The biggest reason for change, though, was that typo’s authors’ idea of what was important functionality was diverging from mine. The wakeup call was when someone spent a bunch of time replacing the regular page templates with templates written in HAML.

For those lucky enough not to know, HAML is a stupid and inexplicably trendy idea in the Rails community, comparable to LiveJournal’s S2 style system. Basically, instead of creating your page templates in HTML and CSS, which everyone can understand and for which there are a zillion useful tools, you instead write program code in a whole new language which has minimal documentation. The program then generates the HTML and CSS.

Of course, this destroys the entire point of template systems, which is to separate code from presentation and make the presentation layer editable by non-programmers using common tools.

I wouldn’t have minded the HAML idiocy so much if it wasn’t for the fact that typo still lacked support for things as basic as user authentication for commenting. So I looked at other web content management software… and looked… and looked.

I tried Blojsom. Supposedly it’s what Apple uses. If so, I hope they’ve done a lot of work on their version, as it’s a major PITA to set up, and very complicated even when you get it working.

In the end, though, I knew the main feature I wanted: OpenID support. Hence, I found myself reluctantly herded towards Wordpress, which has a working OpenID plugin. (Or at least, it works for my OpenID account when I test it. I don’t think it has XRI support, though.)

I did entertain the idea of writing my own CMS. I even sketched out some design notes. But it really is a solved problem, I just didn’t like the technologies used to solve it.

Let’s be blunt about this: I hate PHP, and I hate MySQL. PHP is the Visual BASIC of web programming languages, a mess which grew with no planning out of a quick hack, a kitchen sink language known for its amenability to security holes. MySQL is a toy database, popular because it’s fast, fast because by default it doesn’t actually provide the basic ACID functions a database is supposed to provide. (Sure, you can turn those on, but once you do, today’s PostgreSQL is faster under non-trivial load.)

But I don’t believe in religion, especially not when it comes to software. I’m a strict pragmatist–whatever it takes to get the job done, even if it may offend a few aesthetic sensibilities and fall far short of perfection.

I spend most of my time at work developing using IBM Lotus Notes and Domino. Every time Notes is mentioned on Slashdot, a bunch of people will rant about how bad its UI is. They miss the point utterly. Believe me, the poor UI of Notes is only the most glaringly obvious defect it has; there are far worse problems underneath that the average end user is blissfully unaware of. But you know what? It works. It is sufficient. It lets you build groupware applications and dynamic web sites with fine-grained security in days, not weeks. That is why people use it. The only other tool I’ve found which comes close is Ruby on Rails, and that’s still too immature for me to want to use it on production systems. (That, and it’s surrounded by a community of people who think things like HAML are a good idea.)

So, here we are. I’m editing this in a nice AJAX WYSIWYG editor with spelling checker (an idea shot down by the typo developers), and you should be able to log in with OpenID to comment (an idea the typo developers seem utterly uninterested in).

It took most of Saturday hacking with Ruby, PostgreSQL and MySQL, but I believe I’ve managed to transfer not just all my data, but all your comments too. I think I’ve even managed to keep all the permalinks the same, and preserve all the timestamps. I’ve temporarily lost the tags functionality, but should be able to get it back with another plugin. Hopefully Wordpress will prove more reliable than Typo, and hopefully the OpenID stuff will interoperate correctly with LiveJournal. If not, pray that I inexplicably become independently wealthy and have the time to write something that does the job properly.

Apr 01

SQL is a dinosaur of a language, designed for the bad old days when computers enforced a fixed size and format for every kind of data, everything was upper case, and if you didn’t like it you went back to using paper. After all, disk space and CPU time were expensive, so you didn’t want people wasting them with pesky unaltered real-world information.

As such, SQL doesn’t have variable-length strings. Oh, sure, it has VARCHAR as well as CHAR, but VARCHAR only gives you a string that can be any size up to some fixed length. That’s as opposed to CHAR, which is even dumber, padding out all your strings to a fixed length. So all you gain from VARCHAR is the ability to get “John”, “Smith” rather than “John            ”, ”Smith           ” as output.

So, what happens if Mr Apu Nahasapeemapetilon suddenly joins the company, and your SURNAME VARCHAR(16) overflows? If you’re running MySQL, the name is silently truncated, and your database outputs garbage. If you’re running PostgreSQL, the database chokes and your system falls over with a run-time error, which is what the SQL standard apparently says should happen. So, would you like your leg amputated with a bread knife, or a rusty hacksaw blade?

Let’s suppose that you want to use PostgreSQL, so that you can have advanced features like, oh, the database actually remaining consistent if there’s a crash, and constraints that actually do something. Let’s also suppose that you don’t care too much about data being truncated, so long as you don’t personally have to deal with a support call every time some idiot decides to get creative with a product name.

Now you have a problem, because the program that’s feeding information into the database needs to know how much data it can put in each field without the database choking. The database knows this, of course; it’s part of the schema you used to set up the tables in the first place. But you don’t want the information in two places, because then if you change it in one place, you need to remember to change it in the other.

So, what’s the solution? Surely there has to be one?