May 28

Once upon a time, back in the ancient history of the Internet–before the 1990s–domain names were carefully controlled and regulated. A single organization controlled each top level domain. If you wanted a domain name, you had to meet their requirements.

Often the policies enforced were quite picky. If you wanted a .uk domain name, you were required to actually be in the UK, for example. If you wanted a .org domain, you were required to be a non-profit organization. To be in .net, you were expected to be a network access provider or ISP.

A lot of people disliked the bureaucracy involved in domain registration, and objected to the fees charged. And so it was decided that the domain name system would be opened up. There would be many domain registrars for each major top level domain, all competing to give the best price and service. Anyone would be able to register a domain, with minimal bureaucracy. Domains would be bought, sold and transferred in a perfect Free Market.

At first, things looked good. The cost of registering a domain dropped rapidly. Rather than having to fax paperwork around and get signed documents from company directors, you could just register online with a credit card for whatever domain you wanted.

However, it quickly became clear that domains could have value. A small proportion of Internet users (around 5-10%) don’t seem to understand search engines or bookmarks. They find things by guessing domain names and typing them in. As a result, people found that domain names an idiot would guess first ended up with traffic, purely by existing. Suddenly instead of having to advertise your web site, you could buy a domain name that people would randomly visit anyway, and get instant traffic with no work required.

Domains like “sex.com”, “computers.com” and “cars.com” suddenly became very valuable, changing hands for large amounts of money. Some people weren’t very happy about it, but still, there was nothing wrong with it really.

Unfortunately, there were headline stories of domain names changing hands for millions of dollars. And suddenly, there was a gold rush. Everyone with a modem hurriedly registered every domain name they could think of.

This was a major pain. If you wanted to set up a web site, it became almost impossible to find a simple domain name that hadn’t been registered already. Almost all of them were unused, just a whois entry and nothing more, but if you approached the owner their eyes would light up with dollar signs and they’d demand extortionate rates for their “valuable property”.

Still, the situation was somewhat self-correcting. It did still cost $50 or so to hold a domain for a year, so eventually when nobody turned up to offer $100,000 for it, the holder would let the registration lapse and you’d be able to pick it up for $50.

Then someone invented banner ads. Suddenly, those unused domains could be used to make money. Domain registrations were still dropping in price, and there were ad companies who would pay you $0.01 each time you served up an ad to someone. $10 a year for a domain, and all you needed to do was show ads to at least 1,000 idiots who typed your domain in at random, and you’d break even.
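To make the arithmetic concrete, here is a minimal sketch of the break-even sum. The function name and the use of whole cents are my own choices for illustration; the $10 fee and $0.01 payout are the figures quoted above.

```python
def break_even_impressions(annual_fee_cents: int, payout_cents_per_ad: int) -> int:
    """Ad impressions needed per year to cover the registration fee.

    Works in whole cents to avoid floating-point rounding, and rounds up
    because a fraction of an impression earns nothing.
    """
    if payout_cents_per_ad <= 0:
        raise ValueError("payout per ad must be positive")
    # Ceiling division using integers only.
    return -(-annual_fee_cents // payout_cents_per_ad)


# $10 a year, $0.01 per ad served.
print(break_even_impressions(1000, 1))  # 1000
```

At the older price of $50 or so a year, the same sum comes out at 5,000 impressions, which is why cheaper registrations made parking ads on unused domains so much more attractive.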

And so suddenly, the Internet filled up with junk web pages containing ads and nothing else. There are now multi-million-dollar companies whose primary business is hoarding domains and filling them with content-free crap. Domain spam is now so mainstream that companies like Google actively encourage it.

The next step was obvious. Sure, you could think of a domain name that other people would be likely to guess at random, but most of those were already registered. So the domain spammers began watching the lists of domains that people failed to renew. Now, if a widely used open source project fails to renew its domain name, the page will suddenly be replaced with a spam site full of affiliate ads.

Not everyone appreciates ending up on a domain spam page, however. Plus, if your page doesn’t look like total spam, you might get search engine traffic, and boost your profits further. Hence, the new trend is automatic content generation.

Some domain speculators take the unsubtle approach, and simply rip off content wholesale. If you have a web site with significant readership (as measured by, say, Technorati), someone will likely set up a spam site which copies the text of each post you make, covers it with ads, and re-posts it to one of their hoarded domains. Sure, it’s a copyright violation, but the chances of getting caught are slim, and so long as you pick on personal web sites the chances of anyone going after you with a lawsuit are slim too.

(I don’t think it has happened to me yet, but if I include a made-up word that doesn’t appear on the web, like spozquak, I should be able to do a Google search in a month or two and see if anyone’s copied it.)

However, again thanks to the free market, there’s now a market for software that can generate moderately convincing looking content. You’ve seen it in spam e-mails, and now it’s being used to fill the web too. The first generation used random text generation, but now more sophisticated “auto content generator” software uses web feeds to pull in text, chops the text into individual sentences, and then recombines them based on keywords.

(So I guess I should clarify that spozquak is a great alternative to viagra, cures mesothelioma from asbestosis, and helps you make money at home.)

While the web was filling with crap, the domain name registrars kept competing in their free market. As the supply of new unregistered .com domains dried up, they had to think of new ways to pull in customers. The solution: trial periods. You can now register a domain name for a 5-day trial, see if it pulls in any suckers, and if not you don’t have to pay for it.

You can probably guess what happened next. Someone wrote software to repeatedly register domains for trial periods, automatically.

And so we arrive at today’s web, the ultimate result of applying unconstrained free market economics to the problem of naming web sites. It’s a world where every name you can think of is already registered and filled with spam, often by someone who isn’t even paying for the domain. A world where if you’re away on holiday when your domain name expires, it’s immediately filled with spam. A world where web searches return hundreds of pages filled with spam designed to look like content, ripped off from other people’s web sites.

Of course, there are a couple of things we could do that might help ameliorate the problem. They’re just utterly unacceptable to the free market faithful who make up the Internet’s core audience.

The first is this: Do not allow domain transfers between third parties.

You bought a domain? Great. You want to sell it? Can’t. I mean, you can’t sell your home address, your postal code or your telephone number, so why should you be able to sell a domain name? Your friend wants the domain? Fine, you cancel it, he registers it for the standard price.

If you could sell telephone numbers, you’d see rampant speculation there as well. If you moved to Austin and wanted a 512 phone number so friends could call you without paying long distance fees, you’d probably have to buy one at auction for a few hundred dollars. Or if you were in Massachusetts and wanted one of the old 617 numbers so you’d look like a long-established business, you could end up paying thousands of dollars. But the phone company doesn’t allow reselling of phone numbers, so the problem doesn’t occur.

(It’s worth noting that you can sell toll-free numbers. And sure enough, you get rampant speculation in that chunk of the phone number namespace, with most of the good ones already taken.)

The second way to help reduce the damage caused by the free market in domains is to resurrect an idea from the 80s: that your domain registration is voided if you don’t actively use the domain. And by “use”, I mean more than simply putting up a blank page of ads.

I can tell that people are already sharpening their pitchforks and lighting their torches, but which is worse: a domain name system that doesn’t support your religious belief that a free market is the best solution to everything, or a free market domain name system where you can’t actually buy any domains you want and everything is full of spam?

Mar 04

Donald Norman is an expert in human computer interaction, user interface design, cybernetics—call it what you will. His book The Design of Everyday Things is a classic, and taught me how to shop for a refrigerator. (Seriously.)

However, his recent rant about Google is just plain wrong. His basic point—that it’s easy to make a simple interface to a system that only does one thing—seems sensible enough. But is it true? A clock only does one thing, and think of all the digital clocks you’ve seen that flash 12:00 endlessly because nobody can be bothered to take the time to work out how to set them. Norman’s own book has examples of doors and faucets with badly designed interfaces, and those only do one thing.

When you look at Norman’s specific criticisms of Google’s interface, they don’t stand up:

[...] because all those other things are not on the home page but, instead, are hidden away in various mysterious places, extra clicks and operations are required for even simple tasks—if you can remember how to get to them.

[...] Want a map? You have to click once to be offered the choice, then a second additional time to get to the map page.

Well, no, you don’t. Go to the front page of Google, and type map of austin texas. The first link on the search results will take you straight to Google Maps to view the map you were asking for. Similarly:

Want to use Google Scholar to check references? Um, well, is that “Advanced Search” or “more.”

Who cares? Just type scholarly reference check and the reference, and hit enter. Or if you’ve heard that there’s a thing called Google Scholar, type google scholar.

To be fair to Norman, blog search isn’t integrated into Google yet. Give it time, I’m sure they’ll make it so you can type dave winer's blog and get a link to Google Blog Search on the results. (Actually, you do, but only in the ads.)

Is this a simple ‘intuitive’ interface? I think so, in as much as any search engine interface is intuitive. The model is “Go to the text box on the front page, and type in what you want.” Pretty simple. Yes, it relies on typing in what you want instead of just clicking, but sometimes that’s the right approach.

If it wasn’t for the fact that I’m using a Mac right now, I might be tempted to suggest that too much GUI can rot the brain and make people think that every web interface has to be a clicky-clicky one, or that that’s always the best way to use a web site. Personally I suspect that Google’s service links on the front page are more there to reassure those who would otherwise assume it was just a dumb web search engine.

May 25

Bram Cohen’s official BitTorrent search engine is now open. To celebrate this event, I suggest we have a contest to guess (a) the date of the first cease-and-desist letter from the RIAA or MPAA, and (b) the date when the site gets shut down due to crippling legal costs.

I’m predicting June 1st and October 1st, respectively.

Sep 02

According to Google Watch, our favorite search engine is dying. Supposedly Google is not indexing anywhere from ten percent to seventy percent of the pages it knows about.

Well, those are some pretty huge error bars, which right away scream out “wild speculation”. But if we read on, the guy offers as evidence the fact that his web site, namebase.org, appears as a bare URL in the Google index, rather than having the conventional snippet of content and careful indexing.

It sounds pretty convincing, until you go look at the source of his web page:

<HTML><HEAD>
<META HTTP-EQUIV="EXPIRES" CONTENT="0">
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
<META HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE"><TITLE>NameBase Book Index</TITLE>

Yes, the idiot has specifically set headers on his web site to try to tell web crawlers not to archive any of the text, and not to cache anything—and then he complains that the Google index doesn’t have a cached index of his content in its archive.
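If you want to check what a page is telling crawlers without squinting at the source, something like the following rough sketch will do. It uses Python’s standard html.parser module; the class and function names are hypothetical, made up for this example, and the sample markup is simply the excerpt quoted above with plain ASCII quotes.

```python
from html.parser import HTMLParser


class MetaDirectives(HTMLParser):
    """Collect the content of <meta name=...> and <meta http-equiv=...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        key = attr.get("name") or attr.get("http-equiv")
        if key:
            self.directives[key.lower()] = attr.get("content", "").lower()


def crawler_directives(html: str) -> dict:
    """Return a dict such as {'robots': 'noarchive'} for the given markup."""
    parser = MetaDirectives()
    parser.feed(html)
    return parser.directives


PAGE_HEAD = """<HTML><HEAD>
<META HTTP-EQUIV="EXPIRES" CONTENT="0">
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
<META HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE"><TITLE>NameBase Book Index</TITLE>
"""

found = crawler_directives(PAGE_HEAD)
print(found["robots"])  # noarchive
print(found["pragma"])  # no-cache
```

Three tags, three separate requests not to keep a copy of the page around.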

Of course, it goes without saying that his site’s home page also isn’t valid HTML, failing to validate against any DTD whatsoever.

So in conclusion: (a) Google doesn’t always do a good job of keeping an indexed cache of things you’ve told it not to cache, and (b) Google doesn’t always do a good job of indexing things that aren’t actually web pages.

I mean, I’m not delighted that I’m no longer in the #1 spot in a search for my name, but I don’t take it as proof that Google is broken.

I’ve been wondering for a while why it is that people are so keen to predict that Google is ruined, hopeless, shooting itself in the foot, lost, going to crash, it’s doomed, doomed, doomed I tell you!

I think it’s more than schadenfreude. I think Google irritates a lot of people because of the way they’ve put together a successful business by ignoring business rules and behaving ethically. They’ve refused to pollute search results with ads; they’ve refused to let people buy their way to the top; they’ve refused flashy graphical banners. They’ve thrived anyway. And a lot of people hate them for it.

Apr 25

My web hosting provider exploded. The company who supposedly bought their customer lists has failed to get things going after a week or two. So, I need a new web host.

Requirements:

  • Linux or UNIX based
  • SSH access and rsync for uploading my site
  • Low monthly fees
  • No price gouging for extra bandwidth
  • Low or zero setup fee
  • One domain, at least 3 subdomains
  • At least 2 POP3/IMAP mailboxes
  • A reasonable amount of space (50 MB or more)
  • SpamAssassin

Nice-to-have features:

  • Usenet/NNTP access and server
  • Jabber server
  • Logs and stats
  • Search engine support for sites

I don’t need PHP, JSP, ASP, SSI, SQL, CGI, … Just static web pages, cheap. Also:

  • I don’t need DNS hosting or domain registration, just the servers to point my existing domain at and a working MX or two.
  • High uptime isn’t all that important; the odd day or so of downtime is fine.
  • Phone support isn’t important.
  • Technical hand-holding won’t be needed, obviously.

I’ve found:

Frankly, I’d be tempted to do it myself if we weren’t planning on moving in the near future…

Sep 17

Verisign, possibly the most incompetent name registrar on the Internet (but that’s another story), have decided to leverage their monopoly control over the current de facto standard root DNS servers.

They’ve set things up so that any nonexistent domain name now maps to one of their servers. If you type a random bogus domain name like xyloturbot.com into your web browser, you now get Verisign ads and a pay-for-hits search engine.

This is bad for many reasons. Firstly, they’re violating at least four different RFCs, including the Requirements For Internet Hosts. Secondly, they made the change without warning, breaking many anti-spam systems that were checking to see if alleged sender e-mail addresses look valid.

As if that wasn’t bad enough, spam sent with completely bogus addresses now ends up queued indefinitely on many mail servers. Rather than bouncing it immediately because the address is invalid, those servers can now resolve every single bogus address, so they’ll queue the mail and keep trying to deliver it for a couple of weeks. There are probably lots of servers out there that aren’t given much attention, that are now gradually filling up with spam thanks to Verisign.

Another problem is that it gives the Internet a single vector for massive virus infestation. Imagine if a hacker cracks the Verisign web server and puts a new Windows virus on that server for download—it could spread across the entire Internet in seconds.

Finally, what they’re doing is probably illegal under the anti-’cybersquatting’ laws passed in the USA. They are, after all, squatting on other people’s trademarked names, in order to make cash.

There are already patches for most DNS servers to permanently blackhole the Verisign machine in question. It took IBM less than a day to decide to blackhole all traffic to that server, and according to the software authors the clamor for patches has been enormous. It’ll be interesting to see how the crooks respond.
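The real patches work inside the DNS server, but the idea is simple enough to sketch at the application level: if a lookup comes back with the wildcard server’s address, pretend the domain doesn’t exist. In the rough Python below, the function names are hypothetical and the address in WILDCARD_ADDRESSES is a placeholder from the reserved documentation range, not the actual Verisign machine. A real anti-spam check would more likely look at MX records; a plain address lookup keeps the example short.

```python
import socket

# Placeholder address from the TEST-NET-1 documentation range. The real
# wildcard server's address is not given here, so this is an assumption.
WILDCARD_ADDRESSES = {"192.0.2.1"}


def resolve_ignoring_wildcard(hostname, resolver=socket.gethostbyname,
                              blackholed=WILDCARD_ADDRESSES):
    """Return the host's address, or None if it does not really exist."""
    try:
        address = resolver(hostname)
    except OSError:
        return None  # genuine lookup failure (socket.gaierror is an OSError)
    if address in blackholed:
        return None  # wildcard answer: treat the name as nonexistent
    return address


def sender_domain_exists(email_address, **kwargs):
    """Crude sender check: does the part after the @ resolve to a real host?"""
    _local, sep, domain = email_address.rpartition("@")
    if not sep or not domain:
        return False
    return resolve_ignoring_wildcard(domain, **kwargs) is not None
```

The resolver is passed in as a parameter so the check can be tried out against a fake lookup table without touching the network.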

In the meantime, it seems to me that the best thing to do is take advantage of the situation. Since every bogus e-mail address now resolves, and since all the incompetently-managed open relay servers will end up sitting delivering e-mail to Verisign 24×7, why not generate a few hundred bogus e-mail addresses every day, link to them on well-trafficked pages (like this one), and wait for the spambots to harvest them? In fact, you may already have spotted me doing just that…

Jan 01

It has been alleged that I’m unthinkingly rude and negative about the rich, famous and successful. To disprove that assertion, here’s the first of a series of articles.

Five Admirable CEOs

  1. Aaron Feuerstein, CEO of Malden Mills.

    In 1995, a fire burned the Malden Mills factory to the ground. Everyone thought they were out of work, but no. The company CEO kept all the employees on the payroll until the factory could be rebuilt. Wear your Polarfleece with pride!

  2. Paul Fireman, CEO of Reebok.

    The contrast between Paul Fireman of Reebok, and the weaselly Phil Knight of Nike, couldn’t be stronger. Knight publicly reneged on a deal he made with Michael Moore on camera, continues to use sweatshop labor without apology, and hijacks events like the Boston Marathon for publicity without paying anything in sponsorship fees.

    Fireman, on the other hand, is an active member of Amnesty International. He has written articles for business publications stressing the importance of human rights, and supporting the right of workers to unionize. Reebok sponsors many AI events, and Reebok board members have stood for election to serve on the board of Amnesty, with the company’s approval.

    Sure, the company’s not perfect. It still makes its shoes in third world countries, and has plenty of critics. But in an industry where margins are wafer thin and competition is extreme, little gestures like paying your laborers 24% above minimum wage mean a lot.

  3. Akio Morita, founder of Sony.

    No grand humanitarian gestures here. Just a company that, after Apple, is the most consistently brilliant at creating beautifully designed high-tech devices of reasonable quality. Morita was an engineer, responsible for inventing the Walkman, a device that I think has changed everyone’s environment in surprising ways. His company also gave the world the transistor radio, the VCR, and many other devices we now take for granted. In the process, it changed the perception of the words “MADE IN JAPAN”. Morita built Sony from the ground up, and maintained a punishing schedule until a stroke in 1993; he died in 1999.

  4. Sergey Brin, founder of Google.

    I’m sure everyone reading this knows how wonderful Google is. Sergey Brin is the “moral compass” of the company, trying to do the right thing in a world where the search engine’s visibility has made it a magnet for lawsuits and commercial temptations. I think, by and large, he’s succeeded.

  5. The Kashio brothers, founders of Casio

    Tadao Kashio founded Casio with his three younger brothers; Kazuo is now the CEO. It’s still a family business.

    What I love about Casio is that they’re the poor man’s Sony. They have consistently produced quality, reliable products at low prices. It’s hard to imagine now, but a reliable wristwatch or calculator used to cost hundreds of dollars. I think the company’s biggest gift to the world, however, was putting cheap-but-good synths and samplers into the hands of thousands of musicians in the 80s and 90s.

    To price products way lower than the market required, build them better than necessary, and yet survive and thrive on razor-thin margins, is an amazing accomplishment. To keep the company in the family while doing so is astonishing, even for Japan.

Aug 10

It seems that Microsoft Internet Explorer keeps a record of all web sites you ever visit, and all search engine terms you type in to any search engine—even if you tell it to clear the history! It also collects all your cookies from every site you visit, in a separate set of secret folders hidden away from the normal cookie folder—so even if you think you cleared out your cookies, you probably didn’t.

They’ve clearly gone to a lot of work to prevent people finding these hidden files too—they’re specially flagged to stop them displaying in the DOS shell or Windows. Even if you unflag them, special code in Windows will hide them again next time you reboot. The code is hidden away in rundll32.exe, which is supposed to be just the tool that runs 32-bit DLL libraries. Sneaky or what?

In fact, only the old Windows Explorer program (left over from Win3.x) will show the directories. Even then, Windows is specially patched to prevent you from looking at the files unless you copy them somewhere else first!

So what’s in these files? Well, looking at my own machine, I see a log of sites I’ve visited that I know I haven’t been to this year, and searches for stuff I was researching last year as well. I can think of no legitimate reason for this information to still be stored in database files on my disk. Even ignoring the possible privacy implications, all this unencrypted secretly logged data represents a significant security risk. Do I want anyone who gains physical access to my machine to be able to get my online banking account details? I don’t think so.

For more information and a guided tour of what Microsoft have secretly stored on your hard disk, see <URL:http://www.f***windows.com/content/ms-hidden-files.shtml>. I think I’m about to switch browser, now that Mozilla seems stable enough to use… I’m glad I’ve never used Outlook Express.