More “Transmittable” Short URLs

URL shortening services are very popular. They basically redirect a short URL – e.g. bit.ly/ABC123 – to a longer URL. (And keep logs, which can be monetized, hence Twitter’s t.co service and the requirement that tweets use it.) Most URL shortening services use the following set of characters in the unique tag: [A-Za-z0-9] – a total of 62. 6 characters is a normal number for the tag.

However, when reading such a short URL to someone, e.g. over the phone or across a conference table, a couple of problems can occur:

  • The person reading may misread; they may read “l” for “1” or “0” for “O”.
  • The person reading may under-specify, most commonly by not stating the case of a letter.

This makes reading out such short URLs a pain, as one has to make sure to specify case correctly, and to distinguish between similar-looking characters in a possibly-unfamiliar font, or in handwriting. These problems could be avoided, and URL reading would be much easier, if the set was instead [a-km-np-z0-9], and the shortener service treated a submitted tag as case-insensitive.
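As a rough illustration of the idea, here is a minimal Python sketch of such a tag scheme. The names `ALPHABET`, `normalize_tag` and `new_tag` are made up for this example, and rejecting (rather than silently correcting) a submitted “l” or “o” is an assumption; a real service might choose differently.

```python
import secrets
import string

# Letters a-z without "l" and "o", plus the digits 0-9 (34 symbols in total).
ALPHABET = "".join(c for c in string.ascii_lowercase if c not in "lo") + string.digits


def normalize_tag(tag: str) -> str:
    """Lower-case a submitted tag and reject characters outside ALPHABET."""
    normalized = tag.strip().lower()
    bad = sorted(set(normalized) - set(ALPHABET))
    if not normalized or bad:
        raise ValueError(f"invalid tag {tag!r}: unexpected characters {bad}")
    return normalized


def new_tag(length: int = 7) -> str:
    """Generate a random tag; a real service would also check for collisions."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

With this, `normalize_tag("ABC123")` and `normalize_tag("abc123")` resolve to the same stored tag, so the listener never needs to be told the case.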

This would give a choice of only 34 characters. Surely that would mean the tag would have to be much longer? Actually no:

  • 62^6 = 56800235584
  • 34^6 = 1544804416
  • 34^7 = 52523350144

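Seven characters from the 34-character set give about 92% as many tags as six characters from the 62-character set. So short URLs could be made more transmittable at the cost of being only 1 character longer. I think some service might find that worth doing…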
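The arithmetic above can be checked with a few lines of Python. The helper name `tag_space` is just for this sketch.

```python
def tag_space(alphabet_size: int, length: int) -> int:
    """Number of distinct tags of the given length over the given alphabet."""
    return alphabet_size ** length


print(tag_space(62, 6))  # 56800235584
print(tag_space(34, 6))  # 1544804416
print(tag_space(34, 7))  # 52523350144

# Fraction of the 62-character, 6-symbol tag space that the
# 34-character, 7-symbol scheme covers (a little over 0.92).
print(tag_space(34, 7) / tag_space(62, 6))
```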

10 thoughts on “More “Transmittable” Short URLs”

  1. Good idea. You can go one step further and format the string like “abcd-1234-defg”. Optionally, the server could ignore the “-” when a URL is entered.

    • I’m not sure about people living in other parts of the world, but I find that splitting it up like a North American phone number, sans area code (1a3-br72) would provide the most memorable and comfortable reading cadence.

      The only concern there would be whether the return on investment is high enough to make the added dash an easy sell for shorteners associated with Twitter and other SMS-compatible services like StatusNet where every character counts.

  2. I always thought the main purpose of those dreadful things was to prevent you from knowing where the URL points unless you follow it. (With good software it is of course possible to follow it halfway, but less than 1% of the population knows how to do that, and it’s an added hassle.)

  3. Forget short URLs, they are bad. As Jonadab said above, it’s way too easy to make one point to an undesirable page or website (think CSRF or XSS). I never follow such short URLs unless I really trust whoever posted them.

    • Why is a short URL particularly bad in this respect? Any URL could redirect you to any other URL, and any URL, long or short, could contain attack code.

      • I agree with you on that point, but URL shorteners do still have two related weaknesses that aren’t directly their fault:

        1. Most sites don’t bother to set a title attribute on links, so people have grown used to hovering and using the URL to check where a link with ambiguous text goes.

        2. URL shorteners add a big single point of failure to the system. (Hence why I keep seeing recommendations that every site which uses shortened URLs should run their own shortener… even if it’s just some little personal blog which can’t justify the cost of a separate domain and has to add a /l/ route to the main domain.)

        Personally, I like the Identi.ca approach. They do have their own shortener (ur1.ca) but they also put the original URL in the title attribute (both in the web view and in feeds) if at all possible, to mitigate that risk.

      • > Any URL could redirect you to any other URL, and any
        > URL, long or short, could contain attack code.

        In theory, perhaps.

        In practice, the URL almost always contains useful information that tells you something meaningful about the destination, information that you can (and that I regularly do) use to decide whether to follow the link or not. URL shortening services deliberately take that information away and provide basically nothing in exchange. I don’t like them.

        In fact, more than half the time I just skip past any link so obscured, because it’s not worth the hassle of following it only to find out whether I want to bother with following it or not. Furthermore, when I do follow them, it’s almost always from Twitter, where the link text contains the first and most important part of the actual final destination. In virtually any other context, I don’t give “shortened” (i.e., obscured) URLs the time of day.

  4. I once designed a related system (though for a different purpose). I also used the constraint that the verbal representations of letters could not be too similar. So I disallowed using both B and D, for example, since they can be easily confused on a noisy connection. In fact, when entering one of these ids, I accepted either B or D but mapped them to the same token. You could do the same by accepting S in place of 5, for example.

    I’d need to dig up my notes for my final scheme, but things you allow that I disallowed were B/D and M/N.

    • Would you mind digging up your notes? I’m kind of curious what else would be in them.

      I know S/F would. I’ve often had to correct people on the phone that my last name isn’t “Fokolow”.