According to RFC 1738, Section 2.2, a URL is a series of octets. Octets which have no visible ASCII representation, plus other problematic characters (reserved or unsafe ones), must be encoded as %HH. The RFC doesn’t say anything about which character set those octets are supposed to be in.
Form submission seems to encode in the character set of the page, as %HH. encodeURIComponent(), on the other hand, always encodes as UTF-8. So it matches form submission if the document is in the UTF-8 character set. Otherwise, it doesn’t.
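To make that mismatch concrete, here is a rough sketch. encodeLatin1Component is a made-up helper (not a browser API) that imitates what a form on an ISO-8859-1 page would send, one byte per character. It ignores details of real form encoding, such as spaces becoming + and the handling of characters the page's charset cannot represent.

```javascript
// Hypothetical helper: percent-encode text one byte per character, as a
// form on an ISO-8859-1 page is assumed to do. Not a real browser API.
function encodeLatin1Component(text) {
  let out = '';
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (code > 0xFF) {
      // Simplification: real browsers do something else here.
      throw new RangeError('not representable in ISO-8859-1: ' + ch);
    }
    if (/[A-Za-z0-9\-_.!~*'()]/.test(ch)) {
      out += ch; // same unreserved set that encodeURIComponent leaves alone
    } else {
      out += '%' + code.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

const utf8Form = encodeURIComponent('Müller');      // ü becomes %C3%BC
const latin1Form = encodeLatin1Component('Müller'); // ü becomes %FC

console.log(utf8Form, latin1Form);
```

The two strings only agree when the input is pure ASCII, which is why the difference is easy to miss in testing.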
But if you give IE a mailto: URL, it’ll decode any %HH bytes as if they were ISO-8859-1 (I think). Perhaps it’s using the default charset of the platform. So you have to use escape() there. You can’t use encodeURIComponent(), otherwise it gets things like ü (U-umlaut) wrong.
Mozilla, on the other hand, assumes UTF-8. But this means you can’t escape() something into a mailto: if it contains characters in the range 0x0080 to 0x00FF. For those, escape() emits a single %HH (the code point value written directly), whereas Mozilla expects %HH%HH (the 2-byte UTF-8 sequence). There you need to use encodeURIComponent(). >sigh<
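Here is the difference side by side, as a small sketch. The address and the subject parameter are made up for illustration, and which browser wants which form is only what I observed above, not something I can point to a spec for.

```javascript
const name = 'Jürgen';

// escape() writes the code point of ü (0xFC) directly as one %HH.
const latin1Style = escape(name);           // 'J%FCrgen'

// encodeURIComponent() writes the two UTF-8 bytes of ü.
const utf8Style = encodeURIComponent(name); // 'J%C3%BCrgen'

// Hypothetical mailto: URLs, one per assumed browser behaviour.
const mailtoForIE = 'mailto:someone@example.com?subject=' + latin1Style;
const mailtoForMozilla = 'mailto:someone@example.com?subject=' + utf8Style;

// A strict UTF-8 decoder rejects a lone %FC, so the two forms are not
// interchangeable.
let utf8DecodeFailed = false;
try {
  decodeURIComponent(latin1Style);
} catch (e) {
  utf8DecodeFailed = e instanceof URIError;
}

console.log(mailtoForIE);
console.log(mailtoForMozilla);
console.log(utf8DecodeFailed);
```

Note that escape() is a legacy function; it is still available in browsers and Node, but it is kept only for compatibility.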
And why does escape() switch to %uXXXX for characters above 0x00FF? What is IE up to? Is there some definitive representation of Unicode characters in URLs, or is it just a contract between client and server? Does anyone know of a document which explains all of this, in words of less than one syllable?
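For reference, this is the %uXXXX behaviour I mean, shown with the euro sign (U+20AC). It is only a sketch of what the two built-in functions produce, and it doesn’t answer why.

```javascript
const euro = '\u20AC'; // the euro sign

// escape() falls back to a non-standard %uXXXX form above 0xFF.
const escaped = escape(euro);               // '%u20AC'

// encodeURIComponent() emits the three UTF-8 bytes instead.
const utf8Encoded = encodeURIComponent(euro); // '%E2%82%AC'

// %uXXXX is not valid percent-encoding, so a standard decoder rejects it.
let rejected = false;
try {
  decodeURIComponent(escaped);
} catch (e) {
  rejected = e instanceof URIError;
}

console.log(escaped, utf8Encoded, rejected);
```

Only unescape() understands the %uXXXX form, so a server receiving it would need to know about that convention in advance.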