Note: this isn’t the full picture; some people have been very helpful in pointing out stuff I’ve missed. Please continue to send in feedback, and I’ll post an updated version in a couple of days.
I have been attempting to figure out what percentage of the net population will get the upcoming Firefox 3 (June 17th!) in their native language (“heart language”). We’re doing 48 different localizations. I’ve attempted this before, but I have been limited by the quality of data available. No-one seems to have good statistics on the language breakdown of the net population.
So, I’ve taken net population figures for each country (232 of them; that number seems high, but I guess they have a generous definition of ‘country’) from the CIA World Factbook, and divided each one up according to the languages spoken in that country. This assumes, therefore, that a country’s net population is spread across its languages in the same proportions as its population as a whole. I’m sure for some countries that’s a bad assumption, but I still think the resulting data is better than what I had before.
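For anyone who wants to see the arithmetic, here is a minimal sketch of that calculation in Python. The country names, languages and figures are all made up for illustration, and the function names are my own; the real numbers live in the spreadsheet.

```python
# Hypothetical figures for illustration only; not taken from the spreadsheet.
# country -> (internet users, {language: share of speakers in that country})
countries = {
    "Aland": (1_000_000, {"Alpha": 0.8, "Beta": 0.2}),
    "Borduria": (3_000_000, {"Beta": 0.5, "Gamma": 0.5}),
}


def net_speakers_by_language(countries):
    """Split each country's net population by its language shares, summed per language."""
    totals = {}
    for users, shares in countries.values():
        for language, share in shares.items():
            totals[language] = totals.get(language, 0.0) + users * share
    return totals


def coverage(countries, localized):
    """Fraction of the total net population whose language is in `localized`."""
    totals = net_speakers_by_language(countries)
    world = sum(totals.values())
    covered = sum(n for language, n in totals.items() if language in localized)
    return covered / world


# A browser localized into Alpha and Beta, but not Gamma.
print(f"{coverage(countries, {'Alpha', 'Beta'}):.1%}")  # 62.5%
```

The weak point is the `shares` dictionary: it holds general-population proportions, which is exactly the assumption described above.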
Here’s a snapshot of the spreadsheet (.ods). Headline figures:
|Group|Count|% of net population|
|---|---|---|
|Firefox 2 and 3 together|48|89.2%|
|L10n projects with CVS access|63|93.7%|
|All localizations found, including unofficial|61|92.6%|
Methodological notes: Many figures estimated. Unknown speakers allocated proportionally among the languages for which there is a figure. I don’t mean to insult anyone – if I’ve overlooked your localization, I apologise. Void where prohibited. Blame Canada.
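To spell out the “allocated proportionally” step, here is a rough sketch of how it could be done for one country. The function name and the numbers are hypothetical; this is one reasonable reading of the note, not a dump of the spreadsheet formulas.

```python
def allocate_unknown(known, unknown):
    """Spread `unknown` speakers over the known languages in proportion to their counts."""
    total_known = sum(known.values())
    if total_known == 0:
        raise ValueError("no known figures to allocate against")
    # Scaling every known count by the same factor keeps their ratios intact.
    scale = (total_known + unknown) / total_known
    return {language: count * scale for language, count in known.items()}


# Hypothetical country: 900 speakers with a recorded language, 100 with none.
adjusted = allocate_unknown({"Alpha": 600, "Beta": 300}, unknown=100)
print(adjusted)
```

After the adjustment the counts add up to the full 1,000, and Alpha is still twice the size of Beta.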
OK, now the questions:
- Why does FF3 say 45 and not 48?
  - I only have one column in my spreadsheet for each of English, Portuguese and Spanish, so the regional variants we ship for those languages are counted once each, which accounts for the difference. IE has two Portugueses, but only one Spanish and one English. If we split them and said that IE didn’t support British English or South American Spanish, we’d gain a large advantage which I don’t think would really reflect the truth. But perhaps some Spanish speakers in Latin America want to argue with me there :-)
- Why is FF2+FF3 more than either FF2 or FF3?
  - FF3 has Indonesian, Sinhala, Albanian and Serbian (welcome!), which FF2 doesn’t have. FF2 has Bulgarian, Welsh and Persian, which FF3 doesn’t have (yet).
- How come “L10n with CVS access” is more than “Total localizations found”?
  - Because some teams with CVS access don’t seem to have produced a localization yet, and that more than counterbalances the unofficial localizations I was able to find on addons and by searching the web.
- Which language communities do we serve which IE does not?
  - Hello to native speakers of Belarusian, Frisian, Kurdish and Mongolian. We’re meeting your needs :-)
- Why is our percentage lower than IE’s?
  - There are several relatively big languages (as well as a dozen smaller ones) that we don’t have official localizations for yet: Hindi (2.20%), Vietnamese (1.48%), Thai (0.70%), Malay (0.66%), Bengali (0.47%), Tagalog (0.39%), Marathi (0.37%), Urdu (0.32%), … IE has at least 77 localizations, whereas we only do 48, and because we’re a free software project, we don’t work down the language popularity list from the top :-) If we ever managed to do all the localizations they have, our additional localizations for smaller languages would put us 0.75 percentage points ahead of them.
- If Microsoft were reading, which language should they do next?
  - Balochi (0.51%), an Asian language spoken in Iran, Pakistan and Afghanistan.
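The FF2+FF3 answer above is really just a set union, which a few lines of Python can show. The `shared` set here is a tiny stand-in sample, not the full list of languages common to both releases; only the FF2-only and FF3-only names come from the answer itself.

```python
# Stand-in sample of languages in both releases; the real list is much longer.
shared = {"French", "German", "Japanese"}

ff2 = shared | {"Bulgarian", "Welsh", "Persian"}
ff3 = shared | {"Indonesian", "Sinhala", "Albanian", "Serbian"}

# Languages covered by at least one of the two releases.
together = ff2 | ff3

print(len(ff2), len(ff3), len(together))  # 6 7 10

# The union is FF3 plus whatever only FF2 has.
assert len(together) == len(ff3) + len(ff2 - ff3)
```

The same arithmetic on the real columns is consistent with the headline figure: 45 for FF3 plus the three FF2-only languages gives 48.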