Location Services and Privacy

Mozilla is building a location service. This is a server to which you can send details of your (radio frequency) environment, and it will respond with its best guess at where you are. The advantage is that it’s much quicker and more power-efficient than spinning up GPS hardware (which takes a minimum of 30 seconds from cold). It can be fairly accurate, and even if it’s not, a rough location is much better than no location. For example, it lets you get a rough set of driving directions, and then reroute when your exact location is found. A rough starting position also speeds up GPS lock acquisition (this is how A-GPS works), getting you to an exact position faster. Lastly, GPS doesn’t work indoors, whereas this does. Google and other providers (such as Skyhook Wireless) already have such a service, and it’s built into e.g. the Android platform.

Such a service depends on having good data about what is being transmitted where, with what identifiers – mobile networks and WiFi, and maybe even Bluetooth. Our data set is created, at the moment, by people running MozStumbler, which records the RF environment, linked to actual positions obtained from GPS. Download, install and use the app today :-)

Now, Mozilla being an open organization, our initial impulse would be to release all the raw collected data to the public so people can build awesome things we haven’t even thought of yet. However, it turns out that this data comes with some interesting privacy challenges. And I don’t only mean the privacy of the person stumbling or the person retrieving their location; I mean the privacy of the owners of the transmitting stations (e.g. WiFi access points or mobile phones acting as hotspots).

For example, let’s say someone moves a long way and doesn’t want to be found in their new location (for example, because they are escaping some sort of violence). If they take their wireless access point with them (and many non-technically-minded people might not think that was at all dangerous) then as soon as a stumbler drove by, a public database of raw data would reveal their new location to those who should not know it. Raw data may also contain the location of mobile phone hotspots (and therefore their owners). Other scenarios can be found, for the interested, in the bug report and security forum discussion.

The only way we know of so far to solve this is to tie bits of data together, such that you can only get a location when you submit, as part of your request, the IDs of two or more transmitting sources which the database already knows are close to each other – which means that you must be at that location. This is what Google’s location service does. The disadvantage is that if you are in an area with very little RF transmission around – e.g. just one access point, or just a mobile phone signal – the service can’t help you. The team experimented with hashing schemes to try to encode this restriction into a published data set, but we were unable to come up with anything workable. This means that, basically, the data needs to be hidden behind an API which enforces the restriction – which means we can’t publish the raw data.
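To make the restriction concrete, here is a minimal, entirely hypothetical sketch of such a lookup – the transmitter IDs, locations and the adjacency table are invented for illustration, not real data:

```python
# Hypothetical illustration of the "two or more known-close transmitters"
# rule. The database maps each transmitter ID to a location and to the set
# of other transmitters it has been observed alongside.

KNOWN = {
    "ap:aa:bb": {"loc": (51.5, -0.1), "near": {"cell:1234", "ap:cc:dd"}},
    "cell:1234": {"loc": (51.5, -0.1), "near": {"ap:aa:bb", "ap:cc:dd"}},
    "ap:cc:dd": {"loc": (51.5, -0.1), "near": {"ap:aa:bb", "cell:1234"}},
    "ap:ee:ff": {"loc": (48.8, 2.3), "near": set()},  # a lone access point
}

def locate(seen_ids):
    """Return a location only if two or more of the submitted IDs are
    already known to be close to each other; otherwise refuse."""
    candidates = [i for i in seen_ids if i in KNOWN]
    for a in candidates:
        for b in candidates:
            if a != b and b in KNOWN[a]["near"]:
                return KNOWN[a]["loc"]
    return None  # a single transmitter is not enough

print(locate({"ap:aa:bb", "cell:1234"}))  # a location: the pair is known
print(locate({"ap:ee:ff"}))               # None, even though the DB knows it
```

The point is that knowing one ID on its own (say, the moved access point from the scenario above) gets you nothing; you have to already be standing next to it, seeing its neighbours, to get an answer.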

There are other groups who are producing entirely open data of this type; it would be interesting to hear their views on such privacy questions. It would be good if we could make our data available to people who were willing to respect privacy constraints and encode those restrictions in their servers, and hopefully one day we’ll be able to do that. But at the moment, we can’t see any way to make the data set completely public :-( Interesting mathematical suggestions welcome…

Ubuntu Full Disk Encryption

Dear Internet,

If you search for “Ubuntu Full Disk Encryption” the first DuckDuckGo hit, titled “Community Ubuntu Documentation”, says: “This article is incomplete, and needs to be expanded”, “Please refer to EncryptedFilesystems for further documentation”, and “WARNING! We use the cryptoloop module in this howto. This module has well-known weaknesses.” Hardly inspiring. The rest of the docs are a maze of outdated and locked-down wiki pages that I can’t fix.

What all of them fail to state is that, as of 12.10, it’s now a simple checkbox in the installer. So I hope this blog post will get some search engine juice so that fewer people spend hours working out how to do it.


GSoC 2013 Successes

We’ve wrapped up another GSoC, with 20 of 21 projects passing – our highest pass percentage ever. Not all students emailed me the URL to their wrap-up status report (you might find some more by following the links in the original announcement) but I know that we have:

Which is a pretty awesome set of achievements. Well done to all the students, and many thanks to all their mentors.

I’m also pleased to announce that Florian Quèze, who has been administering the program alongside me this year, will be in the driving seat for next year’s GSoC – which will be the 10th anniversary edition. Wish him luck! :-)

Watching European Parliament (EPTV) Video Streams on Linux

(This is a blog post I wished someone had already written so I could have found it quickly yesterday.)

The European Parliament TV site’s streaming system, used on pages such as this one, triggers the Totem plugin on my Ubuntu, which promptly crashes. :-| I’m fairly sure it uses some Windows codecs.

There is a mobile site for their main TV channels (video and audio streaming) which works a bit better – you can extract the rtsp stream from the links, and pass it to VLC. But if the session you want isn’t on TV, you’re out of luck. (Worse is if part of it is, and then it switches to something irrelevant in the middle!)
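The extraction step can be sketched in a few lines – note the HTML fragment below is a made-up stand-in, not the real EPTV markup, and the stream URL is invented:

```python
import re

# Made-up stand-in for a fragment of the mobile site's page source; the
# real EPTV markup will differ, but the idea is the same: fish out the
# rtsp:// URL and hand it to a player such as VLC.
html = '<a href="rtsp://example.europarl.europa.eu/channel1.sdp">Channel 1</a>'

match = re.search(r'rtsp://[^"\'\s>]+', html)
if match:
    stream_url = match.group(0)
    # Then launch e.g.: subprocess.run(["vlc", stream_url])
    print(stream_url)
```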

Fortunately, André Loconte has written a script which finds the right URL in such pages and passes it to mplayer with the right options. Thanks to him. Note that you may have to try various values for the “language” constant defined near the top until you find the one which is right for you. (It seems they do not allocate languages to channels consistently.)

Are We Meeting Yet?

Tracking Mozilla meetings across time is a thorny problem. People send out meeting reminder emails, but often they forget to update a URL or a date or a time, or they update it wrong due to a DST change, or they update the local time but not the UTC time, or vice versa. Wouldn’t it be awesome if there was one URL which pointed you at the next occurrence of a weekly meeting?

Enter Dirkjan Ochtman’s “Are We Meeting Yet?”.

Construct a URL like this:


Pretty obviously, the first bit is the timezone (not sure why it’s half-a-timezone), the second is the date of the first instance, the third is the time of day, the fourth says “repeat weekly”, and the fifth is the title. Visit the URL and see what sort of display you get. Note that it gives you the time of the upcoming weekly meeting even though the time directly encoded in the URL is from two weeks ago. That’s the “weekly repeat” part.
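Assembling such a URL from its five parts can be sketched as follows – the city, date, time and meeting title are made-up example values, and the path layout simply follows the five-part description above:

```python
from urllib.parse import quote

# The five parts described above, in order. The city, date, time and
# title here are invented example values.
city = "Los Angeles"   # the "half-a-timezone" bit
date = "2013-09-30"    # date of the first instance
time = "11:00"         # time of day, in that timezone
repeat = "w"           # repeat weekly
title = "Project Meeting"

url = "http://arewemeetingyet.com/{}/{}/{}/{}/{}".format(
    quote(city), date, time, repeat, quote(title))
print(url)
```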

The upshot of this is that you can construct one URL for your meeting (defined in the timezone in which the meeting is fixed, which is Pacific Time for most but not all Mozilla meetings) and put it in your reminder email, and you’ll not have to change it unless the meeting time actually changes :-)

People who click the link can see the time in the timezone of definition, the time in UTC, and the time in their local timezone, all in one link. No more arguments about which form of the time is more important!

It would be nice to have a form to construct such URLs with a timezone picker, a datepicker and a timepicker. Anyone feel like coding one up? (HTML5 input controls would make such a thing easy to write, but sadly we haven’t implemented them yet :-( )

TEMPORAl Distortion

The UK’s General Communications Headquarters (GCHQ) has a system called TEMPORA. TEMPORA is the signals intelligence community’s first “full-take” Internet buffer that doesn’t care about content type and pays only marginal attention to the Human Rights Act. It snarfs everything, in a rolling buffer to allow retroactive investigation without missing a single bit. Right now the buffer can hold three days of traffic, but that’s being improved. Three days may not sound like much, but remember that that’s not metadata. “Full-take” means it doesn’t miss anything, and ingests the entirety of each circuit’s capacity. If you send a single ICMP packet and it routes through the UK, we get it. If you download something and the CDN (Content Delivery Network) happens to serve from the UK, we get it. If your sick daughter’s medical records get processed at a London call center … well, you get the idea. … As a general rule, so long as you have any choice at all, you should never route through or peer with the UK under any circumstances. Their fibers are radioactive, and even the Queen’s selfies to the pool boy get logged.


Cheaper Data Centre Power

Evening out demand to match supply is a big problem in the energy industry. One way is differential pricing – charge more at peak times. Another way is energy storage, so you can supply more at peak times. One method of storage is to store the energy using compressed air. The disadvantage is that compressing the air gives off heat, and re-expanding it requires heat. Not supplying heat during re-expansion makes the storage much less efficient.

So which industry has large and fairly constant power costs it would like to reduce by buying some of that power at off-peak prices, equipment which can be placed anywhere, including where land prices are cheap, plus a lot of waste heat it doesn’t know what to do with?

Someone should come up with a gas compression power storage system for data centres.

Cost of power, ballpark: $0.06 per kWh peak (say 8am – 6pm), $0.03 off-peak
Power consumption of server: 1 kW
Current cost of power per server per day: $1.02
Potential saving per server per year if all power used were priced at the off-peak price: $110

One issue is that a 5l bottle can store 500 kJ of energy (0.14 kWh), so you’d need 500l per server. That’s a lot, so you may not be able to get the full saving unless you use an underground cavern rather than surface storage tanks. But if land is cheap, you could make a dent in your power bill.
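The ballpark arithmetic above, written out for checking:

```python
# Ballpark figures from above: 10 peak hours (8am - 6pm) at $0.06/kWh,
# 14 off-peak hours at $0.03/kWh, for a 1 kW server.
peak_hours, offpeak_hours = 10, 14
peak_price, offpeak_price = 0.06, 0.03

daily_cost = peak_hours * peak_price + offpeak_hours * offpeak_price
all_offpeak_cost = 24 * offpeak_price
annual_saving = (daily_cost - all_offpeak_cost) * 365

print(round(daily_cost, 2))   # 1.02 - matches the figure above
print(round(annual_saving))   # ~110 per year

# Storage check: a 5l bottle holds 500 kJ, i.e. ~0.14 kWh, so 500l holds
# ~14 kWh - enough to cover the 10 kWh of peak-time consumption
# (1 kW x 10 h), with some headroom for losses.
kwh_per_litre = 500 / 3600 / 5
print(round(500 * kwh_per_litre, 1))
```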

Believe it or not, this post was half-written before I saw this Slashdot article.

Bugzilla API 1.3 Released

I am proud to announce the release of version 1.3 of the Bugzilla REST API. This maintenance release has a bug fix or two, and fully supports Bugzilla 4.2, which has just been deployed on bugzilla.mozilla.org. For smooth interaction with BMO, you should be using this version.

The installation of BzAPI 1.1 on api-dev.bugzilla.mozilla.org will go away in 4 weeks, on 4th April. There is a limit to the number of old versions we can support on the server, particularly as the older ones can put a larger load on Bugzilla and may not work correctly. Please use either the /1.3 or the /latest endpoints. Now that BzAPI has been stable for some time, tools which earlier rejected using the /latest endpoint may want to reconsider.
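Switching a tool over is mostly a matter of changing the base URL it prefixes to API calls – a minimal sketch, assuming the usual BzAPI /bug/&lt;id&gt; resource (the bug ID here is an arbitrary example):

```python
# Point clients at the versioned /1.3 endpoint (or /latest) rather than
# the retiring /1.1. The bug ID below is just an example.
BASE = "https://api-dev.bugzilla.mozilla.org/1.3"

def bug_url(bug_id):
    """Build the BzAPI URL for fetching a single bug."""
    return "{}/bug/{}".format(BASE, bug_id)

print(bug_url(35))
```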

File bugs | Feedback and discussion

No One Considered…

Micire’s talk was an excellent example of what can happen when a device maker doesn’t lock down its device. It seems likely that no one at Google or Samsung considered the possibility of the Nexus S being used to control space robots when they built that phone. But because they didn’t lock it down, someone else did consider it—and then went out and actually made it happen.

LWN (an awesome publication; do subscribe)

Bugzilla API 1.2 Released

I am proud to announce the release of version 1.2 of the Bugzilla REST API. This maintenance release has a bug fix or two, and some features useful to the admins of Bugzillas which BzAPI is pointed at.

The installation of BzAPI 1.0 on api-dev.bugzilla.mozilla.org will go away in 4 weeks, on 19th December. There is a limit to the number of old versions we can support on the server, particularly as the older ones can put a larger load on Bugzilla. Please use either the /1.2 or the /latest endpoints. Now that BzAPI has been stable for some time, tools which earlier rejected using the /latest endpoint may want to reconsider.

File bugs | Feedback and discussion

Google Calendar, and Meetings in UTC: The ‘Reykjavik Trick’

Google Calendar is great; I’m a big fan. A little while back, it acquired timezone support for events. More recently, it acquired split timezone support (start and end in different timezones), which is awesome for flights. And there’s a drop-down list of all the countries in the world with all of their applicable timezones. Surely that must be comprehensive, right?

Well, yes and no. I attend one meeting which is scheduled in UTC. There seems to be no entry in the massive timezone list for this. If you say you are in London (GMT+00:00), then your event will obey the UK DST rules, which means it won’t actually be in UTC during the summer.

However, there is a workaround. There is one country in the world which uses UTC and no DST – Iceland. So, if you want to have a meeting whose time is set year-round in UTC, then tell Google Calendar you are holding it in Reykjavik.
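You can check this against the tz database itself – Atlantic/Reykjavik stays at UTC+0 even in mid-summer, while Europe/London does not:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A mid-summer instant: London is on BST (UTC+1), Reykjavik stays at UTC+0.
summer = datetime(2023, 7, 1, 12, 0)

reykjavik = summer.replace(tzinfo=ZoneInfo("Atlantic/Reykjavik"))
london = summer.replace(tzinfo=ZoneInfo("Europe/London"))

print(reykjavik.utcoffset())  # 0:00:00 - UTC year-round
print(london.utcoffset())     # 1:00:00 - BST in summer, so not UTC
```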

(It would be nice if Google would add an explicit “UTC” option to their massive timezone list, but this will do for now.)