The BREACH attack on HTTPS was recently disclosed at Black Hat. It depends on an attacker being able to notice a small difference in compressed response size between two very similar requests.
Various mitigation methods have therefore been proposed which attempt to disguise the length of a response. Some have said that this is mathematically infeasible: there is no way to disguise it enough without degrading compression performance to the point where you might as well switch compression off. I am not qualified to assess that claim, but I did have an idea for adding arbitrary length to the response in a transparent way.
My suggestion to vary the length is to add some non-determinism to gzip on the server side. This has the significant advantage of being transparent both to the application being served and to the client – it requires no changes to either. This makes it much easier to deploy than mechanisms which pad the content itself, or add data which needs to be explicitly ignored on the client. A quick review of the algorithm suggests that there are several places where you could introduce non-determinism.
The gzip (DEFLATE) algorithm either reproduces a string of bytes literally or, if that string has occurred before, replaces it with a length-distance pair pointing back at the earlier occurrence. I suggest that the way to add the most variability to the output length is to have gzip either not shorten a string at all, or shorten it by less than the maximum possible amount, on some small percentage of the occasions when it would otherwise compress fully. The question is whether one can add enough length non-determinism to make the leaky length difference hard to detect without many repetitions and statistical analysis (which would drastically slow down the attack), without destroying the compression performance.
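To make the decision point concrete, here is a minimal Python sketch using a toy LZ77-style loop rather than real DEFLATE; the should_weaken hook, the token format, and the parameters are inventions for illustration, not how zlib is actually structured:

```python
import random

def lz77_compress(data: bytes, should_weaken, window: int = 1024, min_match: int = 3):
    """Toy LZ77-style compressor with a non-determinism hook.

    should_weaken(n) is consulted for the nth back-reference (1-based); if it
    returns True, the match is emitted one byte short, so that byte falls out
    as a literal and the token stream (hence the output length) varies.
    """
    out, i, n_refs = [], 0, 0
    end = len(data)
    while i < end:
        best_len, best_dist = 0, 0
        # Naive longest-match search over the sliding window.
        for j in range(max(0, i - window), i):
            k = 0
            while i + k < end and j + k < i and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= min_match:
            n_refs += 1
            if best_len > min_match and should_weaken(n_refs):
                best_len -= 1  # deliberately sub-optimal: one byte shorter
            out.append(('ref', best_dist, best_len))
            i += best_len
        else:
            out.append(('lit', data[i]))
            i += 1
    return out

# Fixed 5% chance of weakening any given match; the token count (and so the
# encoded length) varies slightly from run to run on identical input.
tokens = lz77_compress(b"the quick brown fox jumps " * 40,
                       should_weaken=lambda n: random.random() < 0.05)
print(len(tokens))
```

A real deployment would need the equivalent hook inside zlib's deflate loop rather than a toy re-implementation, but the shape of the change is the same: intercept the match-emission decision and occasionally make it worse on purpose.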
This scheme can be used to add arbitrary amounts of additional length (up to the original size of the file) in arbitrary places and patterns, dependent on static or variable factors.
As determinism may be an expected feature of gzip in some contexts, it would probably need to be made an option (which mod_gzip and similar libraries would invoke).
As an opening (and probably terrible) proposal for a static random length increase, I suggest that for each replaced sequence in the file, gzip decides either to replace it fully with a length-distance pair, or to do so for all the characters bar the last one (emitting that character as a literal). It should choose the sub-optimal option with probability 1/(N+1) for the Nth replacement in the file. This front-loads the non-determinism: the likelihood of the sub-optimal choice falls off as 1/(N+1), so the deleterious effect on large files is not overly large, while small files still get a significant length variation.
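Sketching that rule against the toy compressor above (the harmonic_weaken name is made up; the 1/(N+1) schedule is the one proposed here):

```python
import random

def harmonic_weaken(n: int) -> bool:
    """Weaken the Nth back-reference with probability 1/(N+1): 1/2 for the
    first, 1/3 for the second, and so on, so most of the length variation
    is front-loaded and the cost on large files stays small."""
    return random.random() < 1.0 / (n + 1)

# Plugged into the lz77_compress sketch above:
tokens = lz77_compress(b"the quick brown fox jumps " * 40,
                       should_weaken=harmonic_weaken)
```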
This method might also work as a mitigation for CRIME.
Any sort of additive randomness like that will not work. The attacker just needs to repeat each request enough times to average out the noise. The distribution will still be centered around a value that leaks information. You need to not compress secrets, which is somewhat difficult for applications that use server-side templating, because you do actually want the bulk of that output compressed.
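The averaging argument is straightforward to demonstrate; the padding distribution and repeat count below are arbitrary illustrative choices:

```python
import random
import statistics

def observed_length(true_length: int, rng: random.Random) -> int:
    # One measurement: the true compressed length plus fresh random padding,
    # standing in for whatever additive length noise the server injects.
    return true_length + rng.randint(0, 32)

rng = random.Random()
REPEATS = 10_000
# Two candidate guesses whose true compressed lengths differ by one byte,
# as when one guess matches the secret and the other does not.
mean_match = statistics.mean(observed_length(1000, rng) for _ in range(REPEATS))
mean_miss = statistics.mean(observed_length(1001, rng) for _ in range(REPEATS))
print(mean_match, mean_miss)  # the means still differ by ~1 byte: the noise
                              # averages out and the signal survives
```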
Or Blackberry could just release the patents to Elliptic Curve Cryptography and we can all switch to it.
Why should the distribution necessarily be centred around the true value? If you can vary the amount of randomness in an unknown but non-random way (e.g. based on a fixed secret which changes once per day)…
I agree that to make this attack utterly impossible, you need to not compress your secrets. But that’s hard without a lot of reengineering. If we can instead make the attack infeasible, that would be OK.
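One way to read the keyed-randomness suggestion above is padding whose amount is derived from a rotating secret rather than drawn fresh per response; this is only a sketch, and the function name, inputs and pad range are all hypothetical:

```python
import hmac
import hashlib

def pad_length(daily_secret: bytes, response_body: bytes, max_pad: int = 255) -> int:
    """Number of filler bytes to add to this response today.

    The amount is a deterministic function of a secret that rotates daily,
    so repeating the same request does not produce fresh noise to average
    away, yet the attacker cannot predict how much padding any given
    response carries.
    """
    digest = hmac.new(daily_secret, response_body, hashlib.sha256).digest()
    return digest[0] % (max_pad + 1)
```

Within a single day, repeated identical requests see the same padding, so there is no fresh noise to average out; whether that is enough protection once the secret rotates is exactly the open question here.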