Google Print, And Clue Barriers

It’s long been stated that if you put your images up on the web, there’s no real way of stopping people downloading them and using them for their own purposes. That’s still basically true, although one of the interesting things about the new “Google Print” service is the unusual lengths to which it goes to prevent the average web user from doing exactly that.

Google Print allows you to search “printed books” (although Google obviously has the data in electronic form). Here’s a sample results page that you can play with as you follow along.

The first thing that prevents you from saving the JPEG of the printed page to disk is that right-click is disabled. They’ve used the standard JavaScript tricks (for Gecko, returning false from the oncontextmenu handler) to disable the context menu for the entire page. This is no problem for those taking back the web. Go to Tools | Options | Web Features | Advanced JavaScript and uncheck “Disable or replace context menus”. Score one for Firefox.
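
The trick itself is tiny. As a rough sketch (the helper name disableContextMenu is mine, and this is not Google's actual code), it amounts to something like this:

```javascript
// Hypothetical helper illustrating the standard trick; not taken from
// Google's page. `target` is anything that accepts an oncontextmenu
// handler, such as `document` in a browser.
function disableContextMenu(target) {
  target.oncontextmenu = function () {
    // Returning false cancels the event, so no context menu appears.
    return false;
  };
  return target;
}
```

In a page this would be called as disableContextMenu(document). The Firefox preference above simply stops pages from getting away with it.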

The next obstacle is that View Image on the newly-working context menu seems to show you a blank page. Actually, the printed page consists of a clear GIF <img> overlaying the actual page image, which is a CSS background on a container <div>. So if we just click View Image, all we end up with is a large transparent GIF. And because there’s a foreground image, Firefox suppresses View Background Image on the context menu, for reasons of usability and brevity.
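
To make the structure concrete, here is a small sketch that builds markup of roughly that shape. The class name theimg is the one that turns up in the page source; the function name, its parameters and the clear.gif filename are placeholders of mine, and the real page will differ in detail.

```javascript
// Hypothetical reconstruction of the overlay structure, for illustration.
// "theimg" is the class name found in the page source; "clear.gif" and
// the function and parameter names are assumptions.
function overlayMarkup(pageImageUrl, width, height) {
  return [
    // The real page image is only ever referenced as a CSS background.
    '<style>.theimg { background-image: url("' + pageImageUrl + '") }</style>',
    '<div class="theimg">',
    // The transparent GIF sits in front and is what View Image finds.
    '  <img src="clear.gif" width="' + width + '" height="' + height + '">',
    '</div>'
  ].join("\n");
}
```

Calling overlayMarkup("page.jpg", 500, 700) returns four lines of HTML in which the only <img> is the clear GIF.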

OK, so let’s use the Media tab of the Page Info dialog. This lists all the media on the page, and has a “Save As…” button which allows you to save any media to disk. Except that it doesn’t – it currently works for images inserted using <img>, <input> and <embed>, but not backgrounds or <object>.

The next idea was to copy and paste the URL out of page source. However, Google likes to serve pages without newlines, and there are a lot of similar URLs in it, so tracking down the right one by scrolling two and a quarter miles to the right seemed like a pain. I did note, however, that they are using <style> inside the <body>. Tut tut.

So instead, we can try and delete the clear GIF from in front using the DOM Inspector. We inspect the URL, locate the GIF and press Delete. Bang! The entire image disappears! How did that happen? Well, the <img> was providing a size for the <div> – so when the <img> disappears, the <div> collapses. No problem: we manually edit the CSS style rules to give it a width and a height. This allows us to view the background image again. However, the DOM Inspector doesn’t support the content area context menu, so we still have no way of saving it!

Next idea: use the DOM Inspector to inspect the entire browser XUL. This means that the context menu will still work. It’s more difficult to do, because you can’t locate elements by clicking in the content area – it only works for the chrome. Still, we finally track down the clear GIF <img> and delete it. Boom! This time Firefox crashes (taking with it an earlier version of this blog post). :-(

OK, let’s try another approach. Let’s find the surrounding <div> in the DOM Inspector, look at its computed style, and copy the URL out of it. Except that the Computed Style view doesn’t support copying. Undeterred, and feeling close to the goal, we view the applied styles for the <div> and try and copy the URL out of the individual background style rule.

Success! This works. We can chop off the CSS gubbins, paste the result into a web browser URL bar, and finally get an image we can save.

In fact, you can also get the URL of the page graphic by viewing the source. It turns out that it’s not as hard as I made out, because currently, the <div> in question has a sensible class name:

.theimg { background-image:url("") }

so it’s easy to find.
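
For anyone who would rather not scroll, the search can be automated. This is a minimal sketch, assuming the rule keeps the simple shape shown above; the function name findBackgroundUrl is mine and the regular expression is deliberately naive.

```javascript
// Hypothetical helper: pull the background-image URL for a given class out
// of a page's source text. Assumes a simple rule of the form
// `.name { background-image:url("...") }`; anything fancier will not match.
function findBackgroundUrl(source, className) {
  // Escape characters that are special in regular expressions.
  const escaped = className.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    "\\." + escaped +
    "\\s*\\{[^}]*background-image\\s*:\\s*url\\(\\s*[\"']?([^\"')]*)[\"']?\\s*\\)"
  );
  const match = pattern.exec(source);
  // Return the captured URL, or null when no such rule is found.
  return match ? match[1] : null;
}
```

It takes a plain string so it can be run anywhere; in a browser the same function could be fed document.documentElement.innerHTML.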

So what’s the point? Well, this is an example of what I call raising a clue-barrier. At my university, they didn’t have the resources to chase after everyone playing online games, but needed to prevent them from becoming a bandwidth problem. So, they blocked the port used for a popular online game’s “server discovery” mechanism. Those without clue fired up the client, tried to find servers, didn’t find any and gave up. However, if you bothered to research server IP addresses and type them into the client manually, you could play to your heart’s content. The clue barrier filtered out a large proportion of the population, thereby preventing bandwidth problems.

The key characteristics of a clue barrier are that it’s easy to put up, and it’s not perfect, but it’s good enough to solve the problem. Google’s techniques won’t prevent anyone technical from saving their images to disk, but they will prevent 99.99% of people (at least until someone writes a specialised Firefox extension). And for Google, that’s good enough.

29 thoughts on “Google Print, And Clue Barriers”

  1. I would have thought a bookmarklet could be written to extract the url of the background image from the .theimg class and open that in a new window or the same window. No need for a full blown extension.

  2. For kicks, I tried removing the overlay GIF using the extension Nuke Anything (available from Ted’s Mozilla page), but it removed the image of the book page instead.

  3. All this would have been much easier if you were using Mozilla Suite instead of Firefox. In the Suite the Media tab of the Page Info dialog works with background images.

    (Well ok, the Save button doesn’t work but you can copy the URL.)

  4. Or you can just tap the Print Screen key and paste it into your image editor of choice (MS Paint works fine).

    Resize the image and you’ve got yourself a copy of whatever you’re looking at.

  5. The Page Info Media tab worked for me in Firefox. Like irongut said, the save doesn’t work, but the url is right there for copy pleasure.

  6. As I have webdeveloper extension installed, I opened up the CSS editor sidebar. The rule for .theimg is handily in its own sidebar tab, so select and copy of the url is easy.

    As others have already said, the easiest way on a stock Firefox is to use the page info dialog box as you can select and copy from the “detail” section of the Media tab.

  7. Oakwine, that was because when you use Nuke Anything, it removes the image, and, as Gervase said, removing the image collapses the div, so that’s why it looked like it removed the image.

  8. Having got to the image in the ‘Page Info’ window you don’t even have to copy and paste its URL: you can simply drag the image’s row in the listbox into a browser window (or on to an empty bit of the tab bar (such as the close button) to load it in a new tab).

  9. An even easier approach.
    Install the “Image Zoom” extension. Right-click over the book image, click “Zoom out”, then right-click again over the book image and choose “View background image”.

    Can’t get much easier than that.

  10. It’s interesting you should mention this, Gerv. Last week I was discussing the exact same image-obfuscation technique, but on instead of Google’s new books function. Viewing the source is a very easy method once you find where it’s located, as the position is always generally the same.

  11. Gerv, is there a bug on Save As not working? Should be easy to fix..

    For the rest, if you use DOM inspector in the useful way (via ctrl-shift-i), you can just edit the page in it while using the browser chrome complete with context menus.

  12. I had a solution similar to Jed’s. Using the Mouse Gestures extension, an up-left diagonal stroke halves the dimensions of the transparent gif. The container seems to only obey changes in height, not width. Plenty of background image area for context-clickin’.

  13. I actually implemented a very similar function on a photographer’s website, except I added a referer function. Even though the user could do exactly what you did, when they tried downloading it, it would give them nothing, unless it was being pulled from the page it was meant to be placed in. I’m surprised Google hasn’t done this too. Of course, all of this is really moot unless you have some cache control.

  14. Looks like Google has read this and removed the functionality to search within the book. Earlier it was all errors, as in it said ‘come back in 30 seconds’. Now there’s no searching.

  15. Here’s a quick Firefox hack:

    1) Find browser.jar in Firefox’s installation directory under chrome/
    2) Extract the file “content/browser/browser.js” and edit.
    3) Find these lines:

    this.showItem( "context-viewbgimage", !( this.inDirList || this.onImage || this.isTextSelected || this.onLink || this.onTextInput ) );
    this.showItem( "context-sep-viewbgimage", !( this.inDirList || this.onImage || this.isTextSelected || this.onLink || this.onTextInput ) );

    and remove “|| this.onImage” from each one.

    4) Add it back to browser.jar (back it up first!)
    5) Restart Firefox and “View Background Image” will be available on the context menu, even when on an image.

    Also make sure “Disable or replace context menus” is unchecked in your JavaScript preferences.

    I can probably make an extension that does this as well.

  16. Yeah, Adblock works very well here.

    If the image collapses, just uncheck “collapse blocked elements” in Adblock.

  17. There is an easy way around all these measures, just use packet capture applications like NetworkActiv’s file capture mode. That will capture all images and other files to the local folder specified.
    I’m not saying you should break the law and steal these books. This info is just to show the flaw in this technology.
    If you are a programmer, it is pretty easy to strip out and screen capture only what you need. I will see if I have time and post a link to some code…for educational purposes of course.

  18. In Firefox a simpler way to view the image is to load the page you are after, then in another tab type “about:cache”.

    With three clicks you can easily find the appropriate image…

    Of course, this manual approach has to be redone every time you want to view an image.

  19. “…but it will prevent 99.99% of people…”

    But it is the 99.99% of people that Google don’t really have to worry about. Some people may just want to save one or two pages for a quick reference later on, and so what if they do? It’s the 0.01% of people that would actually take the effort to copy, and reproduce a book that they have to worry about, and there is absolutely no way to stop them. So, I think this whole protection system is a waste of time, and Google should spend their time on something useful, like fixing up the problems with the rest of their site.

  20. Well, here’s the simplest solution available (as far as I know at least):

    1) Install Adblock plugin – btw very useful for nasty ads.
    2) Set it to *hide*, not *remove* ads – Tools->Adblock->Preferences…->Hide ads.
    3) Right click (you have to disallow disabling of popup menus, as in blog entry) on the image, Adblock image and feel free to save any pages you want. You’ll be able to View Background Image and then save it.

  21. It’s the 0.01% of people that would actually take the effort to copy, and reproduce a book that they have to worry about, and there is absolutely no way to stop them.

    Copyright law has worked reasonably well up to now for this. :-)

    And no-one is going to distribute a book electronically as a large set of images, anyway. It’s not greppable, it’s not resizable, it’s useless. Better to buy the real thing.

  22. Excellent discussion. Google has obviously put a fair amount of effort into this copy protection scheme. It’s the best I’ve seen yet, a bit harder to break than say Amazon’s search inside the book or photo sites. But still breakable.

    If there’s an “arms race” with Google, I’m curious to know what they will do next. Some possibilities: referrer checking, dynamic code obfuscation, splicing of images, creative uses of alpha channels to layer partial images, etc. That might take them up to 99.999% prevention until somebody writes a specialized XPI that looks at the IO coming from Google and not the page content.
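
Referrer checking comes up more than once in the comments above. As a rough sketch of the idea only (the function name, the allowedHosts parameter and the example hostnames are all assumptions, and this is nobody's actual server code), the check might look like this:

```javascript
// Hypothetical server-side check: serve the image only when the request's
// Referer header points at a page on an allowed host. All names here are
// placeholders for illustration.
function refererAllowed(refererHeader, allowedHosts) {
  if (!refererHeader) {
    // No Referer at all: a typed-in URL, or the header was stripped.
    return false;
  }
  let host;
  try {
    host = new URL(refererHeader).hostname;
  } catch (e) {
    // Malformed header value.
    return false;
  }
  return allowedHosts.includes(host);
}
```

Since the Referer header is supplied by the client and is easy to forge, this is one more clue barrier rather than real protection.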