Do It Yourself Book Scanning

Saturday, October 10th, 2009 by Harry Lewis

I am just back from D is for Digitize, James Grimmelmann’s conference on the Google Books Settlement at the New York Law School, where he teaches. It was an excellent meeting, about which I will report more in a followup post. But there was one clear star in the day-and-a-half of panels, each with four or five speakers. The prize goes to Daniel Reetz of Do It Yourself Book Scanner. Reetz, a book freak and mechanical genius, figured out how to make a book scanner out of stuff you can find in dumpsters or buy cheaply, including cheap off-the-shelf digital cameras. He has put the instructions online so we can all build our own. It’s amazing to see: Reetz demonstrated it, and then in just a few minutes folded it up and put it into a bag small enough to take on an airplane. It works fast, and it’s really well designed. The slowest part is turning the pages, which you do by hand.

This is the equivalent for books of the tape recorder for music and the VCR for movies. We can all digitize our own books and throw them away now.

Or make them publicly available. I said “can,” not “may” or “should.” But the existence of the device has the potential to raise lots of the same kinds of questions those other duplicating technologies raised. It empowers individuals, and enough empowered individuals could produce a Wikipedian digital library, collectively assembled, imperfect and incomplete, but growing and expanding.

While everyone else at the conference was ruminating about whether Google had a library monopoly or whether Amazon or Microsoft might imaginably be able to compete, along comes this dude with his Rube Goldberg contraption and says, hey, let’s all just start doing it, and we’ll catch up eventually.

Astonishing idea. At the conference, it took about thirty seconds for an author to ask Daniel why he should ever write another book, since the first person who bought it could instantly make it available to the entire world. Of course, Daniel replied with something like what he has on the Web:

I love books. There is some truly fantastic knowledge and information hidden out there in hard to find, rare, and not commercially viable books. I find that I want my books with me everywhere. But that’s where the problems begin. Buying, moving, storing, and preserving books means environmental costs… and when I loan a book to a friend, I no longer have access to it.

Digital books change the landscape. After suffering through scanning many of my old, rare, and government issue books, I decided to create a book scanner that anybody could make, for around $300. And that’s what this instructable is all about. A greener future with more books rather than fewer books. More access to information, rather than less access to information. And maybe, years from now, a reformed publishing/distribution model (but I’m not holding my breath…).

Check it out. And if Daniel comes to a show near you, go see him. He’s cool in the way many a game-changing technologist has been cool.

Added October 11: I’ve received two interesting pointers since posting the above. First, an account of how Google’s scanner works; and second, a pointer to Snapster, commercial software for using your digital camera as a document scanner. The point is that there are a lot of things that are possible in this space, in mechanical design, image analysis, and coding, and it’s going to be interesting to see if Reetz can build an open community around his scanner, contributing both engineering and content.

12 Responses to “Do It Yourself Book Scanning”

  1. Rebecca Says:

    “…when I loan a book to a friend, I no longer have access to it.” That’s how it’s supposed to work.

    I am a bookworm and have been waiting for the publishing industry to get e-books right. To me, that means one universal format with some kind of Utopian digital rights management that allows you to MOVE the file to any device you choose, yet does not let you COPY it, so that it will be protected from piracy. If we do not protect music, movies and literature from piracy, then we will be cheating the artists who create them. That will end with much less quality entertainment for all of us.

    Now, if these homemade copies truly are the equivalent of, say, homemade music recordings of the past, then there will still be people more than willing to pay for the official high quality versions of these works. Often, just like in music, people just want to “sample” a work and will still buy it if they like it. That’s fine, nothing wrong with that, because they would have done the same thing, just “sampling” it in a different manner. I’m just not sure that making a Wikipedia-sized library of poorer quality books is where we should be focusing our energies right now when the whole e-book thing still has not been resolved to the consumers’ satisfaction.

    I want all my books with me, too. That’s why I finally gave in and recently ordered a Kindle. I’m going to start buying all of my books that are available in that format, and get a DocuPen scanner for about $300 – which will easily fit in my bag – and scan all the books I already own, as well as those I buy new which are not available on the Kindle.

    BUT I will KEEP the original book. I paid for one copy, not two. If we all copied our books and gave the originals away, then authors would only get half the royalties they would have received if the person we gave the book to had bought their own. And, yes, I realize that many – if not most – of the people we’d be giving our books to would perhaps not have bought their own copy. However, if any of them at all would have, that constitutes lost royalties to the author.

    And that’s wrong.

    So multiply that by how many? Copy a book and make it public? That’s piracy, plain and simple. And not only is it against the law – copyright violation, anyone? – it’s morally reprehensible.

    I love to read, and I deeply appreciate the talent of my favorite authors. They deserve to benefit monetarily from the great pleasure they provide people with their words. I would not steal that from them for the world.

  2. Harry Lewis Says:


    That all makes perfect sense, but I am puzzled about what’s legal and illegal, as opposed to right and wrong. Is it a violation of US copyright law for me to digitize a book I own? And if it is legal, does it become illegal the moment I give my book to someone else, which as you say has the effect of halving the number of copies sold, since we can both read it simultaneously?

    In practice, the answer might depend on intent, timing, numbers of books scanned, and so on. But we shouldn’t count on common sense getting anyone off the hook on matters of copyright–otherwise we wouldn’t have damages of hundreds of thousands or even millions of dollars for kids downloading a few dollars’ worth of music.

  3. Eric Hellman Says:

    I think it’s interesting that one of the DIY scanners talks about dumpster diving for books to digitize. In the non-digital world, the value of physical books does go rapidly to near zero; certainly you’ve been to a used-book sale and paid a quarter for a great book. What is our moral obligation to authors when we pay a quarter for a print book? And is our moral obligation any different if we digitize a book out of a dumpster?

    I think we should abandon the use of the word “piracy” when talking about book copyright. After all, real pirates were more about extortion than they were about theft. When you think about who’s aiming big guns at civilian shipping and demanding treasure in the book business, it’s not Dan Reetz! Maybe we can talk about book counterfeiting instead?

    My report from Reetz’s presentation is at

  4. Mister Blue Says:

    I just got a ScanSnap S5100, which scans both sides of the pages and has a fast 50-sheet duplex scanner.

    The only disadvantage is that you have to rip out the pages and therefore you lose your book once you are done scanning.

    On the other hand, you get a nice PDF at the end, which works great with my tablet PC.

  5. Benjamin Geer Says:

    Nearly all the books I read are academic texts, which, as is well-known, don’t make any money for the authors anyway. Moreover, they contain the results of research which, almost invariably, was funded by taxpayers’ money. So we’ve already paid for those books. And the authors are salaried academics, so they don’t desperately need whatever measly royalties they might be able to get from the sales of those books. As I see it, academic texts should therefore be public property.

    Should I ever manage to get a book published, the most important thing to me, as an academic, would be to have the book reach as many readers as possible. Including people in poor countries, where an academic book from the West costs about as much as a month’s salary for a person earning the average wage. It would therefore be in my interest, as well as in readers’ interest, to make the book freely available in electronic form.

  6. Benjamin Geer Says:

    Oh, and here’s another homemade book scanner, made out of Legos, that turns pages automatically.

  8. Orin Says:

    Benjamin, a lot more goes into an academic book than just the author’s ego. As much as the author of an academic work is rewarded by seeing their name in print, you can’t really say the same about the people who edit the text or sit there and fact check the damn thing. At the moment, when you buy a book, you make the assumption that someone has probably bothered to follow up on those pesky footnotes to see that the cited source (a) exists and (b) says what the author claims. As books become less remunerative (which seems inevitable) they *will* become less reliable.

  11. craig c Says:

    I believe Google have been using book scanners which read the distance of the pages, including the bend, so that when scanned the pages appear as flat images with few or none of the black depth marks that often come with book scanning. We usually carry out scanning both ways. But the fastest way is always to slice the book and feed-scan the pages, if you are able to.

