How to Digitalize Your Textbooks

by Adam Vella on February 16, 2010

(originally posted on WebCrush.com — legal disclaimer: generally speaking, in the US, the copyright holder possesses the sole right to reproduce his or her copyrighted material.  By reproducing digital portions of your textbook, you may be committing a violation of The Copyright Act.)

Why Digitize Your Textbook?

Several people have asked me how I have my school textbooks in PDF format, so I thought it would be an appropriate topic to post up here. Usually the first question I get is WHY would you want your book in PDF format? I can think of several reasons why I prefer it:

  1. View or read material directly from your computer or mobile device
  2. It is very easy to search for a specific term or phrase
  3. You can bring to class just a printed chapter or subsection rather than an entire book
  4. If you ruin a page with writing, highlighting, etc–you can just print it out again

If any of the above interests you, then creating PDFs from your textbook may be a good idea.

How Do I Do It?

Since virtually no legal publisher provides their books in digital format, we’re on our own to convert them. There are several ways to do this, some quicker, some cheaper–but I’ll cover them all.

In my opinion, the best way to convert your book to PDF is to simply let someone else do it. C’mon, it’s the American way! Take your book(s) down to a printing shop and they can do all the heavy lifting. Not all places will do this; some Kinkos will and some won’t, so call around. Although it may be a copyright violation, Mom & Pop printing shops will usually do it. Note that pricing can vary substantially, so it probably makes sense to shop around. The place I use wanted $0.25 per scan but I talked them down to $0.13. Obviously not the cheapest route, but if you have the coin for it, it’s worth it. I’ll also describe each of the steps in case you’re doing this yourself.
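
To get a feel for what those per-scan prices mean for a whole book, here is a rough Python sketch. It assumes the shop bills one scan per printed page and charges the binding cut as a flat $5; both are my assumptions, and your shop may count a duplex sheet differently. The function name `shop_cost` is made up for this example.

```python
def shop_cost(printed_pages, price_per_scan, binding_fee=5.00):
    """Estimate a print shop's charge for scanning a whole book.

    Assumptions (not from any shop's price list): every printed page is
    billed as one scan, and cutting the binding is a flat fee.
    """
    return round(printed_pages * price_per_scan + binding_fee, 2)


if __name__ == "__main__":
    # Compare the quoted price with the negotiated one for a 1000-page book.
    for price in (0.25, 0.13):
        print(f"1000 pages at ${price:.2f}/scan: ${shop_cost(1000, price):.2f}")
```

Under those assumptions a 1000-page book comes to $255 at the quoted price and $135 at the negotiated one, which is why it pays to haggle.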

  1. Chop Off the Binding
    Using special machinery, the shop will cut the binding off the book and make a clean, even cut along the binding edge of the pages to remove any leftover glue. If you are doing this yourself, I suggest you still have a pro perform this step–it only costs $5 and will save you a million headaches if the glue gums up a scanner.

  2. Scan in the Pages
    A printer will then take the pages and drop them into an auto-loading feeder for their scanning machines. It is important that they be able to provide duplex scanning, which is scanning of both sides of the page. Most do, but CYA and ask first. The auto-feeder is the same kind found on office copy machines, so they can speed through the stack of pages and create digital images.

    I have done this step on my own using my office’s multifunction copier, which serves as a scanner as well. The feeders on these machines can be finicky at times, so try to limit the number of pages you load at a time to 100. This will help prevent jams. Also make sure the pages are free of any glue debris and don’t have any folded corners, which can cause problems. This is obviously a much more cost-effective solution, assuming you have access to such a machine and a few hours to burn.

    If you don’t have access to quality office equipment, duplex scanners with autofeeders are also available to the public for reasonable prices from companies such as Kodak, HP, Brother, etc. They’ll run about $300-$400 but will pay for themselves in a year. Note that since these are not commercial machines, the autofeeder will handle a smaller batch and will also scan much slower, increasing the overall time spent in the process.

    The real cheapo option is to use a flatbed scanner and do it all by hand. If you have a week with nothing to do, sure.

    I suggest scanning in the table of contents, indexes, and other supplementary sections. They’re very helpful.
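
Two bits of arithmetic from this step can be sketched in Python: how to split a loose stack into feeder loads of at most 100, and how many shop scans a $300-$400 scanner has to replace before it has paid for itself. This is only a back-of-the-envelope sketch. The function names are made up for the example, and I am assuming the 100-page limit refers to physical sheets and that the shop price is the $0.13 per scan mentioned earlier.

```python
import math


def feeder_batches(sheets, batch_size=100):
    """Split a stack of loose sheets into feeder loads of at most batch_size."""
    return [min(batch_size, sheets - start) for start in range(0, sheets, batch_size)]


def breakeven_scans(scanner_price, shop_price_per_scan):
    """Smallest number of shop scans that would cost at least the scanner's price."""
    return math.ceil(scanner_price / shop_price_per_scan)


if __name__ == "__main__":
    # A 1000-page book printed on both sides is 500 sheets (assumption).
    print(feeder_batches(500))         # five loads of 100 sheets
    print(breakeven_scans(300, 0.13))  # 2308
    print(breakeven_scans(400, 0.13))  # 3077
```

In other words, at $0.13 per scan the scanner breaks even somewhere around two or three thick textbooks, depending on how your shop counts scans.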

  3. Convert to PDF
    Once the pages are scanned in, converting to PDF is the easy part. Commercial scanners, like the ones found at printing presses or corporations, often have the capability built in–the output from the scan IS in PDF format. Even consumer devices include this functionality and the necessary software. Printers will provide you a single PDF file on a CD, which is about 200 megabytes for a 1000 page book.

    When you are doing this on your own, it may be best to give each chapter its own file, as processing the files can eat up a lot of system memory. Even for the single massive files a printer can provide, I still break them down into smaller units to be more manageable.
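
If you want to plan the per-chapter split before touching any PDF tool, a small Python sketch like the one below can turn a list of chapter starting pages into page ranges. The function name and the sample page numbers are hypothetical, and the page numbers are positions in the scanned file, not the numbers printed on the pages.

```python
def chapter_ranges(chapter_starts, total_pages):
    """Turn chapter starting pages into inclusive (first, last) page ranges.

    chapter_starts: 1-based positions in the scanned file where each chapter
    begins. Pages before the first start (front matter) are left out unless
    you include page 1 in the list.
    """
    starts = sorted(chapter_starts)
    # Each chapter ends one page before the next begins; the last runs to the end.
    ends = [next_start - 1 for next_start in starts[1:]] + [total_pages]
    return list(zip(starts, ends))


if __name__ == "__main__":
    # Made-up example: three chapters in a 120-page scan.
    for first, last in chapter_ranges([1, 40, 95], 120):
        print(f"pages {first}-{last}")
```

The resulting ranges can then be handed to whatever program you use to extract pages into separate files.
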
  4. Optimize
    The final step is to optimize the PDF produced by the printing press or scanner. This step requires a full version of Adobe Acrobat or a similar tool that can read and write PDF format. The first part is to run OCR (optical character recognition), which converts the images of the letters into actual text that can be highlighted, copied, or searched. To execute this in Adobe Acrobat Professional 7.0, from the menu select ‘Document–>Recognize Text Using OCR–>Start’. Select all the pages on the next screen and let the process finish. It is not quick and will take about 1-2 seconds per page. If you are using Adobe Acrobat 9.0, there is a new OCR setting called ‘ClearScan’ which produces much smaller, more compact files. After optimizing, I have 600 page textbooks that are only 8 MB in file size.


Once that is completed, you will then want to run a PDF optimization on the file, which will decrease the file size where possible. This is also available in Adobe Acrobat by selecting ‘Advanced–>PDF Optimizer’ from the menu.


This will open a sub-window of options, but I just leave the defaults and let it do its magic. Once done, you have a PDF version of your textbook that is searchable, printable, and highlightable.
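
To set expectations before you kick off OCR, here is a rough Python sketch using the figures above: 1-2 seconds of OCR per page, and about 200 megabytes per 1000 scanned pages before optimizing. The function names are made up for this example, and the numbers are only my own observations, so treat the results as ballpark estimates.

```python
def ocr_minutes(pages, seconds_per_page=(1, 2)):
    """Return a (low, high) estimate of OCR time in minutes."""
    low, high = seconds_per_page
    return pages * low / 60, pages * high / 60


def scanned_size_mb(pages, mb_per_1000_pages=200):
    """Estimate the raw scan size in MB, assuming size scales with page count."""
    return pages * mb_per_1000_pages / 1000


if __name__ == "__main__":
    pages = 600
    low, high = ocr_minutes(pages)
    raw = scanned_size_mb(pages)
    print(f"OCR for {pages} pages: roughly {low:.0f}-{high:.0f} minutes")
    print(f"Raw scan: about {raw:.0f} MB, versus 8 MB after optimizing")
```

For a 600 page book that works out to roughly 10-20 minutes of OCR and a raw scan of about 120 MB, so getting down to 8 MB is around a fifteenfold reduction.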

Resale Value?

Now that you have your book fully digital, what is left over of your book may make you want to cry considering what you spent on it. Fear not, people WILL still buy them on eBay as long as you indicate that the book is unbound. You never know, maybe someone else is looking to digitize the book as well.

Whether you choose to print out the entire book or keep it electronic only is your decision. I prefer to print them on 3-ring-punched paper and take a chapter or two to class with me. I also typically have a page or two on me for quick reading whenever I suspect I’ll be stuck in traffic, waiting at a restaurant, bored at work, etc. It’s much simpler than carrying around all 1000+ pages. I don’t need to worry about folding up the pages or crumpling them into a bag–I can just print another copy when I want.

Last but not least–I’d advise against trying to make a quick buck by selling copies of the digital books to other students. Let’s be realistic here–do you really want to get on the bad side of the legal publishing industry by committing copyright infringement? I wouldn’t either.

On the flip-side though, if you and a bunch of friends wish to split the cost of having a printing company perform the above steps for you, as long as you each own a physical copy of the book, I can see no reason why this would be prohibited (but then again, I’m only a 1L, I haven’t taken ‘copyright’ yet).

I hope this helps you out.