
April 27, 2009

Comments


Cheryl Rothwell

I did all my paper several years ago along with many but far from all the books. The pictures go slower. I wrote a series of blogs about it starting here:
http://genealogysleuth.blogspot.com/2007/12/swimming-in-paper.html. I have Evidence Explained and several other important books in pdf. I hope more become available that way. I still prefer books but the reality is I cannot store them.

Jason Presley

Something you didn't mention this time around is what you're using for the OCR job. What is it and how well is it working? I'd imagine it should do pretty well with all that printed material, but I thought it worth asking.

Dick Eastman

At this time, I am using the OCR software that came with the Neat scanner for Macintosh. It seems to work rather well.

- Dick Eastman

Martha Rapp

Could you build something like this instead of cutting the books? http://www.instructables.com/id/DIY-High-Speed-Book-Scanner-from-Trash-and-Cheap-C/

Becky Wiseman

Last year I scanned all of my genealogy documents and files - 30,000 pages in four months using a Fujitsu ScanSnap S300. It's awesome and the OCR that it comes with does a pretty good job on the pdf files. I wrote several blog posts about that project:
http://kinexxions.blogspot.com/search/label/scanning

I've taken apart several books to scan them and you're right - it is terribly emotional to destroy a perfectly good book! Throwing them away is really hard too (mine go into the recycling bin though and not the trash). It's much easier, emotionally, to take apart magazines than it is books.

John Ralls

You own the whole house, right? So why are you limited to two bookcases in a spare room? The whole house should be crammed with bookcases! Pam's a genealogist, too, so she won't object!

Now, you're a handy guy. Last week you published http://blog.eogn.com/eastmans_online_genealogy/2009/04/build-a-highspeed-book-scanner-from-trash-and-cheap-cameras.html Why are you cutting the books apart instead of building one yourself?

John Ralls

Taneya

I would like to know how you organize your digital book collection?

Robert

Many libraries have "Friends of the Library" groups that organize book sales of both library discards and donated books, to raise funds for the library. Many thrift stores also accept books.

Either would be better than throwing out the books you've been able to download copies of.

I agree with the other comment though - build a proper book scanner (with automated page turning!), set it up in the basement, and you don't end up destroying the books. Plus you get to make money selling a "how to make a book scanner" guide - ebook of course.

For your backups, also consider exchanging backup disk or downloads with relatives in another country at regular intervals. Apart from the online one (at the whim of a commercial company) all your backups seem to be in one geographic area.

Carole Riley

Yes, I am horrified as well by the taking apart and then throwing out of perfectly good books, some of which might be welcome in someone else's personal library, if not public or family history libraries.

There must be a better way than to take them apart and then throw them away, even if the paper is recycled.

Amy

Just reading about you slicing up the books for scanning makes me cringe! I couldn't do it myself. But I do understand that you get a much better image with a flat copy and aren't hindered by the binding. What about taking digital photographs of the pages?

Dae Powell

Yes! Becky Wiseman's blog has it all! A wonderful odyssey of organization and efficiency. I'm inspired to return to MY scanning project. BTW, Miriam Midkiff has a monthly scanning session, which you might look into for some company and interesting chat. I've written a wee presentation that you can find on my site under Presentations / Methodology, too.

Happy Dae·
http://ShoeStringGenealogy.com

Roger Parker

Hi Dick - You should also try http://www.archive.org/index.php for scanned books. They have many old texts that may not be on Google Books.

Cheers, Roger

Patrice Houck Schadt

I am shocked that you would throw books out that you have not torn apart. Why not donate them to a genealogical society so that they can either be used or sold to support the society? I have given copies of magazines to my local family history center and they have been given out to new genealogists. Just because you no longer value a book you paid $150 for doesn't mean someone else has no use for it.

Patricia

This is so like me, but I don't think I could bring myself to dismantle my books or magazines. I have been debating if I should discontinue some of my magazine subscriptions, and take the time to thoroughly read the ones I already have. I am such a hoarder, and it is rather a painful disease. My computer room is supposed to be a bedroom, a small one, and I also have a knitting machine, from the days when that was my passion, taking up valuable space. I kid myself that I really should knit another garment from the wool that my husband banished to the roof space and we have some moderately heated discussions as to whether it should be brought down. I always lose the argument and I can't climb up there anyway, besides I am too busy with my computer. Wonder if I could fit another bookcase in here? mmmmm?

John Carter

I use a Fujitsu ScanSnap also; it comes with a full version of Adobe Acrobat for the OCR and indexing. Recently scanned over 150 books our society publishes for sale. Nice to instantly search them all for a name or place. Ultimate goal is to reduce resale inventory on hand through print on demand, or maybe sell an electronic version.
At home, I am scanning my genealogical research plus other records such as income taxes. A broken water pipe last week ruined a lot of my paper records, but I felt better knowing much of it had been digitized.

Martin Tolley

Dick,
Cutting up books is common amongst a certain type of person (Charles Darwin reputedly did it so large volumes were easier to hold). Other more extreme types have been involved in burning them. And we prefer not to talk about that sort of person.
But your logic is flawed. John Ralls is on the right track here. Why go with this self-imposed regulation? Buy a bigger house or digitize? Option three is to buy ANOTHER house. Then you can have as many books around in whatever rooms you like. And if your partner doesn't like that . . . well you do have another house!

Dick Eastman

---> I would like to know how you organize your digital book collection?

So far, it is a very simple system. I created a new folder under the "Documents" folder called "Scanned Documents." (if I used Windows it would be under "My Documents"). Under that, I made several folders. The two of genealogy interest are: "Books" and "Magazines." I have other folders for scanned receipts, some old letters, the title documents for my autos, some medical documents, etc.

In the Books folder, I create new folders for each scanned book and the name of the folder is the same as that of the book.

In the Magazines folder, I make new sub-folders for each subscription ("Family Chronicles," "NEHGR Register," "Evertons Genealogical Helper," etc.). Then each scanned magazine is stored with a file name that describes the date of the edition, such as "2009 Spring" or "2008-May-June" or something similar.

That's very simplistic and, so far, has worked well. I may need to improve it when I get a few thousand books and magazines scanned.
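The folder layout described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme: the helper names `book_folder` and `magazine_file` are hypothetical, and the `.pdf` extension is an assumption, since the comment does not say which file format the scans are saved in.

```python
from pathlib import Path

# Root folder as described in the comment: "Scanned Documents" under "Documents".
ROOT = Path.home() / "Documents" / "Scanned Documents"


def book_folder(title, root=ROOT):
    """Return the folder for one scanned book; the folder is named after the book."""
    return root / "Books" / title


def magazine_file(subscription, issue, root=ROOT, ext=".pdf"):
    """Return the file path for one scanned issue, e.g. issue="2009 Spring".

    The ".pdf" default is an assumption, not something stated in the comment.
    """
    return root / "Magazines" / subscription / f"{issue}{ext}"


if __name__ == "__main__":
    print(book_folder("Evidence Explained"))
    print(magazine_file("Family Chronicles", "2009 Spring"))
```

Titles that contain characters such as "/" or ":" would need to be cleaned up before being used as folder names; the sketch leaves that out.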

- Dick

Linda

This is inspiring. I've had no problem with digitizing and purging loose genealogy files, home files, downloaded books and now get only digital magazines. However, I've only looked at our bookshelves and wondered whether I wanted to tackle scanning 100's of pages. Taking them apart would be a lot easier. Now if I can only convince my husband to downsize his collection of paper and books!!!

Edward Comer

Ouch! - Cutting apart books is painful to me. Instead, donate the books after first scanning them whole with the book scanner that you previously published info about on April 23, 2009. Don't you read your own blog?
See:
http://tinyurl.com/d6zafl

Dick Eastman

---> after first scanning them whole with the book scanner that you previously published info about on April 23, 2009. Don't you read your own blog?

Yes, but I lack the mechanical skills to make my own book scanner as shown in the recent article and I don't have the $125,000+ required to purchase a Kirtas book scanner that I wrote about a couple of years ago or the $10,000 required for the BookDrive Pro that I wrote about a few weeks ago (see http://blog.eogn.com/eastmans_online_genealogy/2009/03/bookdrive-pro-a-cradle-scanner.html?cid=6a00d8341c767353ef01156eb6a41e970c ).

I almost purchased a Plustek book scanner that I wrote about a few weeks ago (see http://blog.eogn.com/eastmans_online_genealogy/2009/03/soupedup-scanner-reads-books-aloud.html ) but scanning a single book of a few hundred pages is a tedious process with that scanner. Scanning all my books and magazines with the Plustek would require thousands of hours. I do expect to soon purchase a faster scanner than what I have now but it will probably have a built-in sheet feeder.

- Dick Eastman

Dave

I suppose digitizing sets the stage for the real benefit which is expediting retrieval. How do you organize the new "library" so you can retrieve the information quickly?

Chris Pomery

Dick,
The step we need is for the magazines to offer us a separate online archive subscription, or next best a combined subscription to a print version and online archives. Personally, I would happily subscribe to more magazines if they gave me cut price access to their archives. This is easy money for them and any publication which is not doing this today is well behind the curve. Even my bank has realised it can cut costs by not sending me statements by post, yet I'm still getting magazines, some quite glossy, by post. I'm shaking my head here in disbelief.

I had to downsize recently and have found new owners for hundreds of books, videos and CDs via Amazon. Everything in my home is being digitized except the books, as these will be available within 10-15 years anyway and doing them eats into too much of my childplay, research and earning time. I've saved an entire room full of space, which prior to autumn 2008 would have cost me about £80,000 to add if I'd bought a bigger house. A no brainer really!

Cheryl Rothwell

I use Adobe Acrobat Pro which can convert the pdf to word searchable documents.

I do think the better solution is to have more books available electronically. I note that many of the old histories [public domain] I had as books are now available online at various places.

Sue

Dick,
In my life, scanners have gone the way of the dodo..:-)
We started digitizing all our genealogical material years ago.
WHY cut a book apart and scan it when photographing it to make a digitized copy doesn't harm the book, which can then be donated to a library or sold online? The digitized copies of many books are then put on a CD, with backups, of course, on 2 external hard drives.
We have over 100,000 images done this way. To solve the problem of the diminishing lines of a photo with the tripod, (not to mention the damage to the spine in some cases) my husband built a handy dandy stand that is lightweight, comes apart to fit in his laptop computer bag along with his camera and will raise and lower to different paper size requirements, up to newspaper size and adjusts to the lens settings. Our camera is the kind that comes with a computer program that allows him to take the picture with the click of the key on the computer, lets you see the picture on the computer and it stores immediately in there without the hassle of camera card with all the different storage sizes and extra cost. We can take up to 400 images in several hours. The only time consuming part of the whole thing is when I need to crop or edit the image.
I'm sure others out there have or are doing the same thing.

Linda

I think the distinct advantage of tearing apart and scanning vs. photographing is that you can make a text searchable book with scanning and can't with photographing. I find searchable digital books far more valuable for the research process especially with those books that do not have detailed indexes. It would be just as valuable or even more valuable to donate a searchable digital copy of a book no longer under copyright to a library or genealogy library than the book itself.

Terri Nelson

I also want to recommend searching Internet Text Archive (http://www.archive.org/) for pre-1923 books. To date, 5,552 items have been contributed by Allen County Public Library.

Holly Hendricks

I like neat receipts, but wish the categories were a bit more configurable. Do you have any tips? I felt like I was putting square pegs into round holes when I tried to categorize historical documents.

I also use Google Books, but now start with Internet Archive to see if there is a copy I can easily annotate. Quite often there is a non-Google copy there that I can import into PDF-Tools 4 and easily create personalized tables of contents, sticky notes, and so forth. I'll be quite willing to part with hard copies as long as I have an annotated copy.

Holly Hendricks

When I finished reading all the comments, I was surprised by how many people had such negative responses to cutting up books. I admit to having 27 bookcases that constantly need to be downsized, and would be happy to cut up and recycle 80% of them if I had online access to the information.

In library school we talked about books as containers, and book value. Many books are only important for the information they contain; whether they are available on a shelf or a thumb drive or server is immaterial.

Some important kinds of value are intrinsic, associational, and informational. A rare volume that is special because of its beauty, construction, binding, printing or plates has intrinsic value. (I may have 15 in all, if that.)

Associational value comes from importance of a prior owner, the annotations of an important scholar, a special inscription, or a connection to a unique time and place (yearbook, exhibition catalog). (I probably have 30 of these, half of which I keep for my church.)

If I pull out a few reference books, and some books with primarily visual and portability value (woodworking, quilting, gardening, some local history), that leaves about 2000 in my house that are simply containers for information. I like the idea of starting to ocr them.

I don't know if it would ever be challenged, but if I scan an entire copyrighted book and then sell or give it away, I would be violating copyright. There is integrity in recycling in these cases.

Holly

Doug Little

For those who asked about organizing their genealogy documentation - I gave a talk some time ago to our local genealogy club and put the talk outline on the web at http://gendoc.wdlmal.net/basic-org It may give you some ideas.

Edel

Apart from the cringe factor of cutting up all my books (and I don't yet have your space problem, thank goodness!), I have another problem with online books - I can't read from the computer screen for long without my eyes getting blurry and weepy. I can only read about 10 minutes at a time before this happens. I end up printing lengthy articles and documents so I can read them, which creates a storage problem, as well as an ecological dilemma! I can read a book from cover to cover with little or no break. Has anyone else experienced this problem?

Dave S.

Linda said:
I think the distinct advantage of tearing apart and scanning vs. photographing is that you can make a text searchable book with scanning and can't with photographing.

There is a program called TopOCR that converts .jpg files to text.

Dave

Dick Eastman

---> I like neat receipts, but wish the categories were a bit more configurable. Do you have any tips?

Neat Receipts for Macintosh allows you to create as many categories as you wish. I haven't seen the Windows version so cannot comment on what it can or can not do.

- Dick Eastman

Michelle

I love having digital copies of books and have scanned many books myself and a lot of my genealogy papers. Still, I don't think I could bring myself to cut apart my precious books. I'm thinking of volunteering to do your digitalizing in exchange for the books, Dick. :) BTW, are you still entitled to have the digital copy if you throw away or give away the book? You can't prove that you own it anymore.

As to other comments about photographing books and stating that they're not searchable, there are free tools to switch images to PDFs. There are other tools to OCR images or pdfs made of "images". There's really no difference if you scan or photograph, it's just a matter of what your software will do with it.

Ben Franklin

I've been advocating the digital approach for a few years in a course I teach about Computerized Genealogy at Duke University because of the number of times I have spent hours looking for an obscure reference buried in one of the 10's of thousands of hardcopy pages that I have collected. After scanning and OCRing them, I installed Google Desktop, which allows me to search the entire collection.

This approach has also shaped some of my data formats and organization. I'm usually not searching for a book that has a reference to a county 50 pages away from the person's name, so I break the OCRed text into a separate file for each page.
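A minimal sketch of the one-file-per-page idea follows. It assumes the OCR step produced a single text string with a form-feed character (`\f`) between pages, which is a common convention but not something stated in the comment; the function name `split_pages` and the file-naming pattern are hypothetical.

```python
from pathlib import Path


def split_pages(ocr_text, out_dir, stem):
    """Write each page of OCRed text to its own file and return the paths.

    Assumes pages are separated by form-feed characters. Blank pages are
    skipped, but the page number in each file name still matches the page's
    position in the original, so a hit can be traced back to the scan.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for number, page in enumerate(ocr_text.split("\f"), start=1):
        if not page.strip():
            continue
        path = out_dir / f"{stem}_p{number:04d}.txt"
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written
```

With one small file per page, a desktop search hit points at a single page rather than at a whole book.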

As several people have mentioned, organization is key, of course, as is the ability to point back to the original - especially in the case of poorly-printed (old) and subsequently badly-recognized materials such as NEHGR and the like. This allows you to go back to the original when a key date was mangled in the process, for instance. This is also essential for a proper source citation, and last but not least, I usually want to know where I got the material, so I have it broken into directories based on repository.

Another advantage to this conversion is that you can cut-and-paste entire citations into your data.

ANOTHER advantage is that you can take the public domain books that Google Books and others have made available on the Internet and mine them to build context in your family history.

In addition, of course, Google Desktop also finds emails in my archive of 10's of thousands of THOSE, which is Very Handy!

... anyway, I have 15 hours of course materials on this subject. It is something that I have a passion for, and I'm rapidly closing in on 1M pages in my collection.

Eric

Dick,
I just recently started reading your blog. And my hat is off to you. I am planning on doing the same to my collection. However, since I am still relatively new to the genealogy scene, my collection is probably an embarrassment compared to yours. However, you mentioned there were a number of books that you had in your collection that were available thru Google Books. Could you (either here or in a future blog entry) list those books you found on Google Books that may be of interest to a novice (or even advanced) genealogist?? I would also be interested to know how you organized your digitized data (not just the books and magazines, because that seemed relatively straightforward).

Dick Eastman

---> Could you (either here or in a future blog entry) list those books you found on Google Books that may be of interest to a novice (or even advanced) genealogist??

Probably not. There must be tens of thousands of such books now available on Google Books: family histories, local histories, etc. I just went to http://books.google.com and searched for the word "genealogy." Google found 152,540 books. (Not all of them are available in full text.) That ignores the local histories and other books that do not have the word "genealogy" in the titles.

Making such a list would be a Herculean task and, once it was completed, it would already be out of date. A far simpler and more reliable method would be to go to http://books.google.com and click on SEARCH, then enter whatever topic(s) you are interested in.

- Dick Eastman

Roger Parker

Linda - I've photographed books and then digitised the images so that they are text searchable. It's not difficult. Obviously, the photo needs to be sharp but the OCRing can be done with software such as Adobe Acrobat or Omnipage.

The latest versions of Omnipage claim to cope very well with the curvature of the text that you can get near the binding on some books.

Roger

antje wilsch

If you want to spend the money, you can get the books copied and then scan the copies. Just a thought :)

Terry

How do you ensure that the scanned versions of texts, which (after OCRing) are editable text documents, are protected from accidental or intentional alterations?
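One common way to at least detect such alterations is to record a checksum for every file and compare it later. The sketch below is an illustration only, not something any commenter in this thread describes using; the function names are hypothetical. It detects changes but does not prevent them, so the saved manifest would need to be kept somewhere separate from the files, such as on a backup disk.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files that were added, removed, or altered since the manifest was built."""
    current = build_manifest(folder)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))
```

Running `build_manifest` once after scanning and `changed_files` at intervals would flag any file whose contents no longer match.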

Carolyn Scott

I purchased a Neat-Receipts scanner early last year, with the idea of scanning the piles and piles, boxes and boxes of loose sheets of genealogy material. However, for my first project I scanned all the recipes my grandmother had handwritten on 5x8 recipe cards as 'documents,' which I then transferred to CDs. BTW, I saved the scanned documents to my hard disk as .jpg image files, not .pdf files. (As .jpg files, the cards are not searchable.) With a complete index, a brief history of my grandmother's ancestry with a 1-page photo collage created by Google's PICASA, and a brief section about WWII rationing, I had Kinko's make up books for my children and grandchildren for Christmas. I've already decided, tho, that disposing of the original cards will be left to my descendants - I've been the keeper of her collection for over 50 years, and tossing them is not in MY future!
I did upgrade the software recently when the company issued an upgrade, and it did simplify the process a bit, it seems.
I do have a flat-bed scanner, too, which I use for photos, but, have tested the Neat scanner on some, saving them as .jpg files. They looked fine. This scanner cannot scan at a very high resolution, but if the photos are high resolution photos it does fine. And, you can choose black/white or color for any of the items you scan.

Renee Zamora

What are you using to search all of the digitized books at once - Google Desktop?

Also WorldVitalRecords.com has all of Everton's Genealogical Helper Magazines from 1947-2006 online.

Dick Eastman

---> What are you using to search all of the digitized books at once - Google Desktop?

No. I use the built-in search that is included with the Macintosh operating system. It works well.

- Dick Eastman

cartucho r4i

It's good thinking to convert a library into a digital library. You can do that by putting all your books on the computer and from there on the internet, or you can get e-books directly from the internet. That is a useful and time-saving idea.
