NOTE: This article has nothing to do with genealogy. If you are looking for genealogy-related articles, you might want to skip this one. However, the article describes some very useful methods of storing information from web sites onto your local hard drive. That can be useful for genealogists and for many other people as well.
It is easy to save individual web pages for offline reading, but what if you want to download part or all of a web site and store it on your computer’s hard drive? Several programs for Windows and Macintosh will let you do just that, and some of them are available free of charge.
Joel Lee has published an article on the MakeUseOf web site that describes four such programs. He even describes my favorite one, called SiteSucker. He writes:
“If you’re on a Mac, your best option is SiteSucker. This simple tool rips entire websites and maintains the same overall structure, and includes all relevant media files too (e.g. images, PDFs, style sheets). It has a clean and easy-to-use interface that could not be easier to use: you literally paste in the website URL and press Enter.”
I have to agree with Joel Lee’s description. I have been using SiteSucker for years to make backup copies of my own web sites and have been pleased with its operation. I even wrote about SiteSucker in this newsletter in 2014 at http://bit.ly/2yZhcwZ.
SiteSucker costs $5, a modest amount considering how useful the program has been for me.
I have also tried another program he describes, called Wget. I wasn’t too enthused about Wget, and I noticed that Joel Lee didn’t say much about it either, other than to give a basic description of the program. He didn’t offer any opinion of how useful Wget is compared to the other three programs.
One thing that Joel Lee does not mention is that these programs download static web pages but will not retrieve dynamic web pages.
NOTE: Dynamic web pages are those created at the moment a user requests information. For instance, a visitor to MyHeritage.com or Ancestry.com or FamilySearch.org might enter a query for an ancestor named John Jacob Jingleheimer Schmidt. The web site then creates a BRAND NEW web page at that moment and displays it to the user. The web page is not kept on the web site afterwards, as it is no longer needed, so there is nothing for a download program to find in advance.
SiteSucker cannot create queries, so it will not ask for new web pages. You won’t be able to download the billions of records that are available on these large genealogy databases. That’s a good thing, as you probably don’t own enough disk drives to hold all that data anyway. However, these programs will copy all pages that are not database-driven, such as the static web pages at https://www.eogn.com.
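For readers who like to see the idea in code, here is a minimal Python sketch of the static-versus-dynamic distinction described above. It is not how SiteSucker or Wget actually work internally; it only illustrates one rough rule of thumb a site-copying program might use: follow links that stay on the same web site and skip any link that carries a query string (the part after a "?"), since that usually signals a database-driven page. The function name static_links, the class name LinkCollector, and the example.com addresses are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(page_url, html):
    """Return same-site links that carry no query string.

    Assumption: a query string is treated as a hint that the page is
    dynamic. Real sites do not always follow that convention.
    """
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    kept = []
    for href in collector.links:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar links
        if parts.netloc != site:
            continue  # link points to a different web site
        if parts.query:
            continue  # looks like a database query, so leave it alone
        clean = absolute.split("#", 1)[0]  # drop any "#section" suffix
        if clean not in kept:
            kept.append(clean)
    return kept


if __name__ == "__main__":
    sample = (
        '<a href="/about.html">About</a> '
        '<a href="/search?name=Schmidt">Search</a> '
        '<a href="https://example.org/x">Elsewhere</a>'
    )
    for link in static_links("https://www.example.com/index.html", sample):
        print(link)
```

In the sample page, only the "About" link would be kept: the search link contains a query, and the third link leads to a different site. A real program would then fetch each kept page and repeat the process, which is why a search form that needs a name typed into it never gets visited.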
You can read Joel Lee’s article, How Do I Download an Entire Website for Offline Reading?, at http://bit.ly/2D1k2pJ.