Almost two years ago, I wrote an article entitled, "Copy an Entire Web Site with HTTRACK." That article is still available at http://blog.eogn.com/eastmans_online_genealogy/2005/12/copy_an_entire_.html. It describes the operation of a Windows program that can download an entire web site for offline browsing or for backup purposes. It is a good method of backing up your web site.
Now a similar utility, SiteSucker, is available for Macintosh users. Best of all, it is a free program. The author does accept donations, however.
SiteSucker automatically downloads Web sites from the Internet. It does this by copying the site's Web pages, images, backgrounds, movies, and other files to your local hard drive. Just enter a URL (Uniform Resource Locator), press the Enter key, and SiteSucker can download an entire Web site.
NOTE: Programs such as SiteSucker or HTTRACK are excellent for downloading static web pages, that is, pages stored on the server as fixed files rather than generated on request. However, they will not work on interactive sites, such as those that query online databases. Don't try to download www.FamilySearch.org or eBay, even if you do have the disk space available.
You can use SiteSucker to make local copies of your Web sites for easy maintenance. It can either download files unmodified or "localize" the files it downloads, allowing you to browse a site offline.
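For readers curious about what "localizing" involves, here is a minimal sketch in Python. It is not SiteSucker's code. The function name localize_link and the example.com addresses are made up for illustration, and the sketch assumes that localizing simply means rewriting a same-site link as a relative path to the saved copy.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def localize_link(page_url, link):
    """Rewrite a same-site link as a path relative to the page's folder.

    Hypothetical helper for illustration only. Links to other sites are
    returned unchanged, since no local copy of them is assumed to exist.
    """
    absolute = urljoin(page_url, link)
    page = urlparse(page_url)
    target = urlparse(absolute)
    if target.netloc != page.netloc:
        return link
    page_dir = posixpath.dirname(page.path) or "/"
    return posixpath.relpath(target.path, start=page_dir)


print(localize_link("http://example.com/tree/index.html",
                    "http://example.com/images/logo.gif"))
```

A page saved this way points at the copies on your own disk, which is what makes offline browsing possible.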
If SiteSucker is in the middle of a download when you choose the Save command, SiteSucker will pause the download and save its status with the document. When you open the document later, you can restart the download from where it left off by pressing the Resume button.
SiteSucker is a Universal application, which means that it's made to run on both Intel- and PowerPC-based Mac computers. SiteSucker requires Mac OS X 10.4.x Tiger or later. Of course, to download files, your computer will also need an Internet connection. If you are downloading a large site, you will also need plenty of available disk space.
There are several limitations. As mentioned earlier, the program cannot query databases and will not work on any web site that asks for user input and then builds pages "on the fly," based on the input. That leaves out many genealogy databases, as well as eBay and others.
SiteSucker totally ignores JavaScript. It will not see any link specified within JavaScript. (If the Log Warnings option is on in the download settings, SiteSucker will include a warning in the log file for any page that uses JavaScript.)
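To see why a link inside JavaScript goes unnoticed, consider this small Python sketch. It is only an illustration of the general idea, not how SiteSucker is written. The LinkCollector class and the sample page are invented for the example. The collector reads href and src attributes from HTML tags and merely notes that a script is present, without looking inside it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href/src values from tags; script contents are never read."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.uses_javascript = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            # Only record that the page uses JavaScript (worth a log warning).
            self.uses_javascript = True
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


# Made-up sample page: one ordinary link and one link buried in a script.
PAGE = """<html><body>
<a href="family.html">Family</a>
<script>window.location = "hidden.html";</script>
</body></html>"""

collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)
print(collector.uses_javascript)
```

The ordinary link to family.html is collected, while hidden.html, which appears only inside the script, is never seen.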
SiteSucker does scan Flash (SWF) files for embedded plain text links, but it can only detect links to files that have one of the following extensions: html, swf, mp3, sit, zip, mov, gif, jpg, png, doc, or txt. SiteSucker cannot localize Flash files, and it does not examine other media files for embedded links.
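The extension list explains the limitation: a scanner that hunts through a binary file for plain text can only recognize a link by how it ends. The Python sketch below shows one way such a scan might work. It is an assumption for illustration, not SiteSucker's actual method. The names KNOWN_EXTENSIONS, LINK_PATTERN, and find_plain_text_links are invented, and the sample bytes are not a real SWF file.

```python
import re

# The extensions listed in the article.
KNOWN_EXTENSIONS = ("html", "swf", "mp3", "sit", "zip", "mov",
                    "gif", "jpg", "png", "doc", "txt")

# A run of URL-like characters that ends in one of the known extensions.
LINK_PATTERN = re.compile(
    r"[A-Za-z0-9_./:~%-]+\.(?:" + "|".join(KNOWN_EXTENSIONS) + r")\b",
    re.IGNORECASE,
)


def find_plain_text_links(data):
    """Return link-like strings found in raw bytes (hypothetical helper)."""
    # latin-1 maps every byte to a character, so decoding never fails.
    text = data.decode("latin-1")
    return LINK_PATTERN.findall(text)


# Made-up bytes standing in for the inside of a Flash file.
sample = b"\x00\x01gallery/photo1.jpg\x00intro.swf\x00data.xml\x00"
print(find_plain_text_links(sample))
```

In this sample, data.xml is skipped because xml is not on the list, which is the kind of link the article says will be missed.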
By default, SiteSucker honors robots.txt exclusions and the Robots META tag. Therefore, it will not download any directories or pages disallowed by robot exclusions. However, you can override this behavior with the Ignore Robot Exclusions setting that's under the Advanced tab in the download settings.
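For the technically inclined, here is a rough Python sketch of what honoring robots.txt means in practice. It uses Python's standard urllib.robotparser module and is not taken from SiteSucker. The allowed_urls function, the "ExampleCrawler" name, the ignore_robots flag, and the sample robots.txt rules are all assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that closes one directory to all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(urls, robots_txt, user_agent="ExampleCrawler",
                 ignore_robots=False):
    """Return the URLs a crawler may fetch under the given robots.txt text.

    ignore_robots is a hypothetical switch that plays the role of an
    "ignore robot exclusions" setting.
    """
    if ignore_robots:
        return list(urls)
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


urls = ["http://example.com/index.html",
        "http://example.com/private/notes.html"]
print(allowed_urls(urls, ROBOTS_TXT))
print(allowed_urls(urls, ROBOTS_TXT, ignore_robots=True))
```

With the exclusions honored, only index.html passes. With the override switched on, both addresses are returned.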
The free SiteSucker program is available at http://www.sitesucker.us/.
