SiteSucker Automatically Downloads Complete Web Sites

SiteSucker is a low-cost, easy-to-use Macintosh application that automatically downloads complete Web sites from the Internet. I recently switched hosting services for this newsletter and am now using SiteSucker to make frequent backups of the new site. If you own a web site, you might want to do the same.

SiteSucker can be used to copy any web site to your hard drive: not only a site that you own, but also any other web site that is visible to the public without requiring a password. I wouldn’t try to back up all of FamilySearch.org, however. That probably would require a lot more disk space than you own!

SiteSucker can be used to make local copies of your Web sites for easy maintenance. SiteSucker can either download files unmodified, or it can “localize” the files it downloads, allowing you to browse a site off-line.

SiteSucker copies the web site’s HTML documents, images, backgrounds, movies, and other files to your local hard drive. Just enter a URL, click a button, and SiteSucker downloads the entire site.
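The “localize” step mentioned earlier — rewriting downloaded pages so their links point at local copies rather than the live site — can be sketched in a few lines of Python. This is only an illustration of the general technique, not SiteSucker’s actual code; the `Localizer` class and `localize` function are invented names, and the sketch handles only `href` and `src` attributes.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class Localizer(HTMLParser):
    """Rewrite same-site links in a page so they point at local files,
    and collect those links as further pages to download."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.out = []    # rewritten HTML fragments
        self.links = []  # same-site pages still to fetch

    def _rewrite(self, url):
        parts = urlparse(url)
        if parts.netloc and parts.netloc != self.site_host:
            return url, False                      # off-site: leave untouched
        local = parts.path.lstrip("/") or "index.html"
        return local, True                         # same-site: use local copy

    def handle_starttag(self, tag, attrs):
        pieces = [tag]
        for name, value in attrs:
            if value is None:        # boolean attribute, e.g. <input disabled>
                pieces.append(name)
                continue
            if name in ("href", "src"):
                value, is_local = self._rewrite(value)
                if is_local:
                    self.links.append(value)
            pieces.append(f'{name}="{value}"')
        self.out.append(f"<{' '.join(pieces)}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

def localize(html, site_host):
    """Return (rewritten_html, list_of_same_site_links_to_fetch)."""
    parser = Localizer(site_host)
    parser.feed(html)
    return "".join(parser.out), parser.links

# Example: a link to another page on the same site becomes a local path,
# and that page is queued for download.
page = '<a href="http://www.eogn.com/about.html">About</a>'
local_html, to_fetch = localize(page, "www.eogn.com")
```

A real site copier repeats this loop: download a page, localize it, then download every new same-site link it collected, until nothing remains in the queue.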

Once the entire web site is copied to your hard drive, you can use the Finder to go to the stored copy and double-click index.html, or whatever file name is used for the site’s home page. The web site will open in your preferred browser, and you can navigate from page to page in the same manner you would online. The only difference is that you are viewing pages stored on your local hard drive, not retrieving pages from the web.

In order to save space on my computer’s internal drive, I have SiteSucker configured to save all pages to a SiteSucker folder on an external 4-terabyte hard drive that is plugged into one of my Mac’s USB connectors. You may prefer a different location for storing your downloaded copies, however.

SiteSucker backs up all static web pages but will not retrieve dynamic web pages.

NOTE: Dynamic web pages are those created at the moment a user requests information. For instance, a visitor to Ancestry.com or MyHeritage.com might enter a query for an ancestor named John Jacob Jingleheimer Schmidt. The web site then creates a BRAND NEW web page at that moment and displays it to the user. The web page is then deleted from the web site as it is no longer needed. SiteSucker cannot create queries so it will not ask for new web pages. However, it will copy all pages that are not database-driven, such as the static web pages at http://www.eogn.com.
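The distinction can be illustrated with a toy example (the record data and function names below are invented for illustration): a dynamic page exists only as the output of code that runs at the moment a query arrives, so there is no stored file on the server for a crawler to copy.

```python
# Toy illustration of a dynamic page: the HTML exists only as the output of
# this function, generated on demand. RECORDS and search_page are invented
# names for this example, not part of any real site.
RECORDS = {"John Jacob Jingleheimer Schmidt": "b. 1878, d. 1944"}

def search_page(query):
    """Build a results page on demand, the way a database-driven site does."""
    hit = RECORDS.get(query)
    body = f"<p>{query}: {hit}</p>" if hit else "<p>No matches found.</p>"
    return f"<html><body>{body}</body></html>"
```

Because `search_page` runs only when a visitor submits a query, a program that simply follows links never triggers it; there is no file sitting on disk for it to download.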

SiteSucker also ignores JavaScript, so content that a page builds in the browser with scripts will not be captured.

SiteSucker is available for $5 in the Macintosh App Store at https://itunes.apple.com/us/app/sitesucker/id442168834?mt=12.

SiteSucker is also available for your iPhone, iPad, and iPod Touch. Details may be found at http://www.sitesucker.us.

 

7 Comments

Your last few words reminded me I wanted to ask a question about Java. For years I thought Java was something I should always have (for reasons I don’t remember). Lately my new, free version of Avast! malware protection disables Java when it cleans what Avast! calls “Grime.” Do I need Java? If I do, what for? I hear it isn’t doing well in the security realm these days.


    Hundreds of thousands of web sites use JavaScript (which, despite the name, is a different language from Java) to display information to you. Much of this web site is written in JavaScript, and so is the form you just used to write the comment. Java itself is about the same as any other programming language: it can be used to create good things or bad things, as the programmer desires.


Hmmm…… Then I wonder what part of Java was “disabled”? Is there something unnecessary about it, such that I wouldn’t need a souped-up version or something?


Lynna Kay Shuffield May 4, 2014 at 12:53 pm

I do not have a Mac … Is there something similar for those of us who use a PC??


Is there any way to prohibit this action against my website? I would not want someone to “suck” all my data and then turn around and sell it or use it for commercial purposes.


    —> Is there any way to prohibit this action against my website?

    Yes. SiteSucker, almost all similar programs, and all major search engines honor the terms you specify in a robots.txt file placed at the top level of your web site. Web site owners use the robots.txt file to give instructions about their site to web robots; this is called the Robots Exclusion Protocol. For more information, go to Google.com and search on “robots.txt.”
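For example, a robots.txt that asks SiteSucker to stay away from the whole site while keeping only one directory off-limits to other robots might look like this (the directory path here is purely illustrative):

```
# Ask SiteSucker to skip the entire site
User-agent: SiteSucker
Disallow: /

# All other robots: stay out of /private/ only
User-agent: *
Disallow: /private/
```

Note that robots.txt is purely advisory: well-behaved programs honor it, but nothing physically prevents a badly behaved robot from ignoring it.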


Is there a way to prevent the links for replying to comments from being indexed by SiteSucker? It just ends up hanging on any blog post that has comments, chasing endless URL parameters.

