Message boards on genealogy sites and blogs lit up this past week as Ancestry.com announced the new Internet Biographical Collection. The pros and cons have been discussed ad infinitum elsewhere, so I won't repeat them here. If you have not yet read about this controversy, perform a Google search on the words "Internet Biographical Collection."
Many of this week's discussions debated claims and counterclaims about copyrights, legalities and such. I read a lot of these messages but never found any written by anyone who claimed to have a law degree or other appropriate credentials. It seems a lot of people, including me, were writing about legalities without having the academic qualifications to back up their claims. To be blunt, I don't know if anyone was correct. I also noticed that nobody cited legal precedent, at least not with a case title and source citation.
Shame on all of us! We genealogists should know better than to make claims without source citations.
I have now found one case where a court ruled on the exact issue of the legality of caching other web sites' content and on the copyright laws involved. This landmark case should be required reading for all of us who posted messages either for or against the recent ill-fated Ancestry.com product.
NOTE: For the rest of this article, I am only looking at U.S. copyright issues. If you live outside the United States, laws and court actions may be different in your country.
In Field vs. Google, Blake A. Field objected to Google's caching of web pages from a site he created. Mr. Field is an attorney, so we can assume he has some expertise in copyright law. He at least sat through a number of classes on the topic when he attended law school, which is probably more than the rest of us who posted messages recently can say.
Mr. Field brought the copyright infringement lawsuit against Google after the search engine automatically copied and cached a story he posted on his own website. Mr. Field claimed that Google had violated his copyrights by caching his information and making it available elsewhere without asking for his permission. (Does that sound familiar?) Google responded that its Google Cache feature, which allows Google users to link to an archival copy of websites indexed by Google, does not violate copyright law.
How many cases can you cite in which this specific question on copyright law concerning web site caching has already been decided in Federal court?
On January 12, 2006, the Honorable Robert C. Jones, United States District Court Judge in Nevada, ruled that no copyright infringement had occurred, and that Blake A. Field was not entitled to damages. Specifically, the court granted summary judgment in favor of Google on four independent points:
- Serving a web page from the Google Cache does not constitute direct copyright infringement because it results from automated, non-volitional activity by Google servers;
- Field's conduct (failure to set a "no archive" metatag; posting an "allow all" robots.txt file) indicated that he impliedly licensed search engines to archive his web page;
- The Google Cache is a fair use; and
- The Google Cache qualifies for the DMCA's 512(b) caching "safe harbor" for online service providers.
Keep in mind that this case was decided in FEDERAL court, a point that greatly increases its credibility as a landmark case concerning Federal copyright laws.
You can read the full text of the Judge's ruling at http://www.eff.org/IP/blake_v_google/google_nevada_order.pdf.
As you can see, this landmark case clearly states that caching a web site is not an automatic copyright infringement, and that a cache constitutes fair use under U.S. copyright laws. It also clearly states that the web site owner who does not want his web site cached must take steps to prohibit indexing and caching via a ROBOTS.TXT file or any similar methods available. (ROBOTS.TXT is the only current method I know of.)
Again, the court ruled that it is up to the web site owner to determine if distribution of their web pages will be different than the normal, default methods in use throughout the Web. (Those are my words, not those of the judge.)
I do not find this unusual. The same has been true for many years for printed text: unless the copyright holder specifically states otherwise, the copyrighted work is handled in the same manner as all other copyrighted works under current laws and court interpretations. If the author wishes something different, he or she must specifically say so. In a printed document, that action is accomplished by words in the printed copyright statement. It seems natural that in a copyrighted digital file, the copyright holder must specifically state exceptions by use of a digital file. With today's technology, the exception(s) are stated in a specific format within ROBOTS.TXT.
In all cases, the requirement clearly is on the copyright holder to specifically state the exceptions to industry norms, if any.
If the copyright holder elects to place their content on a web service that does not offer ROBOTS.TXT capability, that person has given implied licensing to others to cache the contents. Again, it is up to the copyright holder to make sure that the exceptions are clearly available to all. In this case, that obligation includes selecting a web service that allows for the clear statement of exceptions.
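To make the idea of "stating exceptions" concrete, here is a minimal sketch in Python of how a well-behaved crawler might read a ROBOTS.TXT file before copying a page. It uses Python's standard urllib.robotparser module. The crawler name "ExampleArchiver", the site address, and the file contents are all hypothetical, made up for illustration; none of them come from the court's ruling or from any real site. Keep in mind that ROBOTS.TXT rules are requests, so the file only works when the crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical ROBOTS.TXT contents (an assumption for illustration):
# one named crawler is shut out entirely, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleArchiver
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page address on a hypothetical site.
PAGE_URL = "http://www.example.com/family/smith.html"

print(may_fetch(ROBOTS_TXT, "ExampleArchiver", PAGE_URL))
print(may_fetch(ROBOTS_TXT, "SomeOtherBot", PAGE_URL))
```

In this sketch the site owner has turned away one named crawler while leaving the default in place for everyone else, which is the kind of explicit exception discussed above.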
The court decision also refers to the DMCA's 512(b) caching "safe harbor" for online service providers. You can read an explanation of this at http://en.wikipedia.org/wiki/OCILLA. Pay close attention to the section entitled "Requirements to obtain safe harbor."
One item not mentioned in the Field vs. Google case is whether or not hiding the URL of the originating site makes a difference. This was not a factor in the Field vs. Google case, but many message writers felt it was a major issue with Ancestry.com's service. Although the question is interesting, the topic quickly became moot when Ancestry.com added URLs a day or two after product launch.
Does this precedent-setting ruling by a Federal court apply equally to the late Ancestry.com Internet Biographical Collection? I have no idea. I do not have the legal training to make a qualified guess. But I do know it makes for very interesting reading.
In the one legal case I found to directly reflect the same issues as the recent Ancestry.com controversy, the court clearly ruled that web site caching is not a copyright infringement. I am also guessing that, if anyone ever took Ancestry.com to court over the Internet Biographical Collection, the plaintiff would have to clearly show why the Ancestry.com web caching is different from Google's web caching. Otherwise, the judge probably would dismiss the suit by referring to the earlier Field vs. Google case. A clearly established difference would be critical to the new case's success.
Such a court case involving Ancestry.com should be an interesting case with far-reaching ramifications for genealogists. Since Ancestry.com has now canceled the service, I doubt if we ever will find out.
I am a lawyer and a law professor. I've been studying Field v. Google for the last week and there are some issues worth consideration. I'll be writing about those next week on GeneaBlogie (http://geneablogie.blogspot.com). I will point out for now that the case, as a decision of a trial court, is not binding on any other court.
Ancestry's problem may not have been as much of a legal one as a marketing and community relations issue. In any event, the business impact was just as severe.
Posted by: Craig Manson | September 01, 2007 at 11:52 AM
Thank you Dick, the Field v. Google case is an interesting one to be sure. It will be fascinating and instructive to read attorney Craig Manson's assessment of this case in light of Ancestry's recent activities. If I read the case correctly it is possible to not only add Meta-tags to prevent the search bot from collecting data from a given page, but also possible to allow the search bot to record the data from the page but prevent the caching of an image of that page? I certainly was not aware of this and will hazard a guess that many genealogical website owners were equally unaware. I for one have much to learn.
Posted by: Judy | September 01, 2007 at 02:17 PM
Ancestry has removed the database.
Announcement: “We have decided to remove this collection and search engine from Ancestry.com for the time being.”
Posted by: Wondering | September 01, 2007 at 02:31 PM
Thank you for the insight, Craig. I'll be VERY interested in the article you will write. When available, would you post a message here with a link that points to your article?
I agree with your assessment of "Ancestry's problem may not have been as much of a legal one as a marketing and community relations issue." While everyone has been screaming about the legalities, it really was more of a P.R. nightmare. It could be used as a business school study of "How to totally irritate your customers and cause ill-will throughout the community."
Thank you.
- Dick Eastman
Posted by: Dick Eastman | September 01, 2007 at 02:53 PM
Ancestry.com might try also reading the "Genealogical Standards & Guidelines: Standards For Sharing Information With Others." I'm not competent to address the legality of sucking up people's web sites and then republishing them without permission. Ancestry.com was providing a link to the original web site under the thumbnail image it showed.
Does anyone know whether this was a free database, or only available to subscribers?
I have tags in my pages' head section for no caching and no archiving. I wonder if they respected those tags? My data are probably harder to use, as I include only PDF files for the huge amount of information I'm sharing.
Posted by: Patsy | September 01, 2007 at 03:36 PM
Judy, with respect to caching/archiving pages and how to prevent it:
The URL linked with the post will take you to a page that I set up to show you the tags to add to your page to prevent caching and archiving of pages. Tags also allow robots to follow and index your pages (for search engines). I also included a short "break frames" script to prevent your pages from being included in frames in someone else's pages.
Note that "bad" robots will ignore your tags and do what they want. Most reputable robots will respect your tags.
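The kind of tag described above can be sketched as follows. This is an illustrative guess at the general form, not a copy of the linked page. It assumes the widely used "robots" meta tag with a "noarchive" directive, and it uses Python's standard html.parser module to show how a cooperating robot could read that tag. The sample page and the helper names are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page: the meta tag asks robots to index the page and
# follow its links, but not to keep an archived (cached) copy.
SAMPLE_PAGE = """<html><head>
<title>Smith Family History</title>
<meta name="robots" content="index, follow, noarchive">
</head><body><p>Family data goes here.</p></body></html>"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for item in attributes.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


directives = robots_directives(SAMPLE_PAGE)
print("noarchive" in directives)
```

A robot that honors the tag would index this page but skip storing a cached copy. As noted above, a "bad" robot can simply ignore it.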
Posted by: Patsy | September 01, 2007 at 03:59 PM
I'm not a lawyer, so I'll comment on my perception of the difference between Ancestry and Google. Google is a web site that provides a highly valued service to anyone who wants to use it: it allows searchers to find information on the web, and from a web site owner's point of view it allows potential users to find and use their site. Anyone can use it for free and without having to register. Although it does cache files, it primarily returns links to the web site so searchers can go to the site. Not every web site owner likes the cache functionality but, as the newspapers in Belgium found out, the traffic they receive from Google users outweighs the caching issue. The point is that people weigh the pros and cons of Google's business and the vast majority of people are glad Google is there. If Google maintained the "Google Information Collection" as a database of information on its own site that emphasized viewing on Google's site, the balance and hence the perception would almost certainly be considerably different.
With respect to the Field case, the judge listed a long series of findings of fact that do not apply to Ancestry, including the finding that Field, in full knowledge of Google's normal operation and the steps necessary to prevent caching, deliberately set out to get his files cached by Google, including explicitly placing a robots.txt file that gave permission to index the site. I don't think it is reasonable to assume that this case sets a precedent that covers all cases of caching by a service provider. Indeed, unlike Google, it is not even clear that archiving, which is what Ancestry was reportedly doing, is the same as caching, which is what Google does.
It would however be a great service to your readers to write up a summary of how to use robots.txt files and meta tags to prohibit caching or restrict it to particular bots.
Posted by: Lindsay | September 01, 2007 at 04:31 PM
---> Does anyone know whether this was a free database, or only available to subscribers?
When first launched, the service was available only to subscribers and did not include links to the originating web pages. A very strong, negative reaction from the user community caused Ancestry.com to change that within hours. URLs were added almost immediately and a day or two later the "pay wall" was removed, making it a free service. However, resistance from the user community was not reduced and the company then elected to cancel the service entirely.
- Dick Eastman
Posted by: Dick Eastman | September 01, 2007 at 04:35 PM
---> It would however be a great service to your readers to write up a summary of how to use robots.txt files and meta tags to prohibit caching or restrict it to particular bots.
That's a great suggestion although I probably will never do that for two reasons:
1. I don't have the expertise. I have never created a ROBOTS.TXT file myself as I have never had a need for one. Instead of trying to limit search engines, I have always tried to do the opposite: increase the exposure of this newsletter and my other sites on search engines. The last thing I would ever do is to create a ROBOTS.TXT file to limit them! (smile) I'd rather find a way to say "Please index and cache this site!"
2. It's been done already by many people. To see a couple of dozen articles that already describe the creation of ROBOTS.TXT files in detail, go to http://www.google.com/search?hl=en&q=create+robots.txt&btnG=Search
There are even a few sites that will create a ROBOTS.TXT file for you. One that I found is available at: http://www.hypergurl.com/generators/robotgenerator.html
Again, I haven't used any of the above myself and cannot make specific recommendations as to which is "the best."
- Dick Eastman
Posted by: Dick Eastman | September 01, 2007 at 04:58 PM
Perhaps we're all looking at the wrong issue and should be looking instead at the issue of deep linking. I didn't play with the search engine that much but the few pages I pulled appeared to be a direct result of deep linking within a website.
Case law on deep linking has see-sawed back and forth ever since the court ruling on the Ticketmaster case in 2000. Currently, the position of the courts varies across a number of nations (and we must remember that Ancestry, while based in the United States, now sells to an international audience). Deep linking may be legal in one jurisdiction and not another. While I don't believe deep linking is currently illegal as such in the United States, without care you can still find yourself in legal hot water using the technique.
In Ticketmaster (Ticketmaster v. Tickets.com, 2000 U.S. Dist. LEXIS 4553 (C.D. Cal. 2000)), the court denied a motion to dismiss the copyright claim because the complaint alleged actual copying of web pages by defendants in order to extract factual data, although the court stated that hyperlinking itself does not violate the Copyright Act. Now it may be argued that in this case Ancestry did not "extract" the factual data but left it to the user to do it. Another point for the courts to argue but the use that was made of deep linking in this case remains a legal standard.
Another concern in a case of caching is whether you are caching identifiable copyrighted images unique to an individual site and storing them on your server for your use when you re-display the page. Remember that webmasters with an artistic bent probably spent hours on those images. Those without such a talent may have spent money hiring someone to do them for them. In Kelly v. Arriba Soft Corporation, 280 F.3d 934 (9th Cir. 2002), it was ruled that the use of thumbnail pictures fell within the fair use defense because they were much smaller, low resolution images used for a different purpose than Kelly's works, which were artistic images used for illustrative purposes. As to the online linking and framing of Kelly's full sized images, the court held that the use had infringed upon Kelly's exclusive right to display the copyrighted works publicly.
And let's not forget the issues of frames and derivative works, for which any Google search will provide ample court cases and judgments.
Court rulings change from year to year and many depend on fine points. I don't think I would want to risk riding on a fine point.
Was Ancestry violating any of these? I know what I think but what I think isn't important; it is what the courts would have thought. I know in having worked for a media company doing websites in the past our attorneys would have had fits with the presentation of this database. It does a company's bottom line little good to spend what profits something might generate defending it in court.
If there is a moral to all this, it's that just being big doesn't allow you to run over the smaller guys on the road even if you think you are right. And the bigger moral may be it's not what you can do technically that counts, it's how you create it. And in this case the creation of a search engine may have been technically wonderful, but piggybacking off the research efforts of others without their prior consent just wasn't worth it.
Posted by: jking | September 01, 2007 at 05:21 PM
Ancestry.com goofed on TWO points.
1. A paid subscription was required to see FREE access sites, including ones NOT part of the Ancestry/TGN network. This was the number one issue.
2. The URLs were very unclear as to whether a page was the original site or a cached copy. That has since been modified, but too late, for the far greater damage was caused by issue #1.
Posted by: W. David Samuelsen | September 01, 2007 at 06:02 PM
Of course, Ancestry's collection is not a search engine at all.
But if it were, you would be spot on to call attention to the "Requirements to obtain the safe harbor"
A partial quote from that wiki page:
To obtain the safe harbor the OSP must:
* not receive a financial benefit directly attributable to the infringing activity
Major difference between a search engine and Ancestry.com here.
Google is not reselling access to your pages. Duh.
This demand is interesting too:
* have a Designated Agent registered with the US Copyright Office to receive notifications of claimed infringement (often called takedown notices).
By placing the IBC behind a membership wall, many web site owners will not even know they have been infringed.....
Posted by: Legal Eagle | September 01, 2007 at 06:13 PM
I am a graduate of the Seattle University School of Law and an LL.M. candidate. I weighed in on this dispute in the form of an allegory rather than an analysis of the case law. I blog my avocation because it uses the tools of my education, but doesn't seem like work.
(http://footnotemaven.blogspot.com/2007/08/just-because-you-can-doesnt-mean-you.html)
The last time I took a hard look at copyright law was in law school in 2000. Copyright law is a specialty and is not my specialty. Copyright is constantly evolving, evidenced by the Office of Copyright’s own in-depth look at “Orphan Documents,” another issue of importance to genealogists.
Asking us to lay down the law for our online community requires that we know of what we speak. As Craig said, he has been studying the case for over a week, and a week may not even be sufficient time. A lawyer and a law school professor will be very careful in offering opinions. We have certain ethical obligations.
I, like you, am looking forward to Craig’s very able analysis of the issues associated with the case you have cited. He is a fine legal mind, and a wonderful writer. I enjoy his family history blog.
Craig has very aptly noted that Field v. Google is a trial court case and therefore not binding on any other court. It takes a lot of education to be able to read a court case and understand what is the holding of the court and what is dicta. What the layperson thinks is clear may not be clearly binding.
Remember too, the opinions of lawyers are just that, opinions. A lawyer doesn’t know all the law. A lawyer does know how to find the law and analyze what has been found. Every day in every court in this country there are two sides with two differing opinions. The court determines which is correct and which is not and the court doesn’t make those decisions until the matter is before them. Even then, they may be overturned.
I hope that Ancestry becomes more cognizant of the effect of its treatment of the genealogical community and that they strive to become a better member of that community. Angering your income base just doesn’t seem prudent to me. As with the title of my allegory – “Just Because You Can, Doesn’t Mean You Should.”
Posted by: footnoteMaven | September 01, 2007 at 06:37 PM
Legal Eagle, Ancestry.com certainly appears to me to have a Designated Agent ready to receive claims of copyright infringement. http://www.tgn.com/default.aspx?html=copyright
I think it could have been argued that even when the database was behind the subscription wall that Ancestry.com was not "receiving a financial benefit directly attributable to the infringing activity." Ancestry.com was not selling access to that single database -- it was selling access to all ~25,000 databases on its site for a flat rate, all you can eat price. I really doubt that anyone would have signed up to access Ancestry.com just so that they could have accessed that one single database. I believe that the legal theory that would apply here is de minimis. http://en.wikipedia.org/wiki/De_minimis
In any event, this is all speculative, since Ancestry.com, presumably in an excess of caution, allowed anyone free access to the database.
It is also interesting to me that footnoteMaven has called upon Ancestry.com to become a better member of the genealogical community. To me, this smacks strongly of the pot calling the kettle black. Ancestry.com's publishing arm has been a part of the genealogical community since 1983, publishing books and magazines that have had enormous impacts on the field. When the free genealogical community failed to provide Rootsweb with the support it needed to survive, it was Ancestry.com that stepped up and kept it alive. Thousands of articles written by dozens of genealogical luminaries are available for free on Ancestry.com's Library. Exactly what more does Ancestry.com need to do to become a better member of the genealogical community?
From the most recent posts on the genealogical blogs right now, the genealogical blogging community appears to be an insular clique focused more on patting each other on the back than on actually creating something to benefit the genealogical community. A cached database of all genealogical websites would have been an enormously beneficial thing for the genealogical community. It seems inescapable that many of those who were the most strident were acting, at least in part, out of self-interest in opposing Ancestry.com's actions, not out of what would have been of the most overall benefit to the genealogical community. As I noted in a previous post, at least some bloggers, such as Pat Ritchie (Dear Myrtle), had the decency to confess this fact.
Posted by: Amanuensis | September 01, 2007 at 07:36 PM
If you think the legal issues with Ancestry "Internet Biographical Collection" revolve only around caching, you're overlooking some major differences between what Ancestry was doing and what Google does:
1. Google's cache is temporary; it's never older than 2-3 weeks. Ancestry never said how frequently they renewed their cache. That's legally critical. The decision in the Field case addressed only *temporary* caches, i.e., those maintained for a few weeks.
2. Unlike Google, Ancestry also created a frame around others' websites, giving the impression that the websites belonged to Ancestry. They did not warn the user that you were leaving Ancestry or post a notice in the frame that the website you were viewing did not belong to Ancestry. Many other reputable companies do that. The impression they create regarding the ownership of websites and their content is legally important.
3. Unlike Google, Ancestry compiled their cache into the "Internet Biographical Collection" to which they claimed copyright. Calling a collection of cached websites a database is far different than providing links to cached web pages, which is what Google does.
A further important aspect of the Field case is that it was decided in a federal district court, not a federal appellate court or the Supreme Court. Therefore, its findings are only legally binding in that district, not nationwide.
"I do not find this unusual. The same has been true for many years for printed text: unless the copyright holder specifically states otherwise, the copyrighted work is handled in the same manner as all other copyrighted works under current laws and court interpretations. If the author wishes something different, he or she must specifically say so."
As many lawyers who have commented on the Field case have pointed out, having to take proactive steps to prevent the unauthorized use of one's copyrighted work (e.g., via a ROBOTS.TXT file) is the OPPOSITE of standard copyright practice, where the entity who wants to make use of your copyrighted work is obligated to obtain that permission first.
"Does this precedent-setting ruling by a Federal court apply equally to the late Ancestry.com Internet Biographical Collection? I have no idea. I do not have the legal training to make a qualified guess." You’re absolutely right. Stop trying to play lawyer. You have no idea what you're talking about.
Posted by: Oxa | September 01, 2007 at 07:53 PM
I have to say that I am surprised how interesting, and readable the Findings of Fact and Conclusions of Law were. Reading it top to bottom considerably improved my understanding of copyright and clarified my feelings about this case - thanks for posting it Dick.
"According to the United States Supreme Court, the fair use analysis largely turns on one question:
whether the new [use] merely “supersedes the objects” of the original creation . . .
or instead adds something new, with a further purpose or different character,
altering the first with new expression, meaning, or message; it asks, in other words,
whether and to what extent the new work is “transformative” . . ."
The findings go into much more depth but the part that resonates with and clarifies my objection to Ancestry's approach is that it seems to me that, unlike Google, they are presenting the copyrighted material for pretty much exactly the same reason/objective as the original web sites, and without adding much new. The claim that they will preserve material that is no longer available is the only thing they are adding. The judge notes that the purpose and intent of the cached pages on Google is different than the purpose and intent of the original publisher.
"Assuming that Field intended his copyrighted works to serve an artistic function to enrich
and entertain others as he claims, Google’s presentation of “Cached” links to the copyrighted
works at issue here does not serve the same functions. For a variety of reasons, the “Cached”
links “add[] something new” and do not merely supersede the original work."
In contrast, it seems to me that Ancestry set up their "collection" to serve essentially the identical purpose as the authors of the web sites that were harvested - to convey genealogical information to genealogists. To me this is just basically wrong and completely contrary to the entire intent of copyright. I don't go to Google to find genealogical data, I go to Google to find where I can find genealogical data. In the rare case where I use the cache file it is because the site is unavailable or I don't understand why it was in the search results. Google is not a "competitor" of the free genealogy sites in any significant sense. Ancestry on the other hand is. Preventing your competitors from just grabbing your copyrighted material is the very essence of copyright and Ancestry could hardly have been any more in the wrong. That they launched the collection without so much as a link to the original, only adding links in response to protests, really illustrates their lack of understanding, or worse. As I said, I'm not a lawyer and can't predict how courts would have ruled, but seeing the principles laid out reinforces my opinion that what Ancestry did was wrong.
Posted by: Lindsay | September 01, 2007 at 08:12 PM
There are major differences in what Ancestry did and what Google does.
Is it not possible that one could assume the following?
1) Ancestry -- is in the Family History Business - (Google is not)
2) No incentive for user to visit my pages if they have them all.
3) Search engine status -- they get the hits I don't thus reducing my rankings.
4) More information on their site equates to more marketing blitz on how much info thus garnering more subscriptions.
And, last but not least, this was inexcusable and has done irreparable harm because many people now will remove their information from the web, or the ones that share, will share via password directories and behind email walls to prevent future thefts. Additionally people that go out and walk the cemeteries, search the libraries etc. will start back putting them into a book format where their information is somewhat protected.
Additionally, those that have paid for their subscriptions all these years feel extremely violated and wonder if they can not cache all of their data, then why did they feel like they were entitled to ours? Did I not read where they stopped the gentleman that had the Ships site cold in his tracks even when he had previously gotten permission to utilize info? So keep on analyzing, maybe you will eventually get the BIG PICTURE, ---- there is a really, really, really, big difference Dick, and I am so sorry you are still not seeing the whole picture. (Wonders why that is?)
Posted by: Pierce | September 01, 2007 at 08:17 PM
Pierce, let me address your post point by point.
Is it not possible that one could assume the following?
1) Ancestry -- is in the Family History Business - (Google is not)
Perhaps correct at the present time, but what difference does that make regarding copyright law?
2) No incentive for user to visit my pages if they have them all.
I disagree with this. Before Ancestry.com took its site down, I checked it for references to my rather rare surname. I then went to the original sites, realizing that, as is often the case with Google, not all pages of a website get crawled and crawl results are always dated, so I wanted to verify that the cached results still represented what was currently posted at the original sites. NONE of the sites I went to were sites that I would have visited that day had it not been for Ancestry.com. So, if anything, I think that Ancestry.com's database would have INCREASED visitors to the web pages they crawled. And obviously Dick Eastman feels the same way.
3) Search engine status -- they get the hits I don't thus reducing my rankings.
So you are saying that you care more about WHERE people find your information than simply whether or not the information is found? My mother used to tell me that a whole lot more would get done in this world if we cared less about who got the credit for it. After what I have seen and read this week on various websites and blogs, I have to say that I believe that many people in the genealogical community care more about who gets the credit than they do about the greater good.
4) More information on their site equates to more marketing blitz on how much info thus garnering more subscriptions.
And what is bad about that? More subscriptions = more money for Ancestry.com to add new content to its website = more contented subscribers = more renewals = more money for Ancestry.com to add new content to its website. In other words, a virtuous cycle that is beneficial to the entire genealogical industry. Ancestry's competitors would have never been able to get seed money to launch their firms from angel investors had it not been that Ancestry.com had already proved the viability of the model.
To address your point that people will now no longer be willing to share, well, it is my viewpoint that these people were already not willing to share, or they would not be upset about the fact that having their information cached on Ancestry's site would have meant far more people would have found the information than would have been the case otherwise. In any event, no one NEEDS to take such extreme measures. If they don't want their websites cached, all they need to do is say so in a robots.txt file. As several people, including Chris Dunham (The Genealogue) have noted, Ancestry.com itself has, for over a year, been notifying people that they were crawling websites and explaining how to opt out via the robots.txt file.
Posted by: Amanuensis | September 01, 2007 at 09:05 PM
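The robots.txt opt-out mentioned in the comment above can be sketched in a few lines of Python using the standard library's `urllib.robotparser`. This is only an illustration: the bot name "MyFamilybot" is the one quoted elsewhere in this thread, and the file contents, the example URL, and the `allowed` helper are assumptions, not anything published by Ancestry.com. A robots.txt file is also only a request; it works when the crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: MyFamilybot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page on the site owner's own domain.
PAGE = "http://example.com/bio/grandfather.html"
print(allowed("MyFamilybot", PAGE))  # the named crawler is asked to stay out
print(allowed("Googlebot", PAGE))    # other crawlers are still welcome
```

A well-behaved crawler runs a check like this before fetching each page, which is why the file has to be in place before the crawler visits.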
It's not the caching that was the problem. It was the how.
Live links vs. complete page.
You do not get the hit. Sites with ads would not do business.
No comparison to the Google suit. When looking into the legality of this I hope you take into consideration the difference between the two actions.
If you take the small bio I have written about my grandfather, put it behind a wall, charge people to see it without my even knowing about it, and take away my ability to communicate with the viewer or even to know that the viewer was looking at my family data, tell me: legally, is that a copyright violation? Is it ethical?
I daresay they would not have taken it down without a fight if it had a solid legal base. Up to this point, this afternoon in fact, I had defended Ancestry. I won't anymore.
Posted by: G.C. | September 01, 2007 at 09:06 PM
It is evident that some of us have strong feelings on this matter; however, I would think a diplomatic and friendly approach to disagreement would result in more supporters of your argument.
Bloggers for both sides were completely sane and capable of deciding for themselves how they felt about the situation, and we should respect their right to make that choice.
However, since no one here is omnipotent, I do not believe it is in our power to know any other blogger's intentions, footnoteMaven's, mine, or anyone else's. I don't believe anyone here has the psychic ability to know whether we were reacting selfishly or selflessly. Any statement of such is purely conjecture.
I respect Craig Manson very much and I am looking forward to his upcoming legal article about this matter.
There is an old saying about "beating a dead horse." It aptly applies to the topic of Ancestry.com's database, or search engine, or whatever you want to call it. It won't be coming back, at least in the form that we knew it before. Embrace the present.
Janice Brown
Posted by: Janice Brown | September 01, 2007 at 09:10 PM
Patsy, if you are still keeping it up.
the meta tags for the bots you posted on your link - do I simply replace Googlebot with MyFamilybot ?
I am not sure if this code is working or not with non-Microsoft (I don't use Microsoft for webdesign at all.)
(the breakout code is working very well - I ran a test, courtesy of the Mozilla Firefox site.)
Posted by: W. David Samuelsen | September 02, 2007 at 12:32 AM
To Share or not to Share ? (or should I say making a choice if to share - with whom) [apologies if this rambles!]
In common with many family historians I've spent many years and quite a bit of money on my hobby, in my case from the age of 13 or so, gathering information, weighing it up, and building my family tree.
Back in 2002 when I set up my (very basic) website I saw in common with thousands of others, an opportunity to make contact with distant cousins, and to share my research with a wider audience.
I also decided - given I had a lot of other resources available (namely hundreds of C18th and C19th English and Irish newspapers) that I ought to try to make these available as well.
Finally, in early 2007, I decided to add vital records from my One Name Study.
For if I fell under a bus, despite my best efforts, my research could easily be lost or dispersed.
All this has been possible through Rootsweb's generous offer of Free Webspace (supported by Ancestry)
The only "protection" I have is a simple fair use statement. Is it legal - I haven't a clue ! It seems to work; sometimes I find my transcripts on other sites without acknowledgement - in which case I politely point out the source and conditions I've attached to the use.
And has it been worth it ? Well I've had a lot of positive feedback (not all but 99%) - For on occasion the parish register for someone's ancestor may be missing and my newspaper transcripts fill the gap - by showing what I've searched and importantly what records remain to search for my One Name Study - I get emailed scans and transcripts. And just the other week an email from New Zealand led to a cousin in Scotland, who has emailed family photographs (the sitters identified) one of which matches a previously unknown photograph which I have, who I now know is my 2x Great Grandmother - born in Ireland c1832.
And so I would certainly be wary of trying to prevent search engines indexing my site ! Provided they give a link to the URL I win they win - I'm quite happy with my nose, and don't intend to cut it off (at least not without very good reasons)!
Best Regards
Richard H
Posted by: Richard Heaton | September 02, 2007 at 07:00 AM
Richard H., I agree with you.
I want to share my research with legitimate researchers and I am looking for cousin contacts. I too have spent many hours and a lot of money in my research and writing my histories and making my own conclusions. I created my web page and everything in it. My intent was that my pages be available to anyone interested. That they could remain anonymous if they so desired and to see my pages without cost, obligation or registration.
I was pleased when Google picked me up. However, if Ancestry or anyone else in any way makes it impossible for me to have full credit for my work; or if it is displayed in any manner other than what appears on my web page; or if there is any implication I am associated with Ancestry or anyone else - that would leave me with only one option and that would be to remove everything I now have online and stop updating.
What I write and how I organize it is mine. I don't tend to haggle over copyright law. I can't afford it. What I can do is stop sharing and for now that is what I intend to do. Until further notice, my pages will not be updated. I may even start taking them down. I will do what I can in order to stop my pages from being cached.
Wouldn't it be a shame if we all did that?
Kaye
Posted by: Kaye H | September 02, 2007 at 12:23 PM
I think this affected those ON The Genealogy Networks servers more specifically, you have to look at the whole process.
Those hosted on Rootsweb's servers, even those who were originally granted "virtual domains" by the original Rootsweb owners, were subjected to a new process launched with this "search engine". The domain went into every uploaded file and added javascripts which CHANGED the content of the sites' pages, adding a header and footer that stated the sites were hosted by Ancestry and putting in links. Normally you would say "gee that's inconvenient", but these destroyed several of my sites which were already in frames.
The process continued with the copying of all those websites along with the others previously mentioned across the internet.
The result was a behind the pay wall "feature" that was designed to draw users in, advertising, right? NO.
This same search engine is incorporated into Family Tree Maker. The FTM version displays the data from the public sites and allows you to directly merge the data into your file. The imported source information doesn't list the source website; it lists Ancestry.com's "Collection".
IF you have a subscription active you also can import the other locked results.
What this means is that Ancestry isn't just caching for search purposes; they are extracting and serving the data directly, bypassing the creator altogether. Remember Ancestry in their AUP indicates that anything on their server may be redistributed anywhere within their system. They HAVE NO AGREEMENTS in force with the website owners out in the Internet.
Although this is a 'nice' feature for those who just use it, it should be carefully examined to see where the data is coming from.
Caching and serving directly data from the internet will reduce traffic to the original sites, and updates will be delayed in getting to the prospective viewers. Reduced "hits" will cause many sites to be shut down.
As an Ancestry spokesperson stated, the intent originally was to cache the data, because it is VALUABLE, and make it available through ANCESTRY, even if the website owner decides to take their site down.
In other words, Ancestry was going to BY INTENT distribute the creation of others without consent despite the wishes of the authors.
I think THAT is a significant difference between the Google case and this one.
Posted by: Jeff Scism | September 02, 2007 at 12:35 PM
" '1) Ancestry -- is in the Family History Business - (Google is not)'
Perhaps correct at the present time, but what difference does that make regarding copyright law?"
It makes a big difference in the legal determination of whether a use is "fair use" or not. The copyright holder published the information in order to convey genealogical information. Google's cache is not intended as a source of genealogical data; it serves various other uses, and this makes it "transformative" and therefore "fair use". Ancestry's collection, on the other hand, IS intended to convey genealogical information, which means it isn't "transformative" and doesn't qualify as "fair use".
Try to imagine NBC capturing the programming broadcast for free by CBS and creating a "cache" of CBS's content on NBC's web site, complete with a "search engine" that makes it easier to find CBS's content and ensures that the content will still be available if CBS doesn't rebroadcast it. I think NBC would have a hard time convincing a court that this was "fair use".
Posted by: Lindsay | September 02, 2007 at 01:22 PM
Lindsay, as far as I know, Ancestry.com did not cache any webpages from any other online genealogy database business, so your argument is moot. While Google may or may not cache content from, say, Yahoo, it certainly does cache content from people who, for example, write about search engine optimization. What I am saying is that the subject matter of a person's website probably would not matter in looking at whether caching of the website was permitted. While many of us have an avocation in genealogy, and some of us are in fact professional genealogists, very few of us operate businesses in the exact same niche as Ancestry.com.
As I view it, Google's cache is intended to make the information cached findable even if the website from which the information was crawled goes away. Ancestry's cache was intended for exactly the same purpose.
Posted by: Amanuensis | September 02, 2007 at 03:09 PM
Jeff wrote, "Caching and serving directly data from the internet will reduce traffic to the original sites, and updates will be delayed in getting to the prospective viewers."
We have no way of knowing whether updates would have been delayed in getting to the prospective viewers, since the database was not up long enough to know how frequently any particular site was going to be re-crawled. As for traffic being reduced to the original site, I think for many people that would be a good thing -- reduced bandwidth needs would result in lower ISP charges, which would result in less need for fundraising, which would result in more time for that person or society to spend in adding new information to their website.
Posted by: Amanuensis | September 02, 2007 at 03:15 PM
METATAGS - NO CACHE, NO ARCHIVE, ETC. SAMPLE TAGS PAGE UPLOADED AND LINKED TO SIGNATURE ON THIS POST
Dave and Judy,
Dave, you don't have to replace anything. I specified the Google bot because it's the search engine I target the most. You can add the other bot if you want, but it isn't required. You couldn't possibly list each one.
For those who didn't read the earlier post:
I posted metatags for preventing (some) page caching, (some) archiving and a "break out of frames" small script for any of you to use if you want to. I use these on my own pages. Dick also linked to excellent tutorials on robots.txt, which I also use in the root of my server.
My example page is at
www(dot)websitewiz.com(forward slash)metaTags(dot)htm
You can also click on "Patsy" in this post to go to the page.
Posted by: patsy | September 02, 2007 at 04:08 PM
URL is incorrect in my 4:08 PM post.
Bad, bad fingers.
Correct URL is:
www(dot)websitewiz(dot)com(forward slash)metaTag(dot)htm
Sorry for the inconvenience.
Posted by: patsy | September 02, 2007 at 04:12 PM
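For readers who cannot reach the sample page, here is a rough Python sketch of how "no archive" meta tags work and why the bot name in the tag matters. The `noarchive` directive itself is a real one that Google documents. The helper names (`noarchive_tag`, `may_cache`) are hypothetical, the tags on Patsy's page may differ, and whether MyFamilybot honored any meta tag at all is an assumption.

```python
from html.parser import HTMLParser


def noarchive_tag(bot: str = "robots") -> str:
    """Build a meta tag asking the named crawler not to keep a cached copy."""
    return f'<meta name="{bot}" content="noarchive">'


class MetaCollector(HTMLParser):
    """Collect the comma-separated content of every named meta tag."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name:
            values = {v.strip() for v in content.split(",") if v.strip()}
            self.directives.setdefault(name, set()).update(values)


def may_cache(html: str, bot: str) -> bool:
    """True unless a generic or bot-specific tag on the page says noarchive."""
    collector = MetaCollector()
    collector.feed(html)
    generic = collector.directives.get("robots", set())
    specific = collector.directives.get(bot.lower(), set())
    return "noarchive" not in (generic | specific)


google_only = "<html><head>" + noarchive_tag("googlebot") + "</head></html>"
everyone = "<html><head>" + noarchive_tag() + "</head></html>"

print(may_cache(google_only, "Googlebot"))    # False: this tag names Googlebot
print(may_cache(google_only, "MyFamilybot"))  # True: the tag says nothing to other bots
print(may_cache(everyone, "MyFamilybot"))     # False: the generic tag addresses all bots
```

A tag addressed to one bot says nothing to the others, while the generic `robots` name addresses every crawler that chooses to read it, so there is no need to list each bot by name.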
Thank you Patsy,
Because of my admitted and obvious ignorance of caching, metatags, linking, frames, internet copyrights, terms of service, etc, I have removed my once freebie content at rootsweb. I did leave contact info so researchers can find me for a free lookup. For now, until I learn more I've pulled it off. I am seriously thinking of giving it to familysearch.
I do have one question. Are the webpage copyrights we post simply a waste of time? Dick, does your copyright reflect your current feelings? The one I have is very similar, and others whose info was cached by Ancestry had copyright notices as well. Based on what I've seen in the past few days they are a waste of cyber ink.
Posted by: Judy | September 02, 2007 at 04:48 PM
I'm not really concerned with Ancestry's "intent". Say that I plan to build a website on the Williams surname and intend to have all information about everyone who ever lived with the last name of Williams on my website. So, I visited Ancestry.com in the free and paid portions and copied all the information I could find on everyone named Williams - and posted it on my website - so that all the information available about the name Williams is right there in one place. Should Ancestry go out of business later on, I have preserved that information - in its original form - posted right there on my pages. My intent may be good, but my actions would be totally wrong.
"Fair Use" normally refers to using a small portion of the overall content, and doesn't apply when entire sites are cached.
Ancestry cached and re-served copyrighted pages, first without even a link to the live site - then with a link buried under the "view the cached version of the site." They violated not only my copyright and the copyright of other genealogy webmasters and bloggers, but also the copyrights of folks like Biography.com and the Boston Herald. Hundreds of their pages were in the cache.
Someone asked that as long as the researcher found the information, does it really matter where they found it.
You bet. If I'm going to spend hours researching and transcribing and adding material on line - then they can search Google and go to my site to find it.
And Amanuensis, if I wanted less traffic to my website, I'd just take the site down and spend my free time playing golf.
Posted by: Chris Martin | September 02, 2007 at 05:37 PM
>>>If they don't want their websites cached, all they need to do is say so in a robots.txt file. As several people, including Chris Dunham (The Genealogue) have noted, Ancestry.com itself has, for over a year, been notifying people that they were crawling websites and explaining how to opt out via the robots.txt file.<<<<
Adding a robots.txt file only helps before the robot visits the site.
Chris Dunham pointed out that My Family robots had visited sites.
www.genealogue.com/2006/10/ancestrycom-has-biography-bot.html
and provided a link to a page on ancestry that explained the MyFamily bot in minimal detail.
He didn't say that "ancestry notified people that they were crawling websites" and I doubt that there's anyone that ancestry notified.
Posted by: Chris Martin | September 02, 2007 at 05:56 PM
"Lindsay, as far as I know, Ancestry.com did not cache any webpages from any other online genealogy database business, so your argument is moot. While Google may or may not cache content from, say, Yahoo, it certainly does cache content from people who, for example, write about search engine optimization. What I am saying is that the subject matter of a person's website probably would not matter in looking at whether caching of the website was permitted."
What I don't understand is why you are saying this when the judge who wrote the findings we are discussing said the exact opposite - i.e. that the intent of the copyright holder in publishing, and the intent of the service "caching" the files, is important in determining whether the use is "transformative" and therefore fair use. The intent of the copyright holders was to publish information of interest to genealogists; the intent of Ancestry in publishing the same material was the same, to provide that same information to genealogists. I encourage you to read the judge's findings; they are actually quite readable and very interesting and enlightening. I think anyone who carefully reads the judgment will come to similar conclusions as I have. The judge goes into considerable detail in listing the reasons that Google's caching functionality is transformative, and the analysis he applies would not favor Ancestry. If someone has come to different conclusions, based on the judge's findings, I would be genuinely interested in hearing their analysis.
Posted by: Lindsay | September 02, 2007 at 06:11 PM
---> Dick does your copyright reflect your current feelings.
Yes. Search engines, archive.org and others are always free to index and cache all publicly-available web pages on any of the web sites I own, including this one. In addition, anyone is free to republish the publicly-available articles here for non-commercial use, both online or in print. In addition, as I state in my copyright statement, even commercial web sites are free to republish the articles here, if they ask. I usually say "yes."
I wrote an article about that not too long ago at http://blog.eogn.com/eastmans_online_genealogy/2007/04/free_content_fo.html
Ancestry.com's latest service did index and cache the pages from this web site. That was the first thing I checked. I was happy to see the articles listed on the Internet Biographical Collection as I need all the publicity I can get.
I did have a problem with their original implementation that hid URLs but Ancestry.com changed that a day or two after the product launched so it quickly became a moot point. There's nothing unusual there with any new service: web site owners often adjust their new software based on user feedback. Ancestry.com did the right thing: they listened to their customers and changed it quickly, thereby eliminating that problem.
If you own a web site, please do me a favor: re-publish some of my articles. I'd love the publicity. For details, read my copyright statement (that hasn't changed in years) at http://blog.eogn.com/eastmans_online_genealogy/2005/01/copyrights_and_.html
Thanks for asking!
- Dick Eastman
Posted by: Dick Eastman | September 02, 2007 at 07:25 PM
> "Fair Use" normally refers to using a small portion of the overall content, and doesn't apply when entire sites are cached.
Before I read the judge's findings I would have agreed with you; however, the judge carefully explains that if the "new work" requires the complete work in order to accomplish its goal then copying the complete work can still be fair use, as in the case of Google's cache, which also copies complete pages and indeed complete sites. If one stops to think about the very term "fair use" and the intent of copyright law itself, one can see that fairness depends on several circumstances. If you have two web sites competing for the same audience and one site lifts an article off its competitor's site and puts it onto its own site, that is pretty clearly unfair and is pretty much exactly what copyright law is intended to prevent. If, on the other hand, a web site which provides a totally different service takes that same page and uses it for a purpose unrelated to the purpose of the originating web site, and in so doing delivers a benefit to society without harming the originating site, then that may well be considered fair use. I would argue that what Ancestry did fell closer to the first end of the spectrum, while the judge argued that Google's caching falls to the latter end of the spectrum.
If you want to disagree with a federal judge that's fine, but the onus is then on you to point out his error. I don't need to disagree with the judge to come to my conclusion that Ancestry was in the wrong both ethically and most likely legally.
Posted by: Lindsay | September 02, 2007 at 08:32 PM
Hi Lindsay,
I copyright my web site and the documents on it, and I think that's worth doing. The "facts" can't be copyrighted, as they are from public records, but the way I put information together and my narratives are my own, and I can copyright those. The problem is that so many people just cut and paste, never source me, and ignore all copyright information. It's very expensive to hire a lawyer to sue someone for copyright violations.
I am happy for people to copy my material into theirs so long as they indicate me as the source. I have spent so much time on the "narratives," and it is most upsetting to find those in people's family trees on Ancestry.com (and WFT), with no source cited. I know it's "my" narrative because it even includes my typos :) I am always ready to share with a distant cousin, but I never send anything on my family later than 1900 to anyone. I won't send GEDCOMs at all.
Basically, if you care if people copy it, you shouldn't put it on the Internet.
Posted by: Patsy | September 02, 2007 at 11:07 PM
Shoot, the last post should have been addressed to "Hi Judy," since she asked me a question.
Sorry.
Posted by: Patsy | September 02, 2007 at 11:09 PM
David Samuelsen asked about using metatags with Microsoft, etc. Metatags are platform independent and so is the "break frames" script.
Posted by: Patsy | September 02, 2007 at 11:13 PM
I believed we had a standing agreement, that they have apparently unilaterally changed...
from 15 August 2000
Dear Blacksheep Society:
By now, news of MyFamily.com's acquisition of RootsWeb has made its way through the genealogical community. We are thrilled with the overwhelming positive feedback we have received.
I would like to take this opportunity to assure the Blacksheep Society of MyFamily.com's desire to continue our support of your project.
And in response to some of your questions, we will make the following pledges to the Blacksheep Society:
* MyFamily will continue to support The Blacksheep Society with free, unlimited server space.
* MyFamily will not place advertising banners on The Blacksheep Society pages without the express permission of the page creator.
* MyFamily will never charge for access to any Vanity Domain data, nor will we use it in any way other than its current use, without the express permission of the contributor and/or the Project.
* MyFamily will continue to strive towards open communication with The Blacksheep Society.
On a related subject, I want to reassure you that all of the current RootsWeb features, including mailing lists, message boards, user databases, and WorldConnect GEDCOMs, will continue to be FREE. And while we will no longer solicit financial contributions, we will continue to honor all commitments to current contributors for tagline-free mailing lists, banner-free home pages, use of Personalized Mailing Lists (PML), vanity domains, etc.
We appreciate your on-going support of RootsWeb, and look forward to a long relationship.
Sincerely,
Curt Allen
Chairman of the Board MyFamily.com, Inc.
Posted by: Jeff Scism | September 03, 2007 at 12:27 AM
For one thing the legal case cited about Google is far different because Google does not cache files for the same reason stated by ANNA FECHTER, in charge of the "Internet Biographical Collection." She said they were copying the files and that if an owner decided to take down the files, they would still have a copy of it to present. Google does not do that.
Please note that the name "Internet Biographical Collection" is a misnomer; FAR more than biographies and genealogies were presented there. ALL of my Civil War work, in my own domain, including rosters and records of the state Adjutant General, were taken.
And according to the USGenWeb National Coordinator, our files stored on Rootsweb were not supposed to be copied/cached by Ancestry.com. We were invited to store our files there before they bought Rootsweb, and when the issue has been raised about "Fair Use" we were repeatedly told that our files were ours and they would not archive them. - They are NOT part of the "Internet Biographical Collection" I might add and they are STILL on Ancestry.com as I write this -- at least mine are.
But the worst thing is that NOTHING is safe from Ancestry.com because they took everything from the Internet that serves their purposes. Hundreds of thousands of my files were taken that were stored in my own domain space.
Posted by: Linda | September 03, 2007 at 10:05 AM
"Basically, if you care if people copy it, you shouldn't put it on the Internet."
This is the very heart of the intent of copyright law, or at least its inverse is! The purpose of copyright is to allow people to share their works, to the benefit of society, without fear that their works will be appropriated by others. Ancestry's actions were a flagrant violation of the intent of copyright law and more than likely the letter of the law, and society has been harmed by the production of the above sentiment. I quite suspect that Ancestry's lawyers are asking Ancestry "what were you thinking!?!".
Another way to look at the distinction I was drawing earlier about intent and purpose is to simply divide people and corporations into "content providers" and "non-content providers". The law has long recognized "common carriers" and other classifications that recognize this basic distinction. The safe harbor provisions that were covered in the findings and have been mentioned here, for example, recognize that ISPs and communication companies are not content providers; they only reproduce copies of works in a mechanical (non-volitional) way. If a mail server couldn't make a copy of an email containing copyrighted material it couldn't deliver that email. Photocopy manufacturers are not responsible for illegal copying. Phone companies are not responsible for frauds carried out over their telephone lines. Under safe harbor provisions an ISP is not responsible for copyrighted materials placed on its servers by its customers as long as it wasn't volitionally involved and as long as it has proper procedures in place to remove such materials when notified. A key distinction here is that none of these entities are content providers that potentially compete with the copyright holder as a source of the copyrighted material. The judge held that Google was not acting primarily as a content provider and took several steps to ensure that it impacted the copyright holder to the minimum extent possible while providing search capabilities that are of benefit to society.
Ancestry on the other hand is a content provider, and it failed to take several of the steps that Google did, and made the probably legally fatal mistake of labeling the source of that data as itself, so it is in a much much weaker position to claim fair use.
The furor seems to have died down significantly. If Jeff Scism's description of Family Tree Maker functionality is correct, it will be interesting to see whether genealogists will demand an end to that functionality as well.
I think a really interesting question is whether there is a way that Ancestry could have done this that wouldn't have caused all the problems that they did by doing it this way.
Posted by: Lindsay | September 03, 2007 at 12:45 PM
I really, really hate to sound stupid, but would someone please explain caching to me?
Posted by: Loreen Wells | September 03, 2007 at 12:50 PM
I have to say that I am somewhat dubious that the concept of "de minimis" could be used to say that the Internet Collection is a tiny fraction of Ancestry's content so it is ok for them to copy someone else's work in its entirety. As outlined on Wikipedia, de minimis applies in respect to the size of the copied work. If you make very small changes to the copied work you are not entitled to claim copyright on your modified version for example. Or if you take a very small portion of a copyrighted work. It will be a very sad day when a large content provider can copy entire works as they please and successfully defend themselves by saying that the works represented an insignificant portion of their content.
Posted by: Lindsay | September 03, 2007 at 01:05 PM
Caching:
When you view a web page, it is cached in your web browser for a length of time (set up in your browser options). So, if you go back to a page during that time, for the most part, you see the page that is cached. It saves bandwidth.
Be sure to delete your temporary internet files regularly (they are cached on your computer). You'd be surprised how much space they can take up. It can slow down your machine significantly. For IE: Tools>Internet Options>Temporary Internet Files>Delete Pages.
Caching by Google, etc. Basically the same thing. They are saving a page, "cached" at a particular time. It loads faster.
Downside: you aren't necessarily seeing the current page.
AOL is particularly bad about caching pages (using the AOL browser, of course). AOL also re-optimizes images (makes them smaller), and thus is widely scorned in the web development community. Ruining my images also saves bandwidth.
Quoting from marketingtermsDOTcom - their definition of cache in their dictionary:
"Definition
"The storage of Web files for later re-use at a point more quickly accessed by the end user.
"Information
"Caching can happen at many places, including proxies (i.e. the user's ISP) and the user's local machine. The objective is to make efficient use of resources and speed the delivery of content to the end user.
"While caching can have a positive impact on the user's experience, it can have a negative impact for site publishers, resulting in undercounts of page views and ad impressions. In response to this problem, sites have implemented various cache-busting techniques to better ensure that all performance statistics are accurately measured."
Posted by: Patsy | September 03, 2007 at 05:59 PM
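The caching behavior described above, including the downside of seeing an out-of-date page, can be illustrated with a small Python sketch. The `PageCache` class, its `max_age` setting, the fake clock, and the example URL are all hypothetical; real browsers and search engines are far more elaborate.

```python
from typing import Callable, Dict, Tuple


class PageCache:
    """Keep a copy of each fetched page and reuse it until it is max_age old."""

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float]):
        self._fetch = fetch      # function that retrieves the live page
        self._max_age = max_age  # seconds a stored copy is considered fresh
        self._clock = clock      # injected so the example needs no real waiting
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self._max_age:
            return hit[1]        # serve the stored copy; the live site is not contacted
        body = self._fetch(url)  # copy is missing or too old: go to the live site
        self._store[url] = (now, body)
        return body


# A pretend web site, a log of visits to it, and a clock we can move by hand.
URL = "http://example.com/tree.html"
site = {URL: "version 1"}
fetches = []
now = [0.0]


def fetch(url: str) -> str:
    fetches.append(url)
    return site[url]


cache = PageCache(fetch, max_age=60.0, clock=lambda: now[0])

first = cache.get(URL)                  # fetched from the live site
site[URL] = "version 2 (typo corrected)"
now[0] = 30.0
stale = cache.get(URL)                  # still "version 1": the correction is not seen
now[0] = 90.0
fresh = cache.get(URL)                  # copy expired, so the corrected page is fetched
print(first, "|", stale, "|", fresh, "|", len(fetches), "visits to the live site")
```

The middle request is the trade-off in miniature: the reader gets a fast answer and the site owner saves bandwidth, but the owner's correction goes unseen and the visit never registers on the live site.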
You know, I think the biggest things that were the problem were:
-the frame around the page, the marking of it property of Ancestry.com
-the fact that they were reselling the data, unlike Google
-the fact that they copyrighted information that is available free elsewhere without notification of the copyright change.
AND
-THAT THEY DIDN'T INSTRUCT YOU ON A WAY OUT OF THEIR PROGRAM. Google's cache can be easily interrupted with a simple text file that they can show you. Ancestry's program would have required us to figure out the pertinent details so that WE, THE USERS, had to figure out how to opt OUT of an unnecessary, badly written database.
I'm totally disgusted. I hope Ancestry dies a slow death and gets bought out by a better run company.
Posted by: Concetta | September 03, 2007 at 07:18 PM
It comes down to this with me--I create and publish my family history because I want to--I also want total control over the content with my name on it, therefore I pay for webspace. We all make mistakes especially in this family history game and if you have bad information out there and no control over it, you can't correct it. Paying for my own webspace allows me to correct it but if you cache my page and it has mistakes you have taken the control away from me.
Posted by: Donna | September 04, 2007 at 07:44 AM
Let's turn this into a positive. Online services provide access in a few minutes to materials that previously took months or years to accumulate. I for one am grateful for that.
And since Ancestry (or their parent company) is a for profit entity, they do deserve to make a profit. So let's help them by showing them a simple solution out of this mess.
I suggest that all online services that provide message boards and databases to authors for free add tagline copyrights in the way of simple checkoff boxes. And in the meantime, if they don't provide this service, simply add your own. Example:
__ 1. This material or message is hereby submitted to the public domain, (name & location & date).
__ 2. (c) copyright 2007 by (name, location), all rights reserved.
__ 3. (c) copyright 2007 by (name, location). Limited rights granted.
__ Unlimited personal use allowed, and reprints with proper sourcing (provide example).
__ Commercial or access charges may not be assessed without prior permission.
Let's help them expand on this idea, and by the way, I do grant unlimited reprinting of this idea to the public.
Posted by: Mary H-S | September 04, 2007 at 09:00 AM
I guess what this all boils down to is what is legal and what is ethical.
You know, just because you can legally do something doesn't make it ethical.
Ancestry may or may not have been within their legal rights, but it is fairly evident that their idea of ethics varies from what seems to be the majority of the genealogical community.
They've put a very large blot on their reputation that will take them quite a while to erase.
Posted by: Dino (All Dino, All the Time) | September 04, 2007 at 11:48 AM
I agree with Dino. There's a world of difference in what is legal and what may be ethical, or simply the right thing to do. It sounds as if the company needs a better understanding of the genealogy community, their customer base. There are accepted standards and ethics in most any area of study.
I simply don't understand the need to go all over the Internet caching others' work and framing the cached content onto another site in their "collection." There are several issues involved here, one possibly being that some other sites receive revenue from advertising and those clicks bringing in revenue could possibly be diverted from others' work onto their company site.
With that said, it would seem that a company requiring paid membership would devote more time to obtaining and producing digital copies of source records that have never been published, rather than caching information readily available free to anyone through a basic search on most any free search engine, and that their customers would expect the same.
Posted by: Bob | September 08, 2007 at 07:20 PM
Great job on the legal background info, Dick! While I feel that the way the IBD was first launched varied quite a bit from the Google vs. Field case, I was also very impressed with how rapidly The Generations Network responded to criticisms and made adjustments to the database - making links to the live Web site as prominent as the cached pages and moving the database to their free offerings, for example. They responded extremely well to a harsh backlash from the genealogy community and I was sorry to see the database come down. I think the biggest question regarding caching of content will ultimately lie with the "intent." Ancestry.com's original intent was to "preserve" the content in case the Web sites were moved or removed. A noble intent to be sure. But legal? I don't know. It's a little different than the Google vs. Field case.
You said in your post that "I also noticed that nobody cited legal precedent, at least not with a case title and source citation." I just wanted to say that while I may not have *formally* cited legal precedent, I did cite the title of the Google vs. Field case, and linked to a news article on the ruling in my first blog post on the subject (the morning of August 28) "Cache 22 - Has Ancestry.com Gone too Far?," including a quote from the ruling of Judge Robert C. Jones. I also discussed the legal issues further with more in-depth links regarding the Google vs. Field case, as well as the "safe harbor" section of the Digital Millennium Copyright Act in my second post the morning of August 29 - "The Legality of Caching." While I did link to previous legal precedent, however, I am not a lawyer (as you so aptly noted), and the opinions expressed were my own (as I said in the blog posts).
Posted by: Kimberly Powell | September 09, 2007 at 09:39 AM
Dick,
Not surprisingly, you have many very astute commenters. This has been a worthwhile dialog. My legal contribution is in four parts, starting at http://geneablogie.blogspot.com/2007/09/did-ancestry-violate-copyright-law.html
Posted by: Craig Manson | September 09, 2007 at 08:01 PM
Amanuensis's "signature" links to http://www.apgen.org/directory/search_detail.html?mbr_id=820, the entry for a professional genealogist named Chad Milliner.
I googled "chad milliner." The first result is a listing for Chad Milliner on zoominfo. It says that Chad Milliner is a Content Specialist for THE GENERATIONS NETWORK, aka MYFAMILY.COM aka ANCESTRY.COM!
Posted by: Joy Rich | September 19, 2007 at 03:53 AM