September 09, 2007

Rules of Posting Genealogy Information Online

The past few weeks have been fascinating. We saw Ancestry.com deliver a search engine that focused primarily on genealogy resources. The service was designed to simplify the process of finding family history information that many people would not be able to find easily because it is often scattered among numerous websites across the Internet. However, the service quickly became controversial as genealogists discovered various features that some felt were inappropriate, including caching of web sites and the use of HTML frames that hid the origin of a page obtained from another web site. Ancestry pulled the service after only a few days.

I watched the various comments fly back and forth with somewhat mixed emotions. I agreed entirely with some of the messages posted and disagreed with others. More than a few surprised me. In numerous cases, I thought to myself, "That's the way everyone does it. Not to say that it is right or wrong, only that it is common practice."

While Ancestry.com caught the heat recently, everything I will write about in the rest of this article applies to all search engines, all message boards, all online genealogy databases and to all publicly-visible web pages. It applies to the now-defunct Ancestry.com Internet Biographical Collection as well as to Rootsweb, OneGreatFamily.com, FamilySearch, Google, Yahoo, Dogpile, Alexa, HotBot, AOL Search, Lycos, and the genealogy-specific search engine at WeRelate.org. These are ideas I would like to share concerning what to do and what not to do when you place information on the World Wide Web.

While I call these "rules," they are really suggestions. These "rules" are just a start. I suspect you can think of additional "rules." If you can add more, please post your suggestion(s) in the comments section below.

OK, here are "Eastman's Rules of Posting Genealogy Information Online," a new set of rules invented today:

  1. If you don't want everyone to know about something and use that something as they wish, don't post it online! There are no secrets after you post information online. You can claim copyrights or legal protection, but the fact remains that information placed on the web quickly becomes common knowledge. You may be correct in thinking that nobody else should ever reuse your information, but not everyone will agree with you. Regardless of your intentions, some people will re-use your data elsewhere. Getting the data removed later will be difficult and frustrating. Think before you post!
  2. Keep in mind that all search engines will index your site (unless you take steps to do otherwise as listed in Note #1 below), and most of them will cache the information. One web site (www.archive.org, which is not a true search engine) will cache your data more or less forever, even if you later change or remove your data.
  3. A few specialty search sites will charge their subscribers a fee to search your site and millions of others. General-purpose search engines, such as Google, are usually free to the user. Specialty search engines that look only for financial data, legal data, real estate transactions, sports scores, etc. typically charge a fee. The more specialized the search engine, the higher the fee. Some charge very high prices. You and I don't hear much about the fee-based search engines, but they exist, nonetheless.
  4. Facts are not copyrighted, at least not under U.S. law. If your web page contains only names and dates and locations of life events (birth, marriage, death, census entries, military service, etc.), you do not own that information. It is public domain.

If your page(s) contains additional descriptive information, interpretations, stories, or other information that you wrote, the original information you added might be copyrighted. However, the dividing line between copyrighted information and public domain information is often fuzzy. Even legal experts who specialize in intellectual property issues often disagree with each other. You should realize that not everyone is going to agree with your interpretation of the legal issues involved.

Actually, all of this is probably a moot point anyway. Whether legal or not, it is very difficult to force someone to remove copies of information you supplied.

  5. Never assume. You may have strong opinions concerning what is right or wrong, but not everyone will agree with you. Ask yourself, "What will happen if I place this information online?" Be realistic!

The above are a few of my thoughts. Again, if you have further suggestions for additional "rules," please post your thoughts in the comments section at the end of this article.

Note #1:

If you do want to place genealogy information (or any information) on the World Wide Web and do not want your information to be found by search engines, there is a simple way to do so: create a ROBOTS.TXT file and place it in the top-level directory of your web site. Thousands of web sites do this already when they don't want certain information to become too public. There are many web sites that will explain ROBOTS.TXT and tell you how to add such a file to your site. Start here: http://www.google.com/search?source=ig&hl=en&q=create+robots.txt+file&btnG=Google+Search. Once you add a ROBOTS.TXT file to your web site, your information should disappear within a few months from the search engines that honor the file (the major ones do). However, don't be surprised if nobody visits your site anymore. It will be rather well hidden.
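
As a rough illustration of the idea above, here is a minimal sketch of a ROBOTS.TXT file that asks every crawler to stay away from the whole site, checked with the robots.txt parser in Python's standard library. The example.com address and the helper name `allowed` are made up for this example, and the file only works for crawlers that choose to obey it.

```python
from urllib.robotparser import RobotFileParser

# Contents of a robots.txt file that asks ALL crawlers to skip the whole site.
# "User-agent: *" matches every crawler; "Disallow: /" covers every page.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page address, used only for illustration.
PAGE = "http://www.example.com/family/smith.html"

print(allowed(BLOCK_ALL, "Googlebot", PAGE))
```

With this file in place, the check should report that the page is off limits to any crawler that respects the file.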

If you are willing to have some search engines index and cache your site but do not want all search engines to do so, you can be selective. Again, the solution is a ROBOTS.TXT file. You can exclude specific search engines by name. The format of the commands is a bit tricky, so study the instructions carefully. Start here: http://www.google.com/search?source=ig&hl=en&q=create+robots.txt+file&btnG=Google+Search.
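
The selective approach can be sketched in the same way. In the file below, "ExampleBot" is a made-up crawler name standing in for whichever search engine you want to exclude; real crawler names must be looked up in each search engine's own documentation. The example.com address and the helper name `allowed` are likewise assumptions for this illustration.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt file that shuts out one named crawler and admits all others.
# "ExampleBot" is a hypothetical name used only for this sketch.
SELECTIVE = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page address, used only for illustration.
PAGE = "http://www.example.com/family/smith.html"

for bot in ("ExampleBot", "Googlebot"):
    print(bot, allowed(SELECTIVE, bot, PAGE))
```

An empty "Disallow:" line means "nothing is off limits," so every crawler except the one named in the first group is still welcome.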

Note #2:

You should realize that search engines are not perfect. Even the specialty search engines designed for a specific purpose will erroneously add some extraneous data. The search engine's filters may interpret words differently than a human would. For example, a financial services search engine might add your genealogy data to its search engine if your ancestor was named James Penney or Ezekiel Dollarhide. Likewise, a genealogy-specific search engine may add a page that describes the "roots of New Orleans jazz," and a real estate search engine may add information about "the history of the House family."

Comments


You could have posted your rules without devaluing yourself by posting pro-Ancestry advertorials.
How many people have pointed out to you now it wasn't a search engine but a collection!?


> "If you don't want everyone to know about something and use that something as they wish, don't post it online!"

Duh. Dick, the whole discussion was about us sharing our information freely with our own visitors, and then Ancestry.com coming along to steal our content and resell it as if it were their own collection. Did you miss that point altogether?
And just what is your point here? Is the Dick Eastman rule of copyright that you have no copyright if you post on a website? Or that letting everyone copy your website works fine for Dick Eastman and therefore should be good enough for everyone?


"Facts are not copyrighted, at least not under U.S. law. If your web page contains only names and dates and locations of life events (birth, marriage, death, census entries, military service, etc.), you do not own that information. It is public domain."

But Ancestry.com is all facts....
So under U.S. Law nothing on Ancestry.com is copyrighted?
We all need just one subscription and can repost all of it anywhere we like for those who don't wanna pay their outrageous fees?

You said: "If your page(s) contains additional descriptive information, interpretations, stories, or other information that you wrote, the original information you added might be copyrighted." Might? In what cases would original stories, descriptive information and such not be copyrightable by you as the creator and the author of the original work?

And as far as sharing information online and posting such, there's a big difference in creating websites and posting such online and enjoying visitors coming to YOUR site and interacting with YOU, rather than some big company caching your work and placing it on their site in frames and calling it THEIR "collection."

To me, your comments on the issue are quite perplexing.

> To me, your comments on the issue are quite perplexing.

In one of the other threads, Dick proudly claims to be friends with Ancestry.com management.
I think that's your answer. That's why all his posts on this matter have a heavy pro-Ancestry slant.

And yet in other comments about past articles, I have been accused of having an anti-Ancestry.com bias. The mix of comments tells me that the articles are about balanced. (smile) Indeed, I like some of the things Ancestry.com has done and dislike some of the other things the company has done.

- Dick Eastman

---> In what cases would original stories, descriptive information and such not be copyrightable by you as the creator and the author of the original work?

According to intellectual property experts I have talked with, there is much disagreement about this even amongst the experts. Many of them will claim that "minor additions" will not hold up in court as falling under copyright protection. The phrase "minor additions" does not seem to be defined very well anyplace I have looked.

I know I would not want to be involved in litigation on either side of this issue!

- Dick Eastman

I am no expert on meta tags, but here is a simple meta tag that will stop caching that you can add to your web pages: http://www.i18nguy.com/markup/metatags.html. If you don't tell the spiders not to do something, they are going to do it.

I think if you do some research you can find a script that will limit certain companies from using your info.

How can we expect Google, Ancestry or any search engine to know what we want them to do with our site without telling them?

Dick,

Not all cached pages were in questionable territory. There were in excess of 2000 pages cached from the USGenNet server. We are a 501(c)(3) nonprofit web hosting server and have published a Conditions of Use Policy for more than 7 years which is applicable to all pages located on our server. Here's a pertinent excerpt of that policy:

"You may not use USGenNet’s services or resources, including any websites, mailing lists, message boards or other Communication Networks, to publish, post, transmit, distribute or disseminate the proprietary information of any others, including trademarks or copyrighted information, without express authorization of the rights holder. You may view, download, and print material from this site only for personal, noncommercial use."

---> So under U.S. Law nothing on Ancestry.com is copyrighted?

No, no, no, no. I never wrote that or implied that. Do not focus on one statement and then try to apply it to an entire site.

Again, "Facts are not copyrighted, at least not under U.S. law." Every first-year law student knows that. However, Ancestry.com and other sites have facts and a lot, lot more.

For further reading, start with the laws involving COMPILATION copyrights. Compilation copyrights are something you rarely see discussed on genealogy message boards, but they are critical when discussing large compilations of information.

Next, you must read the licenses of each online site in question. Ancestry.com's statements are available at http://www.ancestry.com/legal/Terms.aspx

Just because you might have a right under copyrights doesn't mean that you can ignore licenses. Your rights always are dependent on both the laws and the applicable licenses. Never focus on one specific statement while ignoring all the other factors involved.

That will be the end of my comments on the legal issues as in the article I only repeated things that are "common knowledge" and have been proven in court time and time again. For more detailed "what if" scenarios, please contact a legal professional who specializes in intellectual property issues. He or she can answer your questions far better than I can.

- Dick Eastman

There is a big difference between original stories and "minor additions" to facts though. An original narrative, story, article, poem, work of art, song lyrics or whatever is just that -- original and I would suspect is fully copyrightable and such is being copyrighted each and every day. Did this nifty search distinguish between such original works from just mere facts before the same was cached, placed in frames and added to their "collection?"

And I will go back to my original feelings. I simply don't understand why they felt the necessity to scour the Internet caching works freely available to anyone via a good free search engine like Google and make such a "collection." One would expect, and their customers most especially would expect, that they would devote their energies to supplying material to their customer base that is not readily available free online, like digital images of source documents and the like.

One final, final comment: Please re-read what I wrote in the article:

"...the dividing line between copyrighted information and public domain information is often fuzzy. Even legal experts who specialize in intellectual property issues often disagree with each other. You should realize that not everyone is going to agree with your interpretation of the legal issues involved."

Dick,
I appreciate you putting into words what all of us should know or at least suspect about anything we post on the internet. Some people will respect our rights, and some people won't. Your rules or reminders (or warnings) might be helpful for anyone who doesn't think about such things before posting. I hope that you won't react as negatively as some of those who have commented when people use your rules or parts of them or their versions of them on their blogs or when making presentations on this topic to individuals and organizations. It is a positive when we can share good ideas, and, after all, sharing information with others is the beauty and utility of the internet. I won't get into the Ancestry thing because with politics, religion, Microsoft, and, yes, Ancestry there seems to be no middle ground and no reasoned discussion in which we respect the views of others with whom we do not agree. It's just become shouting, posturing, name calling, and repeating the same tiresome stuff.

Good comments, Dick.

For further information, EVERYONE should read ALL of the NGS "Genealogical Standards and Guidelines" (links at the bottom of the middle column) at:

http://www.ngsgenealogy.org/

It includes specific guidelines about sharing information with others and also about posting online.

My comments don't apply to copyright issues, but rather to SOURCING. Anyone can copy my genealogy research, but they should at least footnote it, right? I don't appreciate all of the people who have my original narrative stories (including my typos) uploaded to Ancestry.com and WFT without any source attribution. They are copying material written 20+ years ago, back when we shared hard copies of information. No one asked my permission to upload it in his GEDCOMs or WFT. It's just plain unethical, but, charitably, these people probably don't know any better.

All of us need to encourage each other to constantly educate ourselves!

As usual, the point of the massive objections to the IBC was overlooked.
It isn't the fact that data was posted and is not copyrightable, or in the public domain, or anything else.
No one objects to a search engine. This Collection was not a search engine result, other than in searching the index or their use of a search engine to find the locations. It was a result of years of grabbing from the net, framing within ancestry and displaying content as if it were their own content.

Obviously, the powers that be within TGN re-thought, consulted and otherwise mulled over their presentation. Had they thought it ethical, legal and otherwise just fine and dandy to do this, in this fashion (probably with some legal advice too), they would have left it up.
I don't believe for one minute that it won't re-appear in some other fashion designed to soften the impact.
As far as copyright goes, yes, certain facts that are in the public domain are not copyrightable, but a display of data from family bibles is not public domain. This data is not available anywhere other than the original source and does not become public domain upon display, for just one example.
Displaying entire cached pages as your own was and is a violation.

So are you saying it's okay for us to put the data Ancestry is putting online on our sites since copyright laws are fuzzy?

Dick Eastman pretended to quote:

---> So under U.S. Law nothing on Ancestry.com is copyrighted?

Disingenuous, Dick, to quote only one part and ignore the rest.
The full quote:

> But Ancestry.com is all facts....
> So under U.S. Law nothing on Ancestry.com is copyrighted?
> We all need just one subscription and can repost all of it anywhere we like for those who don't wanna pay their outrageous fees?

Dick scrambles to eat his own words:
> No, no, no, no. I never wrote that or inferred that. Do not focus on one statement and then try to apply it to an entire site.

> For further reading, start reading the laws involving COMPILATION copyrights.
Oh, so you mean Ancestry.com has compilation rights, but I do not? It is ok for Ancestry.com to copy me, but not ok for me to copy them? Huh?
Why do you keep saying we must lie down and take whatever Ancestry.com does, but can't do the same to them?
Why the two different measuring sticks?

> Next, you must read the licenses of each online site in question
Uh... Dick, I don't know what you have been smoking, but whatever Ancestry.com's license tells you, they do not own the copyright on my grandfather's birth certificate.... Their license matters nothing.
No copyright is no copyright. Every first-year law student knows that.
Funny that you would try to contradict that...
It actually IS okay to repost almost every record on Ancestry.com wherever you like, whatever they and you say.

---> It actually IS okay to repost almost every record on Ancestry.com wherever you like, whatever they and you say.

As I wrote earlier, "...the dividing line between copyrighted information and public domain information is often fuzzy. Even legal experts who specialize in intellectual property issues often disagree with each other. You should realize that not everyone is going to agree with your interpretation of the legal issues involved."

Sadly, Sir, some of the info they posted never was on a website where it could be viewed or used, such as my personal PRIVATE unpublished phone number (no one has this number) or my parents' names, not to mention my niece's name and birthdate.
And some of the info they posted was stuff that had been corrected after they stole it. No, this is more than indexing; this is stealing, and not checking the facts or requesting permission to use it or even referring to the original site where corrections can be made.
And I'm sorry but a genealogy search engine doesn't need to go to the illegal sites which most of us wouldn't let our children visit. When I was searching the TGN engine I got waylaid many times.

The main issues in this whole mess are focused on the simple process of asking before you take. Ancestry being the Big Hungry Dog, shoves their way around, grabbing everything and treating individuals as if they were insignificant unpaid volunteer transcriptionists.

There is no "for hire" relationship between the creators of the accumulated (COMPILED) data and The Generations Network.

The closest you can come is the fact that some of the pilfered data was on Ancestry's own servers, in the accounts of people to whom Ancestry's parent corporation donated server space. That doesn't automatically become a quid pro quo of 'donating' the content to Ancestry, no matter what their AUP states.

Ancestry isn't even "compiling"; they are serving the data and content "as is," except for removing the source and identifying information, including the copyright notices.

Copyrights were designed to protect intellectual creative works, and believe me a majority of compiled data takes a lot of creativity to present. That is why it is 'intellectual' property.

It is like a property owner renting out an apartment, and coming by in the middle of the night, and stealing the contents, because they own the building.


In most cases, however, there wasn't even that much implied contact. The thefts were drive-bys, purse snatches, and hit-and-runs.

"If you are willing to have some search engines index and cache your site but do not want all search engines to do so, you can be selective. "

Dick, the problem is not the coding needed to stop caching. Since nobody knew Ancestry had a bot spidering their sites, they could not have known to exclude that bot on their sites.

Ancestry said "we did this in case the site was ever removed".

Ancestry has nothing but facts, and if we used your line of thought, even the Harry Potter books could be used. Intellectual property is protected; original works are protected even if they do contain some facts.

I think the National Genealogical Society "Standards" should be used by all, and if Ancestry wants to do business in the genealogy world, they need to be a good genealogy citizen. I read recently where Ancestry demanded a copy of a census report be removed since they hold the copyright to the image. Do you suppose the person could have said he was caching the image in the event Ancestry goes belly up or removes the records? I don't think Ancestry would buy that reasoning.

There are so many needed records that have never been published, and Ancestry has the manpower to get them online. They don't need to stoop to harvesting.

MAD

Since there are some people who don't understand, I think I may be able to clarify. Ancestry is by no means all facts. Any facts you find on Ancestry are public domain. John Smith born April 1, 1852 in Whatsamata Texas. Married Sarah Rubble Jan 29, 1900 ... these are facts. (made up facts, but facts nonetheless) The scanned image of a census document? Not a fact. Ancestry's presentation of facts in their record format? Not a fact. Layout and design are all copyrightable.

So, yes, use the facts you find at Ancestry at will. (I'm not a lawyer, but I think I am correct here.) Just don't use screen pictures or documents you downloaded from the site. That's where the fuzziness begins. (From what I've read, some websites have been told by Ancestry lawyers to remove census images relating to famous people. They might not go after you if you post documents relating to your own family, but they probably can. Scanning is likely treated the same as photography. Sure, someone else can take a very similar photo/scan...so you are welcome to find the original and photograph/scan it...but others' photographs/scans are still copyrighted.)

Whether or not you can quote a document depends upon several factors. I would refrain from quoting verbatim any non-governmental post-1900 document.

Craig Manson has begun to post a legal analysis of the issues surrounding Ancestry's Internet Biographical Collection on his blog at www.geneablogie.blogspot.com. His first installment was posted on 08 Sep 2007.

Having read Craig's blog for some time, and knowing something about Craig's legal expertise, I expect Craig will provide an objective, well-researched analysis to complement the opinions already expressed on this issue.

Thanks Steve for the link to Craig Manson's legal analysis. Very interesting reading indeed and am looking forward to reading more by him on the subject.

I'm an amateur genealogist, and if someone copies my data that's ok with me (although it would be nice if they gave me credit for the personal add-ons). If I made a mistake, and they copy that too, sorry: that's now their problem, no longer mine. I would hope that anyone who discovered my mistake would let me know (and it has happened). That, of course, would only happen if the source (my site) was cited in the copy.

If, however, I was a professional, and the information I had is/was being sold to a customer, I'd have to be nuts to post it on-line unless the customer said to do it.

Dick
