The newest technology these days in computers is called “cloud computing.” Indeed, we already see several examples of this in today's genealogy software and I am certain we will see even more within the next two or three years.
Cloud computing refers to Internet-based software and databases. The Internet itself is “the cloud.” In almost all drawings of Internet applications, the Internet is shown as a “cloud” into which various computers are connected. The cloud is used as a graphic to represent all sorts of servers, routers, and high-speed connections that are invisible to the user. In short, the user does not need to know where the equipment is located nor what kind of equipment is used. All the user needs to know is how to connect to “the cloud” and access the resources available. “Cloud computing” is simply the next evolution of remote computing.
In order to establish a comparison, let's first look at traditional genealogy software as used in personal computers for the past thirty years or so. Thirty years ago, most genealogy programs for personal computers stored the software and the data on floppy disks or on cassette tapes. The technology has since switched to hard drives and perhaps even to “jump drives,” but the basics remain the same: the software and the data are still stored on disk drives in the local computer.
This method has worked well for millions of genealogists but does have several disadvantages. The biggest issue, in my opinion, is that each person's database remains as a separate, isolated “island” of information. Data that is already on your personal computer may be manually duplicated by a distant relative who lives up the street or across the continent. There is no easy method of comparing data to see if someone else has already solved the mysteries you are dealing with.
Another disadvantage is that each person is responsible for maintaining his or her own data and software installation. When a software vendor releases an update to a genealogy program, every one of that vendor's customers needs to install the update. That's a trivial exercise for anyone who is familiar with computer operations but can be daunting for those with less expertise. Even worse, every computer user is responsible for performing backups of the critical data. Many people simply do not perform backups. We have all heard stories of genealogists who have lost years of research material because of a computer malfunction with no backups available.
Cloud computing solves these issues in a manner that requires little technical expertise. This new technology introduces a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users do not need to have knowledge of, expertise in, or control over the technology infrastructure "in the cloud" that supports them. Software upgrades, backups, and other administrative tasks are handled remotely by systems engineers who possess the proper expertise. All this is performed with little or no intervention by the end user.
Cloud computing may be free or is sometimes offered at moderate cost. It is almost always cheaper than purchasing software, purchasing upgrades, performing backups, and possibly purchasing new, larger disk drives.
Cloud computing uses many new buzzwords, such as infrastructure as a service (IaaS), platform as a service (PaaS) and software as a service (SaaS) as well as Web 2.0 and other terms that may not yet be familiar to everyone. The concept of all of these terms is simple, however: someone else takes care of the hardware, the software, and the infrastructure. The end user is shielded from the complexities involved.
Examples of software as a service (SaaS) vendors include Salesforce.com and Google Apps which provide common business applications online that are accessed from a web browser, while the software and data are stored on remote servers. New SaaS applications seem to appear daily.
My favorite cloud computing vendor is Zoho.com, which offers a plethora of online applications, some free and others available for a small fee. Zoho's offerings include a word processor (which is better than Google Documents in my opinion), a spreadsheet (which is better than Google's cloud-based spreadsheet in almost everyone's opinion), a web-based e-mail service, a presentation program (similar to PowerPoint), an online notetaker, a wiki, a sharing space for documents and photos, an online planning organizer, an online calendar, a CRM package, a database program, and much more.
In cloud computing or SaaS (I will use those two terms interchangeably), the data is stored on a web server someplace. The end user probably does not even know where the server is located nor should he or she care. It is simply “someplace on the Internet.” The user connects to the Internet, enters a web address, and is connected.
The genealogy software itself might be located someplace on the Internet or possibly installed on the user's local hard drive. I will discuss examples of both later in this article.
All of this is not new in the genealogy world. In fact, SaaS has been around in genealogy applications for nine years, long before the term “SaaS” entered the computer dictionaries.
The First Cloud Computing Genealogy Products
OneGreatFamily.com launched an online service in the summer of 2000. Like many new products or services, OneGreatFamily.com was a bit primitive in 2000 when compared to today's offerings. Indeed, OneGreatFamily.com has added a lot of new functionality in the past nine years. However, the basic concept remains the same: one central database is stored on a central Web server and is accessed remotely by all of OneGreatFamily.com's customers. The required software is also stored on the company's servers and is then downloaded to the customer's computer as needed. Any new software updates are installed on the company's servers and then are automatically downloaded and installed on the customers' computers the next time they log in. No manual intervention is required by the customer.
OneGreatFamily.com's application dictates that all users share the same database. In fact, as you enter more and more information about your ancestors, going backwards in time, the odds are greatly increased that you will find records about those ancestors already entered by someone else. OneGreatFamily.com is great at helping newcomers find additional ancestors! This can be either a good thing or a bad thing, depending upon one's viewpoint.
Indeed, some of the data within that database may be wrong but such errors can be identified and easily corrected by appending additional information onto the existing record. The old data is never deleted but anyone may offer a “second opinion,” complete with source citations. A third genealogist who later finds the record in question sees both opinions and the accompanying evidence, if any.
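The "second opinion" approach described above amounts to an append-only record: nothing is overwritten, and each new opinion is stored alongside the earlier ones. The following is a minimal sketch of that idea, not OneGreatFamily.com's actual implementation. The class names, field names, and sample data are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Opinion:
    """One contributor's claim about a fact, with optional source citations."""
    contributor: str
    value: str
    sources: tuple = ()


@dataclass
class FactRecord:
    """A single fact (for example, a birth year) with every opinion ever offered."""
    person: str
    fact: str
    opinions: List[Opinion] = field(default_factory=list)

    def add_opinion(self, contributor: str, value: str, sources=()) -> None:
        # Append only: earlier opinions are never edited or removed.
        self.opinions.append(Opinion(contributor, value, tuple(sources)))

    def cited_opinions(self) -> List[Opinion]:
        # Opinions that come with at least one source citation.
        return [o for o in self.opinions if o.sources]


# Hypothetical sample data, invented for this sketch.
record = FactRecord(person="John Example", fact="birth year")
record.add_opinion("first_researcher", "1802")
record.add_opinion("second_researcher", "1804",
                   sources=["Parish register, baptism entry"])

# A third genealogist sees both opinions and the accompanying evidence, if any.
for opinion in record.opinions:
    print(opinion.contributor, opinion.value, list(opinion.sources))
```

Because opinions are only ever appended, a later reader can always see who claimed what and which claims came with evidence.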
OneGreatFamily.com costs $75 a year and must be renewed every year.
Another online genealogy application appeared a few years later as a true SaaS implementation. FamilyTreeExplorer.com (formerly known as PedigreeSoft.com) is a complete software and database package that works equally well on Windows, Macintosh, and Linux systems. Rather than sharing one huge database amongst all customers, FamilyTreeExplorer.com gives each customer his or her own personal database. In use, FamilyTreeExplorer.com operates in much the same manner as most genealogy programs of the past twenty years or more. The primary difference is that data and software are stored on a remote server someplace on the Internet, not on the user's local hard drive. The end user may grant read-only access or even read/write access to others, if desired. However, all data remains under the control of the person who first became a FamilyTreeExplorer.com customer and created the initial account.
The FamilyTreeExplorer.com software is also stored on the centralized server but is downloaded to the end user's computer as needed. Again, any software updates are performed by the company's systems personnel and then are automatically downloaded to each user's computer as needed. No intervention is required by the end user. All data is backed up centrally on a regular basis.
The basic FamilyTreeExplorer.com service is available free of charge but the company plans to charge fees for advanced services as they become available.
Newer Cloud Computing Genealogy Products
Two newer services have appeared in recent years that vary somewhat from the “software as a service” business model. The Next Generation and PhpGedView are two web-based genealogy applications that run on web servers and support one or more simultaneous users who connect via the web. What is different with these two is that there is no central company installing the software onto web servers and managing the databases. Each customer must either install the software onto a web server or arrange for someone else to do that for him. Several hosting companies will be glad to install the software for you onto their servers for prices as low as $5 a month.
Once installed, each program accesses its own database, typically one that is stored on the same web server. These programs excel at multi-user capabilities and are very popular with “group projects” in which a family society or an informal group of distant cousins is working together on genealogy research projects.
Most of the software as a service products that I have seen do not yet have all the functionality of the better free-standing genealogy programs. Most are limited in terms of reports, in search capabilities, or in the many other options typically found in computer software.
PhpGedView is available free of charge while The Next Generation costs a modest $29. Of course, you will also need system-level access to a web server in order to use either product.
Undoubtedly the largest project of all is the NewFamilySearch implemented by the Church of Jesus Christ of Latter-day Saints (the Mormons). This project is still in beta and is still evolving, but thousands of people are using it every day. NewFamilySearch is a true software as a service offering: the end user opens a web browser, logs onto the site, and performs all functions in that web browser. NewFamilySearch is the replacement for the Mormon Church's aging Personal Ancestral File software and is planned to eventually perform all the functions of that program and more.
Best of all, NewFamilySearch accesses a single, shared database that contains family information and stories about millions of deceased individuals. Indeed, NewFamilySearch is a true “cloud computing” application and does appear to be the wave of the future.
The software developers at NewFamilySearch have also included application programming interfaces (APIs) that allow other genealogy programs to access the same data. In theory, any Windows or Macintosh or Linux program running in a computer that is connected to the Internet can access the data in NewFamilySearch. In fact, even a program running on another web server could access the data on the NewFamilySearch servers. Of course, that assumes that the software developers who created the other programs have properly written software that utilizes the APIs.
NewFamilySearch is presently in beta test and does not yet have all the features of other genealogy products. Access to the product during beta test is available only to members of the Church of Jesus Christ of Latter-day Saints and to several thousand other invited guests. NewFamilySearch will always be available free of charge.
The Newest Cloud Computing Genealogy Products
The newest offerings defy the concept of “software as a service.” Perhaps we should coin a new phrase: “Data as a Service.”
Both Ancestral Quest version 12 and RootsMagic version 4 are Windows programs that are installed in the traditional manner in desktop and laptop computers. Both programs maintain their own databases on the computer's local hard drive(s). However, both programs also have the capability to access data stored on the NewFamilySearch web site. Data may easily be copied in either direction: if you find new information on NewFamilySearch about previously-unknown ancestors, you may copy that data directly into the local databases of either Ancestral Quest or RootsMagic. Likewise, if you have information in your local database that does not yet appear in the central NewFamilySearch database, you may add part or all of the information you have about those individuals with a few mouseclicks.
Ancestral Quest and RootsMagic now combine the best of both worlds: local databases, high-powered genealogy software with a plethora of reports and other advanced features, and yet also offer read/write access to the huge database of NewFamilySearch.
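To make the idea of copying data in either direction concrete, here is a rough sketch of the comparison step: working out which individuals exist only in the central database and which exist only in the local one. It does not use the NewFamilySearch APIs or reflect how Ancestral Quest or RootsMagic work internally. The function name, the record layout, and the assumption that both sides already share person identifiers are all hypothetical, and real programs must first match records that describe the same person.

```python
from typing import Dict, Set, Tuple

Person = Dict[str, str]


def plan_sync(local: Dict[str, Person],
              central: Dict[str, Person]) -> Tuple[Set[str], Set[str]]:
    """Return (ids to copy into the local file, ids to add to the shared database)."""
    to_download = set(central) - set(local)  # known centrally, missing locally
    to_upload = set(local) - set(central)    # known locally, missing centrally
    return to_download, to_upload


# Hypothetical sample data, keyed by an assumed shared person identifier.
local_db = {
    "p1": {"name": "Ann Example", "born": "1850"},
    "p2": {"name": "Ben Example", "born": "1822"},
}
central_db = {
    "p2": {"name": "Ben Example", "born": "1822"},
    "p3": {"name": "Cora Example", "born": "1795"},
}

to_download, to_upload = plan_sync(local_db, central_db)
print("copy to local:", sorted(to_download))
print("add to central:", sorted(to_upload))
```

The sketch only plans the transfer. The user still decides which of those individuals to copy, which matches the "part or all of the information" choice described above.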
The Future of Genealogy Cloud Computing
I am not aware of any available Macintosh or Linux programs that access the APIs of NewFamilySearch. I suspect that such programs will be available within a year or two, however.
Software as a service is spreading throughout the genealogy world in more or less the same manner as it does in other applications. Whether discussing Google Apps, Zoho.com, Salesforce.com, or NewFamilySearch, the concept of cloud computing makes sense for thousands of computer users who either cannot or prefer not to perform software upgrades and database backups. Use of cloud computing allows each user to focus on the data. I'd suggest that is a good thing.
The future of cloud computing looks rosy, both in genealogy and elsewhere.
References:
FamilyTreeExplorer.com (formerly known as PedigreeSoft.com) may be found at http://www.FamilyTreeExplorer.com
The Next Generation may be found at: http://lythgoes.net/genealogy/software.php
PhpGedView may be found at http://www.phpgedview.net
NewFamilySearch may be found at: https://new.familysearch.org/en/action/unsec/welcome
Ancestral Quest may be found at: http://www.ancquest.com
RootsMagic may be found at http://www.rootsmagic.com
I'm waiting for the day that adding to any of these databases REQUIRES source information! Until that day, too much junk is being added. I would love to share my information and invite others to share with me but without standards that require credible documentation, I don't want anyone adding to or changing what I have worked hard to verify.
Posted by: Shirley Fields | March 28, 2009 at 07:56 AM
I'm with Shirley Fields on this. There's far too much "junk genealogy" out there already. Information without sources is useless and I see no reason to pay for it.
Posted by: Duncan Ness | March 28, 2009 at 09:16 AM
I have been contacted by a researcher via ancestry.com about our shared roots. I was invited to her site and enjoyed the photos that I had never seen. Then I began to notice that photos, whose originals I have, were mislabeled. She has been "grabbing" and including materials from other researchers of the same surname. The problem is that families sent their group photos to other siblings and cousins. The photos were saved, unlabeled, and descendants of the savers then sought to identify them as their own direct ancestors.
Yes, there is so much "junk" out there that I prefer to keep my own research as uncluttered as possible. I will share a CD with others of scans of documents and photos that I have and I will attempt to notify them of errors in the hope they make the necessary corrections to their information. I will not upload my information and get it blended with other people's quickly-assembled genealogy.
Posted by: Margaret | March 28, 2009 at 09:56 AM
I meant to add and should have begun my post with:
Thanks, Dick, for a very informative article on the subject! You have the gift of making a confusing subject clear to non-technical folks.
Posted by: Margaret | March 28, 2009 at 10:02 AM
Dick, another thanks for pulling all of this info together in a non-techie format.
I have gone in the direction of having my own website and putting The Next Generation software on my site so that I can control the information and still share it with fellow family researchers. Unfortunately, TNG has not been user friendly for me. I had to have a techie friend set me up. And I am still having trouble uploading info because for me it has not been user friendly.
But I did see your review of the TNG handbook which I will have to buy to figure out what I am trying to do.
Honey Lanham
Posted by: hslanham | March 28, 2009 at 11:31 AM
Hi Dick,
Interesting and useful, as usual. My personal favourite (not listed above) is Geni.com, which is purely cloud-based (no local client). It is also free, although you can pay for some enhanced features/services. It's not perfect, but it is certainly part of this wave, and is doing a remarkably good job, IMO.
Posted by: Steve Knowles | March 28, 2009 at 01:05 PM
>I am not aware of any available Macintosh or Linux programs that access the APIs of NewFamilySearch. I suspect that such programs will be available within a year or two, however.
Ohana Software have a product called "Family Insight" that works on Macintosh and can interface with New Family Search.
http://www.ohanasoftware.com/?sec=learnmore/familyinsight
I got the trial version installed on my Mac Pro running Mac OS X 10.5.6 and it works to search the IGI. I haven't used it for any of its other capabilities, like editing and saving information - I just fed it a GEDCOM file of my family exported from Reunion. Here's a screen shot of it working
http://lisaandroger.com/images/FamilyInsight.png
I don't have access to the actual New Family Search since I'm not a member of the Church of Jesus Christ of Latter Day Saints, so can't tell how it works with that aspect of it.
Roger
Posted by: theKiwi | March 28, 2009 at 01:07 PM
I would say that cloud computing is at the stage where genealogy was when computers were introduced to genealogy many years ago. It will evolve into a user friendly platform with all the bells and whistles. I use a PHP program on the internet and am very happy with it. I have complete control over what goes on the site and can enter sources and pictures, histories, and the integrity of my research is preserved. A PHP site is like having your genealogy program on the internet; anywhere in the world I can log on and enter more information. I must admit that it needs better functionality and I'm sure as time goes by it will improve. Google has indexed my database.
I also put sourceless gedcoms on various sites. This is really a good way to contact people. It has to be remembered that everything out there should be verified with records. I share records here and there and I get records in return, thereby building a more reliable genealogy for little cost. I think people want to be correct. If no correct gedcom is out there then no one will know any different. It is better to be proactive.
Posted by: Don Jaggi | March 28, 2009 at 01:23 PM
Dick,
Good article on an important subject. I hope you will talk with some of the other genealogy software vendors such as Family Tree Maker, Master Genealogist, Legacy Family Tree, etc to get their perspective and plans in this important area. Sophisticated functionality must be maintained no matter where the software resides.
Posted by: Bob Voorhees | March 28, 2009 at 01:28 PM
We all use different genealogy programs. Am I one day going to go to the library and log in and see the exact program and database I have at home and know that my database is preserved in the cloud?
Posted by: Don Jaggi | March 28, 2009 at 01:33 PM
My thanks also, Dick, for this article. I share the previous commentators' reluctance to use undocumented data or expose my own hard-won data to corruption. I am a long-time member of Ancestry.com but its Family Trees I view as a liability to researchers rather than an asset. Surely, it could not be difficult to block submission of data lacking sources? That would of course include the "reference" and "source" that I find most irritating: "Family trees."
Posted by: Gwen M.McCullagh | March 28, 2009 at 02:23 PM
Terrific article, Dick. I agree with the other comments, though. The family that I am researching is mostly in the UK, but I am the main researcher here in Canada. I love sharing with them, but distance makes it awkward. Also, frequently I find incomplete information with no or incorrect references, or just wrong stuff. Often they are just memories, and we all have our own perceptions. I'm not implying anything malicious -- just errors. I've become the 'prove it' person.
In an ideal world what I would like is something like cloud computing, but with a restricted number of administrators. Invited people only could tentatively add stuff, and the administrator would be immediately notified. Other invited family members would be able to comment on it. The addition would not be confirmed in the database until the administrator verified it. If there is argument, there could be a separate family forum for that restricted to family members or invited guests. I guess I'm describing some kind of family wiki, but I'm not savvy enough to be sure.
It would also have to be pretty technologically simple and user friendly. I'm not of the MySpace generation. I still use books. A lot of us are the same.
I'm very fussy that every fact should have a source and citation, and I don't want to lose control over the database that I've worked so hard on. For this family anyway, they are happy to have me do all the hard work and they chip in now and again, so this way works for them, too. I suspect many other families work that way too.
Posted by: Penny Holt | March 28, 2009 at 02:46 PM
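The workflow described in the comment above (invited members propose, the administrator is notified, others comment, and nothing is confirmed until the administrator verifies it) can be sketched in a few lines. This is only an illustration under assumed names. The classes, methods, and sample people are hypothetical and do not correspond to any existing product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Proposal:
    """A tentative addition waiting for an administrator's decision."""
    author: str
    fact: str
    source: str
    comments: List[str] = field(default_factory=list)
    status: str = "pending"  # becomes "approved" or "rejected" after review


class FamilyTree:
    def __init__(self, administrators, members):
        self.administrators = set(administrators)
        self.members = set(members) | self.administrators
        self.confirmed: List[str] = []      # facts an administrator has verified
        self.notifications: List[str] = []  # messages queued for administrators

    def propose(self, author: str, fact: str, source: str) -> Proposal:
        if author not in self.members:
            raise PermissionError(f"{author} has not been invited")
        self.notifications.append(f"{author} proposed: {fact}")
        return Proposal(author, fact, source)

    def comment(self, member: str, proposal: Proposal, text: str) -> None:
        if member not in self.members:
            raise PermissionError(f"{member} has not been invited")
        proposal.comments.append(f"{member}: {text}")

    def review(self, admin: str, proposal: Proposal, approve: bool) -> None:
        if admin not in self.administrators:
            raise PermissionError(f"{admin} is not an administrator")
        proposal.status = "approved" if approve else "rejected"
        if approve:
            self.confirmed.append(proposal.fact)


# Hypothetical people and facts, invented for this sketch.
tree = FamilyTree(administrators=["admin"], members=["cousin_a", "cousin_b"])
p = tree.propose("cousin_a", "Mary Example born 1871", source="family Bible")
tree.comment("cousin_b", p, "Matches what my grandmother told me")
print(p.status, tree.confirmed)  # still pending, nothing confirmed yet
tree.review("admin", p, approve=True)
print(p.status, tree.confirmed)
```

Keeping proposals separate from the confirmed list is what lets the administrator retain control while still collecting contributions from the family.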
While I agree with the need for more accuracy, I am not sure how that ties in with cloud computing. It strikes me as two separate issues.
Accuracy has always been a problem in genealogy. Go into any well-equipped genealogy library and pull a few family history books off the shelf at random and examine the information closely. Some of them were published 100 years ago, others are much more recent. However, almost all of them have glaring errors. The quality obviously varies from book to book, depending upon how dedicated the author was to quality. But a high percentage of these old books had major fairy tales in them. There are tales of castles and knighthood and descent from kings and coats of arms and all that stuff. That might be accurate in a few cases but not for all the books that have been published with those claims.
Most genealogy books also have errors in the various families listed: women giving birth at the age of three, men marrying women seventy years older than themselves and then raising families, etc. For evidence, I will offer the thick book on my family name. It is probably 98% accurate but that still leaves a few hundred errors.
These are appalling, of course, but have nothing to do with cloud computing. The new technologies we all enjoy today simply allow sloppy genealogists to continue publishing fairy tales in the manner they always have.
So let's be distinct: is it a problem with cloud computing? or is it a problem with human beings?
- Dick Eastman
Posted by: Dick Eastman | March 28, 2009 at 03:19 PM
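Errors of the kind mentioned above, such as women giving birth at the age of three, are the sort that software can flag automatically wherever the data is stored. Here is a minimal sketch of such a check. The function name and the age thresholds are assumptions chosen for illustration, not rules taken from any genealogy program.

```python
from typing import List


def implausible_births(mother_birth_year: int, child_birth_years: List[int],
                       min_age: int = 12, max_age: int = 55) -> List[str]:
    """Describe each child whose birth implies an implausible age for the mother."""
    warnings = []
    for year in child_birth_years:
        age = year - mother_birth_year
        # The thresholds are assumed defaults; a real tool would let the user tune them.
        if age < min_age or age > max_age:
            warnings.append(f"child born {year}: mother would have been age {age}")
    return warnings


# Hypothetical family: mother born 1800, children recorded in 1803, 1825, and 1830.
for warning in implausible_births(1800, [1803, 1825, 1830]):
    print(warning)
```

A check like this only flags records for a human to review. It cannot say which of the two dates is the wrong one.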
Dick,
Until, and unless, everyone using the database adheres to the same standards of documentation and evaluation of genealogical information (read: my standards ...), I'll keep my data on my HD, where it belongs. There is nothing wrong with the technology; it's the use that troubles me. Too many people are simply ancestor gatherers, and that's not my style.
Posted by: Lawrence Bouett | March 28, 2009 at 03:39 PM
The concept of cloud computing is a thing that has arrived for many applications but I doubt if it is going to be hugely successful for genealogy. Unfortunately we have all seen people grab work others have done and republish it as their own. Just look at RootsWeb Family Trees. Some indicate that they have gotten the material from someone's GEDCOM but many/most do not.
The major problem is that errors get embedded and there is no way to track them back much less get them corrected.
Can you imagine thousands of people working on the same file all arguing about who is right or simply changing the data because that is what they found on RootsWeb? Sounds like chaos to me!
I post pictures on my personal web site but I'm about to delete them the next time I update it as there is no way I know that I can lock out their being pirated.
I like my personal approach which is to carry my genealogy files (including documentation) with me at all times on a flash drive. I also carry an operating copy of my genealogy software. This way I can work on my netbook in a library and move the flash drive to my desktop when I get home with no problems in synchronization.
Posted by: Doug | March 28, 2009 at 07:28 PM
Doug, I have thought about an ideal solution that's somewhat similar to our Social Security numbers: give every documented and verified ancestor a permanent genealogical data number. It's like your SSN, it stays with you until death and no one else should use it in the future, it's permanent. The number is traceable to you and only you.
So giving each of your documented and verified ancestors a GDN is a good way of tracking all the documented/verified ancestors that were born, lived and died on Earth (even unnamed/named children that were born and died within minutes). Since these ancestors would have GDNs, no one else would be able to control or manipulate the information for any inappropriate reason (like lying, making up ancestral claims, deceiving others). A public federal agency on the matter of genealogies could do this BUT here's the catch: only citizens, genealogists and library scientists can work together on this (it runs like a public library and is open to the public).
If you have ancestors dating back 500 years, bring the GED file to a computer bank; all the names will be processed and matched with specific ancestors based on closest results (search algorithm technology has improved in leaps and bounds). But no GDN will be given until each ancestor is matched to a correct ancestor already available in the databank. Granted, there were many ancestors with similar names but other useful clues and sources, if you have any, could narrow them down.
If you have previously submitted to DNA ancestry tests, that will be included to give proven ancestors GDNs.
I'm still brainstorming on this idea. It's not 100% perfect. But giving each proven ancestor a GDN would go a long way toward making inaccuracy and false relationships almost a thing of the past.
Posted by: Rob | March 29, 2009 at 12:51 AM
I just attended the New England Archivists conference where we discussed Web 2.0 and concerns of authority, accuracy, and control.
One interesting "paradigm shift" I'm observing is the willingness of well-respected institutions (Library of Congress, MIT, Harvard, among others) to put archival materials out in Web 2.0 contexts (Flickr, Blogs, wikis, and so forth) and allow the greater community to add to, mark up, comment, and otherwise challenge traditional authority. Of course the institutions keep a "pure" and well-protected backup of all materials, but they are finding value in opening up traditional avenues of authority and "decentering" control and ownership of materials not under copyright.
One advantage is increased communication between institutions and much wider user communities. Another is gaining new information and credible corrections of long-held institutional assumptions, especially when sources are provided. Another is in allowing the user community to begin to "police" or correct itself as users respond to one another.
I want to keep a local copy of my TMG database intact and under my control, but I would not be averse to posting a copy in the cloud as long as my confidence levels in my own sources were also posted with it. I would be interested to see what was added, challenged, or enriched, and choose what I might add or change to my local data set.
The message at the conference was to try a few small projects with an open mind and see what positive results may emerge. For most formal archives, the most exciting aspect is that unlikely groups of users are finding, using, and yes, even repurposing their material (usually through RSS which feeds Web 2.0) as well as helping the owner gain new insights on old material from newly credible sources.
Thanks, Dick.
Posted by: Holly Hendricks | March 29, 2009 at 07:27 PM
Another aspect of cloud computing which I am pursuing on behalf of my state society, is the use of an on-line database for member records management. As with most societies, we own no hardware and depend on our ever rotating cadre of volunteer treasurers to come up with their own system of tracking member dues. Every time we change treasurers, the new one comes up with his/her own system, often a complex spreadsheet, that can't be passed on easily to the next treasurer who has less familiarity with spreadsheets. A web based system, with all the prompts and error checking that can be built in, is the ultimate solution to this problem and, with appropriate privileges granted to other officers, can be used for our mailing lists, member interests records, surname booklets and first Family Documents.
Posted by: Jim Anderson | March 29, 2009 at 08:05 PM
Dick, thank you for the lucid explanation. One problem with the 'cloud' is that servers crash, due to hardware, electrical-supply or software-conflict problems, and server-owners may withdraw services without notice (as happened to my knowledge with one TNG-based web site).
As noted by others, the problems with cloud Trees include the large quantity of erroneous data. This includes NewFamilySearch, which is mainly based on the greatly erroneous IGI, so it joins the ranks of such laughable entities as Ancestry.com's "OneWorldTree."
For the present, evidence-based cloud Trees depend on the integrity of the individual compiler. Those who call for 'sources' to be given ought to be aware that the vast majority of Trees giving sources at all describe them as copied gedcoms, emails, etc. - not evidence.
So are there any strictly evidence-based cloud Trees accessible to the public?
Posted by: Jade | March 30, 2009 at 09:29 AM
I agree that cloud computing is not guaranteed to preserve data forever. Just like every place else in the business world, new companies appear, old companies sometimes declare bankruptcy, some companies get bought out by others and merged, etc. The only thing different in cloud computing is that the host company can make backups that will help insure against hardware problems. Those backups do not protect against business problems, however.
As with any other application, it is important to make frequent backups of your data. That is true for cloud computing just as it is for any application installed in your desktop or laptop computer.
As to data integrity, I don't see any difference with cloud computing versus any other kind of computing. Data integrity has been a problem for genealogists for thousands of years and I suspect it will continue to be a problem for another one hundred years or so until we do get massive online databases that do contain verified information about every scrap of information about every human who ever lived and left records behind.
I don't expect to see that in my lifetime.
I don't see any difference in cloud computing when compared to any other kind of computing or even when compared to printed books. They ALL contain significant errors.
- Dick Eastman
Posted by: Dick Eastman | March 30, 2009 at 11:03 AM