In recent months several genealogy information providers have offered major online specials. The specials have usually been in the form of free access to a web site or part of the web site for a few days or perhaps even as long as a month. Both Footnote and WorldVitalRecords.com have offered free access in recent days, and other genealogy providers have done the same in past months.
These "online specials" have proven to be very popular and have attracted thousands of potential new customers. However, the same online events have also suffered significant technical difficulties in the first day or two, or even three. Message boards, including the comments section of this newsletter, often "light up" with messages from would-be users expressing frustration when things do not perform perfectly. Some messages insinuate that the providing company is misleading people or using "bait and switch," which is absolutely not true. Some messages have stated, "It doesn't work so I will never go back again."
Anyone who has worked in data centers understands the reasons for such problems, but many others do not. I have been working in data centers for more years than I care to mention, so I thought I would offer an observation or two.
In short, the problem centers around load management. Let's say that an online provider has been in operation for months or years. Over that time, they have fine-tuned their servers and databases to provide reasonably fast response times and to perform error-free under almost all conditions ever encountered. The systems provided by these companies have been fine-tuned to handle several hundred simultaneous users.
The management of the company then decides to provide a "free access period" to show off the available services. The offer is obviously going to attract a lot more users for a few days. The number of simultaneous users will multiply by a factor of 10 or 50 or perhaps even by 100. Nobody knows what the new system load will be; they can only guess.
The company's technical personnel realize that they will now see an increase to thousands of simultaneous users, perhaps tens of thousands.
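To put rough numbers on that guess, here is a minimal Python sketch of the arithmetic. The baseline of 300 simultaneous users is only an assumed stand-in for "several hundred," and the function name `projected_load` is made up for this illustration; the multipliers are the ones mentioned above.

```python
def projected_load(baseline_users, multipliers=(10, 50, 100)):
    """Return the projected number of simultaneous users for each multiplier."""
    return {m: baseline_users * m for m in multipliers}


# Assumed baseline: "several hundred" simultaneous users, taken here as 300.
baseline = 300

for multiplier, users in projected_load(baseline).items():
    print(f"{multiplier}x the normal load -> {users:,} simultaneous users")
```

The spread between the smallest and largest result is the point: planners have to provision for a range that spans a full order of magnitude, without knowing where in that range the real load will land.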
The support personnel responsible for fine-tuning and maintaining the servers and databases typically fly into action to create a plan to handle the increased load. Additional servers are often installed in advance of the planned launch and even more high-speed Internet connections and routers may be brought online. However, all of this is GUESSING at what the future bottlenecks will be. Adding hardware is only part of the solution: "tuning" and other adjustments also need to be made. Accurate fine-tuning of servers and databases can only be performed when a real load is in progress; there is no method of measuring the load and making needed adjustments until the actual load occurs. Anything done in advance to handle the increased load is based on guesses, hunches, and perhaps on experience gained "the last time." All of this is very unscientific, but there is no other option.
The typical result is that, on the morning of a new offer, the company's support personnel nervously monitor system performance on a minute-by-minute basis. If problems occur - and they always do - adjustments are made. Some adjustments can be made within minutes, but others may require hours or days. If new hardware is needed unexpectedly, the delay may be weeks. Servers, disk arrays, and routers are typically manufactured to order; most manufacturers do not keep them "on the shelf" and ready for shipment on a moment's notice. If you need additional routers or servers or disk arrays, you may have to wait several weeks for the order to be filled and shipped. Once the hardware arrives at the data center, it needs to be installed and configured, which may result in even more delays.
The impact on customers is predictable. When tens of thousands of users suddenly start using systems that are not tuned for such a load, performance suffers. Response times slow to a crawl, error messages may appear almost anywhere, and sometimes the various screens do not display correctly. I have seen instances where free offers simply didn't work as the load escalated.
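Why does performance collapse rather than degrade gently? One way to see it is with a textbook single-server queueing model (often called M/M/1). This is a simplification brought in purely for illustration, not a description of how any particular provider's systems behave. In that model, the average response time is 1 / (service rate - arrival rate), so it climbs steeply as the incoming request rate approaches what the server can handle. The capacity of 100 requests per second below is an assumed figure.

```python
import math


def mean_response_time(arrival_rate, service_rate):
    """Average response time in seconds for a simple M/M/1 queue.

    arrival_rate: requests arriving per second
    service_rate: requests the server can complete per second
    Returns infinity once arrivals meet or exceed capacity, because the
    queue then grows without bound.
    """
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


# Assumed capacity for illustration: 100 requests per second.
capacity = 100

for load in (50, 90, 99, 100):
    seconds = mean_response_time(load, capacity)
    print(f"{load} requests/sec -> average response time {seconds} seconds")
```

In this sketch, going from half of capacity to 99 percent of capacity multiplies the average response time fifty-fold, and at or beyond capacity requests simply pile up. A real site has many servers and many possible bottlenecks, but the same general shape is why a system that feels fine under normal load can feel broken on the first day of a free offer.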
Anyone who has been through one or two of these huge increases in workload understands the reasons why and is probably willing to give up today and to return a day later or a few days later, after the load subsides a bit and after systems personnel have had a chance to make adjustments. Users without data center experience typically are not as understanding. They expect things to work perfectly the moment the new offer becomes available.
The biggest workload on the servers typically comes in the first twenty-four hours after a new offer is launched. If an announcement is placed in this newsletter and in other online genealogy newsletters, blogs, and message boards, everyone rushes to the web site to try it out as soon as the new offer is available. Of course, the result is predictable: errors, crashes, and slow response.
As the days go by, the workload usually subsides significantly and systems personnel make needed adjustments. System performance almost always improves. Things that fail on the first day usually work well by the second or third day.
I have a bit of advice: if you try a new service or a new "free offer" on the first day it is available, and if things do not perform perfectly, please do not go to message boards and post messages expressing your frustration. Keep in mind that these problems are predictable and have been encountered by thousands of other service providers in the past. These problems are usually short lived. I'd suggest you simply find something else to do for the moment and then return to the site in a day or two or even later. You will probably have a totally different experience once the load subsides a bit and after systems personnel have made adjustments based on their new "real world" experience.
If you do read any reports anywhere online of difficulties with a new offer or a new service, and if that report was written in the first twenty-four hours, I would suggest you disregard the report. It was written by someone who doesn't understand the issues of load management. Go to the site a day or two later and see for yourself.
The above information applies to genealogy web sites as well as to almost all other online services.