
Dick Eastman Online
2/14/2001 - Archive


LDS Family and Church History Department Adopts XML Standard
At a technical session of the GENTECH 2001 conference last week, Randy Bryson of The Church of Jesus Christ of Latter-day Saints announced that the Church is now standardizing on XML, the Extensible Markup Language, for all future software products. This announcement will have an immediate impact on producers of genealogy software and eventually will benefit all genealogists.

Mr. Bryson is the director of the FamilySearch Internet Genealogy Service for the LDS Family and Church History Department and also is the Information Technology manager over the Ancestral File, Resource Files, Research Guidance and Extraction applications. As such, he is responsible for compatibility among these products. The de facto data exchange standard for many years has been GEDCOM, a file format that is well-known for its imperfections. GEDCOM, an abbreviation for GEnealogical Data COMmunication, was created by the LDS Church in the mid-1980s as a method of exchanging genealogy data between different programs. The specifications for the GEDCOM file format have been updated a few times since then, and GEDCOM files have become the most common method of exchanging data between distant relatives. GEDCOM files also are used to contribute an individual’s data to the large, centralized databases of the LDS Church and other organizations.

In their first iteration, GEDCOM files consisted of ASCII text. Unlike the binary files used by most other programs, a GEDCOM file can be opened with a simple text editor, and you can read the data contained therein. Later versions of GEDCOM were expanded to include the ANSEL and Unicode character sets, in addition to ASCII. Because of these updates, GEDCOM files can now handle umlauts, accents, and other marks common in European alphabets. However, you can still read this data with a text editor, such as Windows Notepad.
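To give a feel for how readable the format is, here is a minimal Python sketch that splits GEDCOM lines into their parts. Each line carries a level number, an optional cross-reference ID wrapped in @ signs, a tag, and an optional value. The sample record is made up for illustration, and the function name parse_gedcom_line is my own, not part of any standard or product.

```python
# A made-up sample record, for illustration only.
SAMPLE = """0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 12 JAN 1850"""


def parse_gedcom_line(line):
    """Split one GEDCOM line into (level, xref, tag, value).

    Hypothetical helper: assumes a well-formed line of the shape
    'LEVEL [@XREF@] TAG [VALUE]'.
    """
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 2 and parts[1].startswith("@"):
        # The second field is a cross-reference ID; the tag follows it.
        xref = parts[1]
        tag, _, value = parts[2].partition(" ")
    else:
        xref = None
        tag = parts[1]
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value


records = [parse_gedcom_line(line) for line in SAMPLE.splitlines()]
for record in records:
    print(record)
```

The level numbers are what give a GEDCOM file its structure: the DATE line at level 2 belongs to the BIRT line at level 1 just above it.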

GEDCOM has always suffered from numerous shortcomings, one limitation being its reliance on plain text. Other limitations include difficulties with handling non-European names, with handling imprecise data, and with handling the contradictory data that we all find in genealogy research.

In the 1990s, two separate and exhaustive studies of exchanging data between genealogy programs were made. The two were conducted more or less simultaneously:

  1. One study was the GEDCOM Testbook Project, funded by GENTECH. The results of that project are called "GEDCOM Interchange Study Summary." The GENTECH effort later spun off a second, larger study, called the GENTECH Genealogical Data Model. While not dealing directly with the GEDCOM standard, it does address many issues that GEDCOM programmers need to be familiar with.
  2. The other study was conducted by the Family and Church History Department of the LDS Church. It resulted in the GEDCOM Future Directions document, published by the Family and Church History Department. [To view this document, click on the link above; then double-click the "Future" folder and open the Gedfmstr.pdf document—a PDF file that will require Adobe Acrobat Reader to read.]

The two studies were different in scope and purpose. The conclusions and recommendations of the two were also somewhat different although similar in some ways. It is interesting to note that the XML standard was mostly unknown at the time these studies began but came into prominence before the conclusion of these studies. While XML was not cited as a specific recommendation in either study, I have since heard the authors of both studies make reference to XML as a possible solution to some of the shortcomings of today’s methodologies.

XML is an abbreviation for "Extensible Markup Language," a markup language for describing structured data that has become very popular for applications that function on the World Wide Web. If you have made airline reservations online or purchased other goods from an online merchant, you have probably used an XML-based application without realizing it. A discussion of XML is beyond the scope of this article. For reference, I would suggest you start at XML.com or with any of the many good books on the topic available at your local bookstore.

I also should mention another alternative to GEDCOM’s shortcomings: Wholly Genes Software created GenBridge, a different method of directly transferring data between different databases that does not use GEDCOM at all. While Wholly Genes has had great success with GenBridge, other software producers have not yet adopted it.

Randy Bryson’s announcement of the adoption of XML illustrates the LDS Church’s concerns and plans. Obviously, the programmers at the Family and Church History Department have read these two studies and are proceeding with some of the recommendations. The introduction of XML will increase accuracy as well as allow for the use of non-European characters. A future release of the GEDCOM standard will be XML-based. The LDS databases, such as the Ancestral File, Pedigree Resource File, International Genealogical Index, and others, will also accept XML data.

My guess is that the commercial Internet genealogy databases (Ancestry.com, Genealogy.com, OneGreatFamily.com, etc.) will also convert to XML input, perhaps even before the LDS Church completes its conversion. Obviously, all the genealogy programs used by individuals will also need to produce XML-formatted GEDCOM files in compliance with the new specification. I am sure we will see future versions of The Master Genealogist, Personal Ancestral File, Family Tree Maker, Family Origins, Legacy, and other genealogy programs that will produce XML files, once the new GEDCOM replacement format has been defined.
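For the curious, here is a rough Python sketch of what turning GEDCOM lines into XML might look like. It is purely hypothetical: no XML-based GEDCOM specification has been published, so the element names (the GEDCOM tags reused as-is), the id attribute, the GEDCOM root element, and the function name gedcom_to_xml are all my own assumptions, and the sample record is made up.

```python
import xml.etree.ElementTree as ET

# A made-up sample record, for illustration only.
SAMPLE = """0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 12 JAN 1850"""


def gedcom_to_xml(lines):
    """Nest GEDCOM lines into an XML tree by their level numbers.

    Hypothetical mapping, not a published format. Assumes well-formed
    input in which each level is at most one deeper than the line before.
    """
    root = ET.Element("GEDCOM")
    stack = [root]  # stack[n] is the parent element for a level-n line
    for line in lines:
        if not line.strip():
            continue
        parts = line.strip().split(" ", 2)
        level = int(parts[0])
        if len(parts) > 2 and parts[1].startswith("@"):
            # Cross-reference ID such as @I1@; the tag follows it.
            xref = parts[1].strip("@")
            tag, _, value = parts[2].partition(" ")
        else:
            xref = None
            tag = parts[1]
            value = parts[2] if len(parts) > 2 else ""
        del stack[level + 1:]  # drop elements deeper than this line
        elem = ET.SubElement(stack[level], tag)
        if xref:
            elem.set("id", xref)
        if value:
            elem.text = value
        stack.append(elem)
    return root


xml_text = ET.tostring(gedcom_to_xml(SAMPLE.splitlines()), encoding="unicode")
print(xml_text)
```

The point of the sketch is only that GEDCOM's level numbers map naturally onto nested XML elements. Whatever format is eventually defined may look quite different.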

None of this exists today. Randy Bryson’s announcement simply indicates a future course. I suspect it will be two years or even longer before the new XML format is in place and in use. However, the benefits will justify the wait.
