Posted: 1357491596000
My local paper has a popular feature called "Five Myths about . . . " in which commentators provide insights about common assumptions on a variety of topics. Having read through these message boards for a number of months, I've seen some recurring issues that seem to stem from various assumptions about how DNA analysis should work. I've provided my thoughts below. I'm far from an expert, and would welcome corrections or elaborations on these and other "myths" associated with Ancestry DNA.


1. My ethnicity report should match my family tree.

Not necessarily. Most family trees go back a few generations or at most a few hundred years. The ethnicity report is based on the analysis of genetic markers that may go back hundreds or even thousands of years. Moreover, the ethnicity labels do not necessarily reflect modern national borders or cultural backgrounds; they are names given to particular reference populations by the scientists who conducted the studies and some are quite arbitrary. The ethnicity analyses are fairly reliable on a continental basis (Europe, Asia, Africa) but much less precise for regions or countries, especially for those where there has been a great deal of mixing among groups over the centuries, such as Western Europe.

2. My matches should have the same ethnicity that I do.

Because recombination of DNA is random, and the shared segments diminish with each generation, it is quite possible to have a close match with a person whose ethnicity is very different. Consider, for example, a distant cousin who descends from the same ancestor you do, but whose family members in each subsequent generation intermarried with individuals from different regions and cultures. By the time you each get to the current generation, the newly introduced ethnicities could very well overwhelm that of your common ancestor. Another possibility is that one or both of you have a “Non Parental Event” (NPE) in your line. This may be an adoption, orphaning, fostering, out-of-wedlock birth or other event that caused a mismatch between a person’s genetic background and surname.
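To put rough numbers on how quickly an ancestor's contribution is diluted, here is a small Python sketch. It assumes the simple rule of thumb that, on average, each generation passes on half of a given ancestor's autosomal DNA; the function name is my own, and actual amounts vary from person to person because recombination is random.

```python
def expected_ancestor_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from ONE ancestor
    who lived `generations_back` generations ago (1 = parent).

    Rule-of-thumb average only; real inheritance is uneven.
    """
    if generations_back < 0:
        raise ValueError("generations_back must be non-negative")
    return 0.5 ** generations_back


if __name__ == "__main__":
    # Print the average share for parents through 8 generations back.
    for g in range(1, 9):
        print(f"{g} generations back: {expected_ancestor_share(g):.2%}")
```

On this average, an ancestor eight generations back accounts for well under 1% of your DNA, so the other ancestors in each cousin's line can easily dominate the ethnicity estimate.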

3. If there are no shared surnames in our trees, my match is false.

The surname list includes only direct ancestors’ surnames out to 10 generations. If your match is on a collateral line, those surnames will not be included. Also, the common ancestor may have lived earlier than the point to which either you or the other individual has documentation, or there may have been an NPE, as discussed above. It is always important to examine the actual trees, including both individual names and locations, rather than relying only on the surname list.

4. Ancestry family trees are full of errors and useless for understanding DNA results.

It is true that many trees have incorrect entries or lack documentation. This is especially true when the authors have done “name harvesting,” indiscriminately copying entries into their trees without investigation. However, diligent research even in these trees can lead back to original sources that may be valuable in resolving mysteries. Moreover, communicating with the authors of such trees, and sharing information about DNA results, may result in a correction that will benefit both of you and other future researchers.

5. Once Ancestry provides our raw data, these problems will be resolved.

Raw data refers to the actual values associated with each location that was tested, usually made available in a large table or spreadsheet. In itself, this array of data is not particularly useful except to people who are expert in reading the entries. The value will come if the data are provided in a form that will facilitate further research. For example, with appropriate analytical tools (some of which are currently available on various online sites), users may be able to identify the exact chromosomes and segments on which matches occur, independently evaluate the reliability of their matches, and identify others who share those segments. They may also be able to use other admixture (ethnicity) calculators and compare the results with those provided by Ancestry. Finally, if the format of the data permits, it may be uploaded to sites where people who have tested with other companies have posted their results, thereby expanding the pool of potential matches. Merely having the raw data, however, will not necessarily provide more information and understanding unless it is accompanied by the ability to do more analysis than is currently possible using Ancestry’s tool kit.
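For anyone curious what "identifying the segments on which matches occur" might look like in practice, here is a very simplified Python sketch. Everything in it is an assumption for illustration: the in-memory table layout, the function names, and the sample values are hypothetical, and real tools use far more markers and genetic-distance thresholds rather than a simple marker count. It treats two people as compatible at a tested location if they share at least one allele, then looks for unbroken runs of compatible locations on the same chromosome.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory form of a raw-data table:
# (chromosome, position) -> genotype, e.g. ("1", 100) -> "AG"
Genotypes = Dict[Tuple[str, int], str]


def shares_allele(g1: str, g2: str) -> bool:
    """True if the two genotypes have at least one allele in common."""
    return bool(set(g1) & set(g2))


def shared_runs(a: Genotypes, b: Genotypes,
                min_snps: int = 3) -> List[Tuple[str, int, int, int]]:
    """Return (chromosome, start, end, marker_count) for each unbroken run
    of at least `min_snps` compatible markers tested in both people."""
    runs = []
    current = None  # [chromosome, start, end, count] of the run in progress
    for chrom, pos in sorted(set(a) & set(b)):
        match = shares_allele(a[(chrom, pos)], b[(chrom, pos)])
        if match and current and current[0] == chrom:
            current[2] = pos      # extend the run to this position
            current[3] += 1
        else:
            # Run ended (mismatch or new chromosome): keep it if long enough.
            if current and current[3] >= min_snps:
                runs.append(tuple(current))
            current = [chrom, pos, pos, 1] if match else None
    if current and current[3] >= min_snps:
        runs.append(tuple(current))
    return runs


# Made-up sample data, not real genotypes.
person_a = {("1", 100): "AA", ("1", 200): "AG", ("1", 300): "CC",
            ("1", 400): "TT", ("1", 500): "GG"}
person_b = {("1", 100): "AG", ("1", 200): "GG", ("1", 300): "CT",
            ("1", 400): "CC", ("1", 500): "GG"}

if __name__ == "__main__":
    print(shared_runs(person_a, person_b))
```

With the sample data, the only run long enough to report covers positions 100 through 300 on chromosome 1; the mismatch at position 400 breaks the run, and the single compatible marker at 500 is too short to count.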
