New Problem: DNA Matches With Large Trees

Not really! We used to complain about tiny trees in this forum, so a few minutes ago, when I got a message from someone chastising me for having too large of a tree, I had to laugh.

This happened over at "another company" after I invited the person to share genomes. Here's one line from his reply that made me smile:

"Your tree is quite large, including siblings and their spouses - making it hard for others to find common ancestors."

This person seems very nice and polite, and they're also experienced with genetic genealogy (!), so I'm just amazed.
He even advised that, in addition to not including siblings, shared trees shouldn't go past the ggg-gp level. (FWIW, he seems to have viewed my tree here on Ancestry, so I can't even blame that other company's horrid GEDCOM interface.)

You have to laugh...

The title of this post is meant to be a joke, but it does remind me of how a few people here have complained about huge trees as a likely sign of someone who just copies and pastes everything.

I admit that I've seen a few DNA matches with 40,000-person trees and instantly rolled my eyes. But really, that's unfair of me (at least until I actually spy missing/wonky sources).

After taking the test, I started working harder on adding more cousins, mostly from the past 120 years, and only those which are easy to document. (Ya gotta love the Texas birth and death online databases, esp. the actual images of death certificates - squee! - but I digress.) I've barely touched a smidge of one side of the family and still have added at least 1200 cousins in the past few months. That puts me at nearly 10,000 people, which is more than what 95% of my matches have, I'd say. Maybe people are now rolling their eyes at my number!

Anyway, that's today's PSA: "Don't hate on the huge trees!" :)

My practice is to ignore trees over about 3,000 entries for low or very low confidence matches unless I find a surname or location that I can relate to my family. It is helpful when cousins are added, but because the DNA match surname list contains only the direct-line names, it's necessary to go into the full tree to find them. That's just too time-consuming for the low probability matches, even though I may miss out on a few possibilities.

I automatically skip any trees that have Julius Caesar and Charlemagne as ancestors.


PS--my tree has about 7,000 names and it is deliberately exploratory, containing a number of unconfirmed entries. My "official" tree is on my hard drive and includes only verified relatives.

The smallest tree in which I have found a connection for a low or very low confidence (distant relationship) match was one with 600 people. And that one case was quite a coincidence, as there was only one branch that went back very far: the one with the match. All others have had over 1000. For 7th and 8th cousin matches, the median number of persons in the tree, at least for my matches, has been 3500.

I had one high level match to a tree with 170,000 names in it. It took minutes to load when I clicked on it. And was impossible to navigate. How can anyone work with such a large tree? I tried just searching the names for ones that were in my tree, but no luck. I don't know how else to process it - looking through the entire list of names is not an option.

But I have had 2 positive matches (but no leaf) to trees with between 30,000 and 40,000 names. Both matches are multiple common ancestor matches. In both cases, I only looked at the pedigree view to find the matches.

My smallest match was to a tree with 280 names. Just happened to include the right person.

If the leaf hints ever start working for all listed matches, maybe some of these large trees will become easier to research. For now, I mostly ignore them unless a common name (other than Smith, Johnson, etc) shows up.

I have two shared ancestor matches (aka green leaf hints) to tiny trees - one was 113 and the other 114. They were long on both lines. For a while, I had my maternal tree at about 350 people, but two lines ended in 1850 with brick wall emigrants from Ireland. The several other lines were long and robust to colonial America, but those short lines kept the overall size small.

No hard and fast rules on size of match and success, although I tend to agree that there are surprisingly fewer successful matches with very large trees, i.e. 20,000 and above. I used to get excited when I would see one appear, but less so now after opening so many matches and having less success with giant trees.

Tiny tree with Shared Ancestor Hint:

"In both cases, I only looked at the pedigree view to find the matches."

Interesting because I usually end up on a pedigree view to make sense out of everything, yet every time I open a DNA match I'm presented with the "family" view. No big deal, but does anyone know if this is a built-in default that can be changed?

"I automatically skip any trees that have Julius Caesar and Charlemagne as ancestors."

Could you tell me your reasoning for this? I don't know anything about Julius Caesar, but many "British Isles to colonial Virginia" ethnicities have a Carolingian, Capetian, Plantagenet component. See this entry in Wikipedia (I know. Not always a terrific source).

I'm just wondering whether you think it's non-provable, or subject to too much corruption from "copy-and-paste"? Wouldn't it be possible that even though the final leg of the ancestry is wrong, there might be usable facts in the newer end? So why just cavalierly reject it?

I believe that despite the claims of many early American families to royal ancestry, the vast majority of trees with those entries contain unsubstantiated matches. I can usually determine that easily by looking at a couple of early entries and seeing that they contain no original sources.

There is a line in my wife's family that traces to the Plantagenets; however, when I tell people this I always say that at some point research merges with myth, so anything that far back is highly speculative.

I do acknowledge that there are exceptions to this, but they are unlikely to be related to my Irish immigrant family!


"No big deal, but does anyone know if this is a built-in default that can be changed?"

Ancestry said on its Facebook page that the default is the last view used, which is never true in my case, because I have never found a use for the Family View and always switch to the Pedigree View.

I have found the only way to find the connections to my matches is to go through the trees. I have found the Shared Surnames list to be of little use. Even the vertical surname list doesn't include all the surnames.

I got my results back about a month ago. I've had about 2200 matches (including the low & very low confidence range). I've deleted about 250 of them due to no tree or a private tree. I have already given up messaging private trees. Even 2 matches at 2nd & 3rd cousin range with hints have not answered. I still have about 1500 left to go through. Sigh.

I've found connections in over 70 trees. Some have 3 & 4 different matches in one tree on different lines. Most are at the low & very low confidence range. (I have 3 great grandparents whose ancestors all go back to the colonies.)

I have found the size of the tree doesn't matter. I looked at one tree this morning that had 30,000 people listed. Not sure where they were, but it didn't take me any longer than usual to get through that tree. (I only go through the Pedigree View.) I have found connections in the 1600s in trees of less than 200 people. Those trees only had one or two lines that were well developed but they were the right ones.

I have yet to have a major breakthrough on my 2 most frustrating brick walls. But I have been able to take some lines further back. I've confirmed quite a bit of my tree. Found a Mayflower ancestor. I descend from both of his sons & my matches seem to descend from all different grandchildren.

I look forward to Ancestry releasing the raw data. I just hope it is in a usable form, like Family Tree DNA's chromosome browser, their in-common-with feature & the ability to take the data to Gedmatch.

That said, for me Ancestry has proved to be a good value. Most of my FTDNA matches don't list their surnames, upload a tree, or answer emails. It has been useful but I have made greater strides with Ancestry. But with A LOT of work! The thing that keeps pushing me is that the matches keep coming & I want to have a look at all of them & not get further behind! Who knows where the info that causes those brick walls to crumble may be!

I have been very fortunate in that I have found probable families for four "lost" grandmothers. That means several matches to the same surname family. I am still sifting through the poorly documented trees and old documents trying to find actual parents but it is much more than I had before. That being said, the families are all Colonial America and that is where I am finding most of my matches.

And one of the reasons I have a big tree is because so often it is the uncle's cousin who writes the little genealogy booklet that lists my ancestors' information.

I have noticed that many so-called genealogy DNA experts know gobs about STRs and haplogroups and very little about recent (last 400-500 years) family history research.

One of the tricks I use is to look for quirky spellings and surnames of unconfirmed lines in the long name list. That has proven useful for me.
