The "Examples of Native People" chart shows individual ethnicity estimates for five sample individuals native to each region. Individuals whose family trees suggest deep ancestry in a particular region are considered "native" to that region. AncestryDNA has gathered thousands of these "native" DNA samples from across the globe. Members of our native sample collection are unique because their family has lived in their region for centuries. Yet, even these natives are not 100% similar to their own region. That is because every region has some degree of admixture.

The "Examples of Native People" chart is built from the same larger data set from which the AncestryDNA Reference Panel is drawn. To read this chart, it helps to understand a few things about descriptive statistics and about the AncestryDNA Reference Panel. Learn more about the AncestryDNA Reference Panel.

The Boxplot chart

In descriptive statistics, a box and whisker chart (or "boxplot") is a convenient way of graphically depicting numerical data. The boxplot shows the smallest observation (sample minimum), lower quartile, median, upper quartile, and largest observation (sample maximum). We compare each native person's DNA to the AncestryDNA Reference Panel to estimate his or her ethnicity. This example box plot shows the distribution of the ethnicity estimate for a given region across all of our samples native to that region.

The box = the middle half of the samples fall in this range; 25% of samples are above it, and 25% are below it

The line inside the box = marks the median (half of all samples are above this line and half are below it); this marks the typical native

The lower whisker = the lower 25% of samples fall in this range

The upper whisker = the upper 25% of samples fall in this range

For regions that are more homogeneous, most samples have a high percentage of estimated ethnicity from their native region. For more diverse regions (those with significant DNA overlap from neighboring regions) we see a wider range of estimates among the native sample collection.
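One way to put a number on "wider range" is the interquartile range, the width of the box in a box plot. The sketch below is illustrative only: the two lists of estimates are made up, and `interquartile_range` is a hypothetical helper, not part of any AncestryDNA tool.

```python
import statistics

def interquartile_range(estimates):
    """Width of the box in a box plot: 75th percentile minus 25th percentile."""
    q1, _median, q3 = statistics.quantiles(estimates, n=4, method="inclusive")
    return q3 - q1

# Made-up ethnicity estimates (percent) for natives of two imaginary regions.
homogeneous_region = [90, 92, 94, 95, 96, 97, 98, 99, 100]
diverse_region = [40, 50, 58, 65, 72, 80, 86, 93, 100]

print("Homogeneous region IQR:", interquartile_range(homogeneous_region))
print("Diverse region IQR:", interquartile_range(diverse_region))
```

With these invented numbers, the diverse region has a much wider box than the homogeneous one, which is the pattern described above.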

We simplify this box plot chart by showing five individuals for each region, corresponding to the minimum, 25th percentile, median, 75th percentile, and maximum ethnicity estimates for all natives of a given region.
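The five values can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample estimates are invented, `five_number_summary` is a hypothetical helper, and the quartile method used here (Python's "inclusive" method) is a choice made for the example, not necessarily how AncestryDNA computes its percentiles.

```python
import statistics

def five_number_summary(estimates):
    """Return the minimum, quartiles, and maximum of a list of estimates."""
    ordered = sorted(estimates)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "minimum": ordered[0],
        "25th percentile": q1,
        "median": median,
        "75th percentile": q3,
        "maximum": ordered[-1],
    }

# Invented ethnicity estimates (percent) for natives of one imaginary region.
sample_estimates = [62, 70, 75, 80, 84, 88, 91, 95, 100]

summary = five_number_summary(sample_estimates)
for label, value in summary.items():
    print(f"{label}: {value}%")
```

Each of the five values would correspond to one of the five example individuals shown for the region.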

Let's say the highest estimate among the native samples is 100% ethnicity from the native region. The chart would then show that some individuals appear to have nearly all of their DNA coming from this region:

For this sample region, the median ethnicity estimate for natives is 84%; in other words, half of all natives have less than 84% ethnicity from their native region, and half have more than 84%. We create a sample individual with this estimate and label them as the "typical native" for the region:

We continue through the box plot to fill in the remaining example ethnicity estimates. Below is the full bar chart for a region.


Still curious to understand more? Cool! We're glad you're as interested in genetics as we are. Check out our white paper on ethnicity prediction.