These statistics are now available for the first time ever
I know what you’re going to say. X-DNA is too variable for us to deduce anything from the averages or ranges. I say that we should always seek to improve our understanding of the natural world when we have the chance, especially when it has to do with shared DNA. Plus, you will learn something from these numbers.
Over two years ago I made an attempt at modeling the X Chromosome. It was another year before I tried again and completed an X-DNA model that produced averages and ranges for a DNA tester and only their ancestors. Although I had every intention of finding averages and ranges for other relatives, I wasn’t satisfied with the model because I had no data by which to validate it. That has changed, as I now have a dataset of over 100 sibling pairs and their shared centiMorgans (cM) of X-DNA. I’ve rewritten my model from scratch and validated it with the standard deviation from that dataset. I admit that the dataset is small, but it’s growing. And, although the model isn’t validated with peer-reviewed statistics like my autosomal model, I’m now quite satisfied with this new model of X-DNA sharing.
I programmed the model to have an average of 1.949 recombinations per meiosis event. The dataset used to validate standard deviations includes shared X-DNA for siblings who share a mother. I recorded only the cM from maternal chromosome copies, i.e. no paternal X-DNA from full-sisters, and no data were recorded for paternal half-siblings.
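To get a feel for what a recombination rate like that implies, here is a minimal Monte Carlo sketch in Python. It is not the model described above. It assumes that the number of crossovers per maternal meiosis is Poisson-distributed with a mean of 1.949, that crossovers fall uniformly along a unit-length X Chromosome with no interference, and that both siblings have the same mother. The function names are made up for illustration.

```python
import math
import random

CROSSOVERS_PER_MEIOSIS = 1.949  # average quoted in the text


def poisson(rng, lam):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def maternal_gamete(rng):
    """Return (starting copy, sorted crossover positions) on a unit-length X.

    Assumption: Poisson crossover count, uniform positions, no interference.
    """
    n = poisson(rng, CROSSOVERS_PER_MEIOSIS)
    breaks = sorted(rng.random() for _ in range(n))
    return rng.randrange(2), breaks


def shared_fraction(g1, g2):
    """Fraction of the maternal X on which two gametes carry the same copy."""
    (copy1, breaks1), (copy2, breaks2) = g1, g2
    events = sorted([(b, 1) for b in breaks1] + [(b, 2) for b in breaks2])
    shared, prev = 0.0, 0.0
    for pos, who in events:
        if copy1 == copy2:
            shared += pos - prev
        # A crossover switches that sibling to the mother's other copy.
        if who == 1:
            copy1 ^= 1
        else:
            copy2 ^= 1
        prev = pos
    if copy1 == copy2:
        shared += 1.0 - prev
    return shared


def simulate(pairs=100_000, seed=1):
    """Simulate maternal X sharing for many sibling pairs."""
    rng = random.Random(seed)
    return [
        shared_fraction(maternal_gamete(rng), maternal_gamete(rng))
        for _ in range(pairs)
    ]


if __name__ == "__main__":
    results = simulate()
    mean = sum(results) / len(results)
    none_shared = sum(1 for r in results if r == 0.0) / len(results)
    print(f"mean shared fraction: {mean:.3f}")
    print(f"pairs sharing nothing: {none_shared:.4f}")
```

Under these assumptions the mean maternal sharing comes out near 50%, and the chance that a pair shares no maternal X-DNA at all is 0.5 × e^(-2 × 1.949), or about 1%. Multiplying the fractions by a testing company's X Chromosome length in cM would convert them to centiMorgans.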
A note on reported percentages
I have a method for reporting percentages of shared X-DNA that works for either male or female DNA testers. All that is required is to report the amounts as a fraction of only the applicable portion of a woman’s X Chromosome, and not the other copy. This is how I reported it in March of 2020 in Figure 1 below.
Figure 1. Family tree showing both X Chromosome inheritance patterns and shared DNA. Percentages are compatible for both male and female DNA testers, since amounts are reported as a proportion of only the possible shared chromosome copy, and not the other copy. Louise Coakley has written a great blog post about the X Chromosome and its helpful inheritance patterns. I made the above image in early 2020 and only recently realized that it resembles the charts in her blog, which I think was written long before mine.
The only thing I’ll do differently here involves full-sisters. They can, and most likely will, share X-DNA on both copies. I’ll maintain some consistency in that I’ll report the shared DNA from both copies as a percentage of the possible shared DNA, i.e. both copies in this one case. This will lead to full-sisters sharing 75% X-DNA on average. If I had reported it as a percentage of one copy, as for all other relationships currently shown, full-sisters would have shared 150% DNA, on average. If I had reported all amounts as a fraction of the total, including both copies for women, then the statistics for women and men wouldn’t have been comparable. I’ll also report shared X-DNA in cM, in which case none of the above is an issue.
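The reporting convention can be written out as a small Python sketch. The function name and its parameters are hypothetical, and lengths are given in units of one X Chromosome copy to keep the arithmetic visible. It relies on the fact that a father passes his single X Chromosome intact to every daughter, so full-sisters share their whole paternal copy.

```python
def x_share_percent(shared, copy_length=1.0, full_sisters=False):
    """Percentage of the *possible* shared X-DNA.

    shared       -- total shared X-DNA, summed over copies (hypothetical input)
    copy_length  -- length of one X Chromosome copy, in the same units
    full_sisters -- True only when both copies can be shared
    """
    possible = copy_length * (2 if full_sisters else 1)
    return 100.0 * shared / possible


# Full-sisters on average: whole paternal copy (1.0) + half the maternal copy (0.5).
average_sister_sharing = 1.0 + 0.5

print(x_share_percent(average_sister_sharing, full_sisters=True))  # 75.0
print(x_share_percent(average_sister_sharing))                     # 150.0
```

The first call uses both copies as the denominator and gives the 75% figure. The second uses one copy, as for every other relationship, and gives 150%.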
Tables of shared X-DNA between close relatives
Table 1. Shared percentages of X-DNA for close relatives. Pat. = paternal; Mat. = maternal, D. = daughter, Ch. = child, CI = confidence interval, 0% Shared = the percentage of relative pairs who share no X-DNA, 100% Shared = the percentage of relatives who will share one whole copy of the X Chromosome, or both copies for full-sisters. The confidence intervals should be understood as follows: 99% of individual pairs for a given relationship type will fall within the 99% CI lower and upper values. When reading this table, it might help to imagine a full-sibling pair who are brother and sister to each other and have had their DNA tested. All of the above percentages apply to the sister. However, the labels in the first column that are highlighted in purple do not apply to the brother. Except for full-sisters, these relationship types share no DNA with the brother. The brother’s shared X-DNA with a full-sister or paternal half-sister can be found in the “Maternal Siblings” row. All labels with white backgrounds are applicable for both the brother and the sister.
One of the most interesting aspects of the data above is that a person will share 75% X-DNA, on average, with their maternal aunt. This might seem too high, since many close relatives only share 50%, on average. But the 75% value is correct. A good way to think about it is this: if the percentage of X-DNA shared between full-sisters were reported as a fraction of one copy of the X Chromosome, as I have done with every other relationship here, then full-sisters would share 150% X-DNA on average. So if your maternal aunt shared 150% of one X Chromosome copy with your mother, then you could be expected to share about half of that, or 75%, with your maternal aunt.
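The same 75% can be reached by splitting your maternal X Chromosome into its two sources. This is a back-of-the-envelope sketch of the averages only, assuming your mother and her sister are full-sisters. The variable names are illustrative.

```python
# On average, half of the X you inherit from your mother comes from her
# paternal copy and half from her maternal copy.
from_mothers_paternal_copy = 0.5
from_mothers_maternal_copy = 0.5

# Your aunt carries all of your mother's paternal copy (full-sisters get the
# same X from their father) and, on average, half of her maternal copy.
aunt_match_on_paternal_copy = 1.0
aunt_match_on_maternal_copy = 0.5

expected_share_with_aunt = (
    from_mothers_paternal_copy * aunt_match_on_paternal_copy
    + from_mothers_maternal_copy * aunt_match_on_maternal_copy
)
print(expected_share_with_aunt)  # 0.75
```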
Because of this high amount of sharing with maternal aunts, we see that the lower end of the 99% confidence interval (0.5th percentile) is fairly high (9.33%) for that relationship type. This has great implications for distinguishing between half-siblings and maternal aunts, which both have average autosomal shared DNA of 25%. While the following rule isn’t 100% conclusive, you’re unlikely to share less than 9.33% with your maternal aunt, since only about 0.5% of such pairs fall below that value. A value that low is much more likely to indicate a maternal half-sibling. You can also see this in the values of 0% shared DNA. While it isn’t very common for maternal siblings to share no X-DNA (1.01%), it’s about 14 times more likely than for a person to share no X-DNA with their maternal aunt (0.0732%).
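The "about 14 times" figure follows directly from the two percentages quoted above. The short Python sketch below reproduces it and, as an extra illustration, turns it into a probability under the assumption that both relationships were equally likely before looking at the X Chromosome. That equal-prior assumption is mine, and real cases will rarely be that tidy.

```python
# Probability of sharing no X-DNA, from the text.
p_zero_maternal_sibling = 1.01 / 100
p_zero_maternal_aunt = 0.0732 / 100

ratio = p_zero_maternal_sibling / p_zero_maternal_aunt
print(round(ratio, 1))  # 13.8

# Assumption: equal prior odds for "maternal half-sibling" and "maternal aunt".
posterior_half_sibling = p_zero_maternal_sibling / (
    p_zero_maternal_sibling + p_zero_maternal_aunt
)
print(round(posterior_half_sibling, 3))  # 0.932
```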
Another thing to note is that the percentage of shared DNA for any relative isn’t affected by the sex of the DNA tester except for two cases. One of them is obvious: when the DNA tester is a male, the shared X-DNA is zero for all paternal relatives, unlike for a female DNA tester. The only additional exception shown above is for aunts/uncles/nieces/nephews. As far as cousins are concerned, only the sex of the intermediate relatives affects the percentage of shared DNA (except for a male’s paternal relatives), as with autosomal DNA. And in both cases, the amount of shared DNA isn’t affected by the sex(es) of the most recent common ancestor(s).
Those are the only statistics I have so far in percentages. Below you can find the same statistics reported in cM as would be seen at 23andMe and at GEDmatch.
Table 2. Shared cM of X-DNA for close relatives as would be found at 23andMe. The table should be read as described in Table 1.
Table 3. Shared cM of X-DNA for close relatives as would be found at GEDmatch. The table should be read as described in Table 1.
Of course, I have a lot of work left to do. I’ll be reporting the amounts of shared X-DNA for other relationship types such as half-aunts/uncles/nieces/nephews, half-cousins, great-grandparents/grandchildren, 2nd cousins, 2nd cousins once removed, etc. as I get to them. I hope you find these statistics useful and please let me know if you have any suggestions.
The cover photo shows the path over which the most X-DNA is preserved over many generations in my family.