How much of an ancestor’s or relative’s DNA do you reproduce when you combine kits of multiple testers?

I wrote the original article on DNA coverage from multiple kits a few years ago. It included tables showing how much of an ancestor’s autosomal DNA you could reproduce by combining kits from multiple relatives. Those tables gave exact averages and approximate ranges. The averages can often be calculated in your head, but the ranges of reproduced DNA can’t; they have to be estimated by simulation.
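As a rough illustration of why the averages are easy: each child carries half of a parent’s autosomal DNA, so any given piece of the parent’s DNA is missing from all of n children with probability (1/2)^n. The sketch below assumes children inherit independently and ignores any differences between chromosomes. The function name is hypothetical and is not taken from my simulation code.

```python
def expected_parent_coverage(num_children: int) -> float:
    """Expected fraction of one parent's autosomal DNA found in at least one child.

    Assumption: each child independently inherits any given piece of the
    parent's DNA with probability 1/2.
    """
    if num_children < 0:
        raise ValueError("num_children must be non-negative")
    # A piece is missed only if every child failed to inherit it.
    return 1.0 - 0.5 ** num_children


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"{n} children: {expected_parent_coverage(n):.2%}")
```

This gives only the average. It says nothing about how far an individual family can land from that average, which is what the ranges describe.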

My simulations have gone through several updates since that first article. The original was a simple marble model, and I didn’t trust its ranges very much at the time. I also wanted standard deviations for paternal and maternal relatives so I could show the true differences in ranges between the two, but it took a long time for anyone to publish those. By then I had developed an X model and reused some of that code to build an autosomal model with two chromosome copies. Once the standard deviations of Veller et al. (2019 and 2020) were published, I started treating paternal and maternal recombination differently in my new model.

The results of this model are the only ones trained on the sex-specific standard deviations available in peer-reviewed literature, and my tables of shared DNA are the only ones that acknowledge the difference between paternal and maternal shared DNA ranges. Now I’ve finally updated the project that was once the most popular among my readers. You could use these results to estimate how much coverage you’d obtain by making a Lazarus kit at GEDmatch.

Table 1. Percentage of a father’s DNA reproduced by the number of his children with DNA kits given in each row.

Table 2. Percentage of a mother’s DNA reproduced by the number of her children with DNA kits given in each row.

These values mostly agree with the very simple marble model that I originally used to calculate DNA coverage. The main difference is that it no longer appears possible to reproduce a parent’s entire genome with five children testing, whereas the marble model found that about 2.5% of five-child families could do so. Given that there probably aren’t hundreds of thousands of families with five or six children tested, the 99% confidence interval is the best one to use for this question. I think it’s safe to say that no family of five or fewer children who have all had their DNA genotyped has managed to reproduce all of a parent’s SNPs, since it didn’t happen even once in 500,000 simulated families. However, there may be some families of six who have all of their father’s SNPs covered.
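For readers curious what a marble-style model looks like, here is a minimal sketch. It is not my original code, and it is not the newer model with sex-specific recombination. It assumes the parent’s genome is split into `num_marbles` independent pairs of marbles, and that each child draws one marble from every pair at random. The number of marbles is a hypothetical parameter, and the function names are made up for this example.

```python
import random


def simulate_parent_coverage(num_children, num_marbles, rng):
    """Simulate one family; return the fraction of the parent's marbles seen in any child.

    Assumption: the parent has num_marbles independent pairs of marbles,
    and each child receives one marble of each pair with equal probability.
    """
    covered = 0
    for _ in range(num_marbles):
        # Record which copy (0 or 1) of this pair each child received.
        copies_seen = {rng.getrandbits(1) for _ in range(num_children)}
        covered += len(copies_seen)
    return covered / (2 * num_marbles)


def full_coverage_rate(num_children, num_marbles, trials, seed=0):
    """Fraction of simulated families that reproduce every marble of the parent."""
    rng = random.Random(seed)
    hits = sum(
        simulate_parent_coverage(num_children, num_marbles, rng) == 1.0
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    # num_marbles=50 is an arbitrary illustrative choice.
    print(full_coverage_rate(num_children=5, num_marbles=50, trials=2000))
```

Under these toy assumptions the chance of full coverage has a closed form, (1 - 2^(1-n))^k for n children and k marble pairs, so the answer depends heavily on how many marbles you choose. That sensitivity is one reason a model with realistic recombination can disagree with a marble model about whether five children can ever cover a whole parent.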

Table 3. Percentage of a paternal grandfather’s DNA reproduced by one’s own kit plus the given number of paternal aunts or uncles for each row.

Table 4. Percentage of a paternal grandmother’s DNA reproduced by one’s own kit plus the given number of paternal aunts or uncles for each row.

Table 5. Percentage of a maternal grandfather’s DNA reproduced by one’s own kit plus the given number of maternal aunts or uncles for each row.

Table 6. Percentage of a maternal grandmother’s DNA reproduced by one’s own kit plus the given number of maternal aunts or uncles for each row.

Table 7. Percentage of a paternal grandfather’s DNA reproduced by one’s own kit plus the given number of siblings and paternal aunts or uncles for each row.

Table 8. Percentage of a paternal grandmother’s DNA reproduced by one’s own kit plus the given number of siblings and paternal aunts or uncles for each row.

Table 9. Percentage of a maternal grandfather’s DNA reproduced by one’s own kit plus the given number of siblings and maternal aunts or uncles for each row.

Table 10. Percentage of a maternal grandmother’s DNA reproduced by one’s own kit plus the given number of siblings and maternal aunts or uncles for each row.
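The averages for the grandparent scenarios in Tables 3 through 10 can be approximated with the same kind of mental arithmetic. The sketch below assumes that the connecting parent has not tested, that the aunts and uncles are full siblings of that parent, and that every relative inherits independently. It gives averages only, which under these assumptions are the same for all four grandparents; the paternal and maternal differences appear in the ranges. The function name is hypothetical.

```python
def expected_grandparent_coverage(num_aunts_uncles: int, num_siblings: int = 0) -> float:
    """Expected fraction of a grandparent's DNA found in you, your siblings,
    or your aunts and uncles on that side.

    Assumptions: the connecting parent is untested, aunts and uncles are
    full siblings of that parent, and inheritance is independent.
    """
    if num_aunts_uncles < 0 or num_siblings < 0:
        raise ValueError("counts must be non-negative")
    grandchildren = 1 + num_siblings  # you plus your tested siblings
    # A piece misses every grandchild if the parent never inherited it (1/2),
    # or the parent inherited it (1/2) but passed it to none of the grandchildren.
    miss_grandchildren = 0.5 + 0.5 * 0.5 ** grandchildren
    # Each aunt or uncle independently misses the piece with probability 1/2.
    miss_aunts_uncles = 0.5 ** num_aunts_uncles
    return 1.0 - miss_grandchildren * miss_aunts_uncles


if __name__ == "__main__":
    for aunts_uncles in range(0, 4):
        print(aunts_uncles, f"{expected_grandparent_coverage(aunts_uncles):.2%}")
```

Note that adding siblings can never push the grandchildren’s share past 50%, because together they can only recover what the parent inherited. Aunts and uncles are what raise coverage beyond that.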

I hope you’ve found these results useful. More will be on the way. If you want to see more results now, you could check the ones with less accurate ranges in the original article. Or you can calculate results yourself with the online calculator made from that simpler model.

If you had access to the most accurate relationship predictor, would you use it? Feel free to ask a question or leave a comment. And make sure to check out these ranges of shared DNA percentages or shared centiMorgans, which are the only published values that match peer-reviewed standard deviations. Or, try a calculator that lets you find the amount of an ancestor’s DNA you have when combining multiple kits. I also have some older articles that are only on Medium.