Previous cousin statistics charts haven’t shown the differences in shared DNA between paternal and maternal relatives.
Genomes in mothers recombine about 42 times, on average, whereas genomes in fathers recombine only about 27 times. This leads to a conclusion that’s intuitive to geneticists: more recombination decreases variance, leading to narrower ranges of shared DNA for maternal relatives. Less recombination results in more variance, which is why fully or predominantly paternal relatives can share a much wider range of DNA. Graham Coop has blogged about this phenomenon.
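As a rough illustration of why more recombination narrows the range, here is a toy simulation in Python. It is not the model behind the tables on this page. It assumes a single chromosome of length 1 and a fixed number of crossovers placed uniformly at random, and it tracks what fraction of one transmitted chromosome comes from one grandparent. The function names, trial count and seed are made up for this sketch.

```python
import random
import statistics


def grandparent_fraction(n_crossovers, rng):
    """Fraction of a toy chromosome (the interval 0..1) that comes from grandparent A."""
    points = sorted(rng.random() for _ in range(n_crossovers))
    edges = [0.0] + points + [1.0]
    from_a = rng.random() < 0.5  # which grandparent's copy the first stretch comes from
    total = 0.0
    for left, right in zip(edges, edges[1:]):
        if from_a:
            total += right - left
        from_a = not from_a  # each crossover switches to the other grandparent's copy
    return total


def spread(n_crossovers, trials=5000, seed=1):
    """Standard deviation of the inherited fraction over many simulated transmissions."""
    rng = random.Random(seed)
    fractions = [grandparent_fraction(n_crossovers, rng) for _ in range(trials)]
    return statistics.pstdev(fractions)


maternal_sd = spread(42)  # about 42 crossovers, as quoted for mothers
paternal_sd = spread(27)  # about 27 crossovers, as quoted for fathers
print(f"SD with 42 crossovers: {maternal_sd:.4f}")
print(f"SD with 27 crossovers: {paternal_sd:.4f}")
```

Real genomes have 22 separate autosomes, and crossovers are not placed uniformly, so the absolute numbers here mean little. The only point is the direction of the effect: the run with more crossovers gives the smaller spread.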
You can judge the accuracy of a shared DNA chart or table by the known standard deviations of some of its data points.
On this page you’ll find standard deviations and ranges of shared DNA for 3/4 siblings and double 1st cousins that you can’t find anywhere else. Other popular sources even list the wrong averages for those relationship types. Elsewhere on this site you can find statistics for other complicated types of relationships, including pedigree collapse or the combination of multiple DNA kits to reproduce some of your ancestors’ DNA, neither of which is available anywhere else.
Data shown below are reported in percentages, which are more universal across genotyping platforms. If you want to see cM ranges for a particular site, please click one of the links below:
Table 1. Shared DNA between siblings. Where published standard deviations are available for comparison, the values here are given to one extra decimal place to show how closely they approximate the known values.
It’s hard to say which is the bigger advantage of this method of computing shared DNA averages and ranges: that it’s the most accurate method, or that it can handle any combination of relatives. The latter is illustrated below, as the model easily computes statistics for any type of three-quarter sibling (3/4 sibling) or double first cousin.
Table 2. Shared DNA between six different types of 3/4 siblings (three-quarter siblings). HIR = ‘half-identical regions,’ where one of the two chromosome homologues matches. FIR = ‘fully-identical regions,’ where both copies of a chromosome match. HIR + FIR = all of the points on chromosomes where two people match on one copy plus all of the points where they match on both copies. HIR counting includes FIR bp, but counts them only as if they were half-identical.
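The two ways of counting can be sketched in Python as follows. This is only an illustration of the definitions in the caption. The function name and inputs are hypothetical, and expressing the result as a percentage of both copies of the genome is an assumption of this sketch.

```python
def shared_percentages(hir_only, fir, genome_length):
    """Return (HIR, HIR + FIR) as percentages.

    hir_only: length matching on exactly one copy
    fir: length matching on both copies
    genome_length: length of one copy of the genome, in the same units
    Assumption: percentages are taken relative to both copies (2 * genome_length).
    """
    both_copies = 2 * genome_length
    # HIR counting: fully identical stretches count once, as if half-identical.
    hir = (hir_only + fir) / both_copies * 100
    # HIR + FIR counting: fully identical stretches count twice.
    hir_plus_fir = (hir_only + 2 * fir) / both_copies * 100
    return hir, hir_plus_fir


# Full siblings are expected to be half-identical on 50% of the genome
# and fully identical on another 25%.
print(shared_percentages(hir_only=50, fir=25, genome_length=100))
```

For relationships with no fully identical regions, the two numbers are the same.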
Table 3. Shared DNA between double first cousins. All parameters are the same as for Tables 1-2.
Table 4. Shared DNA for grandparents and some of their descendants. All parameters are the same as for Tables 1-3.
Table 5. Shared DNA for descendants of grandparents, continued, for half-relationships. All parameters are the same as for Tables 1-4.
Table 6. Shared DNA for great-grandparents and some of their descendants.
Table 7. Shared DNA for second cousins. I was surprised to see that some second cousins may not share any DNA.
Table 8. Shared DNA for 2nd great-grandparents.
Table 9. Shared DNA for 3rd great-grandparents.
I hope you’ve found these results useful. More will be on the way.
Feel free to ask a question or leave a comment. And make sure to check out these ranges of shared centiMorgans, which are the only published values that match peer-reviewed standard deviations. Or, try a calculator that lets you find the amount of an ancestor’s DNA you have when combining multiple kits. I also have some older articles that are only on Medium.
I have just found your site, and I am very pleased to see that you have built the engine that does the underlying work of a tool that I have envisioned.
My envisioned tool’s front end, the means of specifying the configuration of whose DNA kits you have, is the GUI that Jonny Perl has created for WATO. The back end is what it looks like you have already created.
So, I am struggling with your front end, trying to figure out how to input the configuration of actual kits we already have for descendants of the 8 children of one specific couple of my 6th great grandparents. We really do have kits of a lot of those descendants, and I would like to see just how much coverage we have put together of those 6th greats … and also figure out which of the descendants who have not yet tested would give us the biggest increase in our coverage of this ancestral couple.
The limits of your front end seem to be (as best I can figure out) that I can only specify close relatives (first cousins, aunts, uncles) when what I need is to be able to use 4th or 5th or 6th cousins who descend from specific non-intersecting lines of the children of the couple.
I also have not been able to figure out how to specify half-siblings.
So, my problem is with your front end. I just cannot figure out how to properly provide the real-world input configuration of all the kits that we have.
Good morning Mr. Johnston,
Thank you for your comment. I’m actually very surprised that you were able to get to the page. It’s been online for a couple of years and I almost always get the message “calc-combined.dna-sci.com is almost here!” when I try. I can sometimes get to it if I clear my browser history or open an incognito window, though. I’m going to have someone who’s better at web development than me look at it in a week or so.
As for the half-siblings, it isn’t necessary in this program. The calculator tells you how much DNA you cover for one ancestor out of an ancestor pair. It happens to be the same for either ancestor, so you can enter a number of siblings and that counts towards your mother or father. Or for any paternal or maternal ancestor. And you can use that same number for the other ancestor of a given pair and generation.
In real life the gender of the ancestor does matter, as well as the gender path to get to that ancestor. That’s for DNA coverage. For what you share with an ancestor, their gender doesn’t matter and the gender path doesn’t matter.
The model used for this calculator was my first genetic model. It’s a very simple model of about 97 marbles. I’ve since developed, through a couple of steps, a much more complicated model. The one I’m using now has homologous autosomal chromosomes and recombinations. It’s very accurate because it was trained on published standard deviations. I have new calculations for DNA coverage here from that model. I’d like to say to use those numbers, but the differences aren’t actually huge.
As far as allowing second cousins, half-cousins, cousins removed, etc., it would have been a lot of additional code. If I include that in the calculator someday, it will be for a new calculator based on the newest model. You can add in the averages yourself, although you wouldn’t be able to find exact ranges.
Any relationship to Johnstons in Canada who came from County Fermanagh?
Regards,
Brit
I’m happy to answer your question about Canadian Johnstons from Co Fermanagh in a private e-mail but not on a public comment board.
I am wondering now if you would be willing to team up with Jonny Perl to provide a back-end to do the calculations for the front end that he could provide with his WATO GUI?
Of course I figured you wouldn’t give specifics about Johnstons on here. I keep at least a couple of generations of my family history private most of the time, too.
I definitely like collaborating on projects. I’ve been eager to do so already.
For a moment I thought that you might be looking for a tool like WATO, but for combined kits, which would be the reverse of my tool. But I don’t see how that would work.
Since we’re going to email privately about Johnstons anyway, we can try to clarify what kind of tool you’re looking for over email.
If you have Canadian Johnstons from Co Fermanagh, I really would like to hear from you via private e-mail.
Here are two of my web pages that relate to this subject:
Pre-Confederation Canada Johnstons – https://www.wwjohnston.net/famhist/canada-johnstons.htm
Johnston Online Family Trees – https://www.wwjohnston.net/famhist/johnston-tree-links.htm
My Daddy’s paternal grandparents each had a sibling who married each other. What amount of cM would my father and their children share?
Hi Bessie,
There are a couple of ways to approach this. One is by entering a cM value into the double cousin predictor here: https://dna-sci.com/tools/orogen-mult-unw/
Your dad’s father would’ve been double 1st cousins with the children of the other couple. That means that your dad would be double 1st cousins once removed (2x1C1Rs) with those children. If he has matches in the 600 to 900 cM range, the tool would show a very high probability of them being his 2x1C1Rs.
The other way to approach it is from what we learn in this article: https://dna-sci.com/2021/01/05/can-you-just-add-the-averages-and-ranges-for-double-relationships/
Double 1st cousins share 25%, on average. That means that 2x1C1Rs would share about 12.5%, on average, which translates to about 872 cMs. The range of possible DNA will be narrower than if you added the ranges from the normal 1C1R relationship. Also, you can expect the min. and the max. to be about half of what the double 1st cousin range is. I’d expect 2x1C1Rs to share about 600 to 1,200 cMs based on that.
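The arithmetic above can be sketched in a few lines of Python. The genome total of 6,976 cM is only an assumption, back-calculated from the figures in this reply (872 cM for 12.5%). Totals differ between testing sites, and the function names are made up for the example.

```python
# Assumed total, back-calculated from 872 cM = 12.5%; not an official figure.
TOTAL_CM = 872 / 0.125


def removed_average(percent, times_removed=1):
    """Each generation of removal halves the average shared percentage."""
    return percent / 2 ** times_removed


def percent_to_cm(percent, total_cm=TOTAL_CM):
    """Convert a shared percentage to centiMorgans for the assumed total."""
    return percent / 100 * total_cm


double_first_cousins = 25.0                   # average for double 1st cousins
avg_2x1c1r = removed_average(double_first_cousins)
avg_2x1c1r_cm = percent_to_cm(avg_2x1c1r)
print(avg_2x1c1r, round(avg_2x1c1r_cm))
```

This only covers the averages. The range can’t be computed this way, which is why the figures given for it are estimates.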
I have been frustrated by the lack of any minimum segment size specification for the relationship tool that invites input of total cM and number of segments. I’ve read or at least searched and/or skimmed through all of the references provided, and come away still puzzled.
I should add that my puzzlement comes from relationship results about a distant and totally unknown relative who recently appeared among my GEDmatch DNA Matches in 29th place, yet according to the calculator shares more of my DNA than a known 3C1R listed first among distant relatives when total DNA is calculated at the 3cM threshold.
The unknown new arrival at 3cM threshold shows a total of 165.7 cM in 38 segments, the largest 11cM, while the known 3C1R shows 125cM, 7 segments, the largest 38.
At the 4.2cM cutoff the former’s total is reduced to 72.1cM in 11 segments; at 5cM, to 58.4cM in 8 segments. But on my GEDmatch One to Many list of matches, this kit is only shown as sharing 35.4cM, the number calculated by the Autosomal One on One Tool set to 7cM cutoff, and counting only 4 segments.
Have I understood correctly your suggestion that a 4.2 minimum segment size produces a more accurate total cM count than either 3, 5, or 7? If so, I would suggest you display this suggestion on your relationship calculator’s page.
BTW, if the GEDmatch One to Many List ranked matches using your suggested 4.2cM cutoff, profile #29 of my 3,000 matches would instead be #5. So I think the answer to this question is significant for anyone trying to establish uncertain lines of descent.
Thanks for the message. I think you’re referring to this article: https://dna-sci.com/2021/09/14/small-segments-are-we-asking-the-wrong-question/
In that analysis I discovered that, while many small segments are false, cutoff values such as 7 or 8 cMs remove more true cMs than the number of false cMs that they keep out. It looks like the break-even point is more like 4 or 5 cMs. So while a strong case can be made for not showing us very small segments, it seems that some should be included in the totals.
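To make the effect of a cutoff concrete, here is a small Python sketch that totals the same list of segments at several minimum segment sizes. The segment lengths are made up for illustration and are not from any real match. The function name is also hypothetical.

```python
def totals_by_cutoff(segments_cm, cutoffs=(3, 4.2, 5, 7)):
    """Return {cutoff: (total cM, segment count)} keeping segments >= cutoff."""
    result = {}
    for cutoff in cutoffs:
        kept = [s for s in segments_cm if s >= cutoff]
        result[cutoff] = (round(sum(kept), 1), len(kept))
    return result


# Hypothetical segment lengths in cM, invented for this example.
segments = [12.0, 8.5, 7.2, 6.1, 5.4, 4.8, 4.3, 3.9, 3.1]

for cutoff, (total, count) in totals_by_cutoff(segments).items():
    print(f"cutoff {cutoff} cM: {total} cM in {count} segments")
```

The total changes a lot with the cutoff, but the sketch can’t say which of the small segments are true. That takes the kind of analysis described in the article.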
In order for relationship predictors to work better for cM totals that include segments of 4.2 cMs and above, the testing sites would have to first report total cMs with that different cutoff value. If predictors were built that way now, they’d be less accurate. Currently, predictors are made based on the actual cutoff values at DNA sites, such as 7 or 8 cMs.
If I recommended a cutoff value of 4.2 cMs on the relationship predictor page I’m worried that people would think I was telling them to enter that value into the relationship predictor.