The "Ancestry" input box can be used for Ancestry, FTDNA, and MyHeritage. The methods used at 23andMe are different in that total IBD sharing is used rather than HIR and some segments as low as 5 cMs are included. GEDmatch predictions can be obtained by using the 23andMe cMs box and checking for "HIR."
Total identical-by-descent (IBD) sharing is the best measure for distinguishing between relationship types, fully-identical regions (FIR, or IBD2) are next best, and half-identical regions (HIR) are least preferable. Abbreviations: IBD = HIR + FIR; cM = centiMorgan; 1C1R = 1st cousin, once removed.
This tool does not yet include population weights, which are easier to implement with traditional relationships. Double cousin relationships are less common and there may not be a perfect way to add population weights to a relationship predictor that includes them, although I have an idea.
There's a new relationship predictor that lets you enter the number of segments for amazingly good results, but it doesn't yet include double cousins.
All probabilities are for autosomal DNA only. Please subtract any X-DNA before using this tool.
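Subtracting X-DNA is simple if you have a segment list. The following is a minimal sketch, assuming a hypothetical list of (chromosome, cM) pairs exported from a testing site. The function name and the convention that the X chromosome is labeled "X" or "23" are assumptions for illustration, not something this tool requires.

```python
def autosomal_cm(segments):
    """Sum shared cM over autosomal segments only.

    segments: iterable of (chromosome_label, cM) pairs. The labels "X" and
    "23" are treated as the X chromosome (an assumed convention).
    """
    x_labels = {"X", "23"}
    return sum(cm for chrom, cm in segments
               if str(chrom).upper() not in x_labels)


# Hypothetical match with two autosomal segments and one X segment.
example_segments = [("1", 40.0), ("X", 12.5), ("7", 22.5)]
total = autosomal_cm(example_segments)  # the X segment is left out
```

The resulting total is the number to enter in the cM box.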
The above probabilities assume no endogamy or other pedigree collapse. Those cases should be treated separately.
*Parent/child and full-sibling relationships are easy to tell apart. Parent/child relationships consist of a HIR match across the whole length of the genome. Full-siblings share 25% FIR, on average. Genotyping sites will take this into account when they label your relationship. You can usually trust their labels for parent/child or full-siblings (which they call "siblings").
For many of the double cousin relationships, two types are included here. One type cannot share FIRs and the other can. Including both was necessary because the two differ in HIR amount, which is the only way cM values are reported at Ancestry. For HIR comparisons, the type that cannot include FIR will actually have a higher cM value. This is because a double 2nd cousin (2C) pair who cannot share FIR, for example, will share 6.25% on average, all of which comes from HIR. Conversely, a double 2C pair who can share FIR will average only about 6.05% HIR, with the other 0.2% coming from FIR. Double cousin types with possible FIR will also have slightly lower amounts of total IBD sharing at genotyping sites. That’s because some FIR segments will fall below the low-cM threshold, causing twice the amount of shared cM to be discarded for those segments when the cutoff is applied.
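The double 2C figures above can be reproduced with a little arithmetic. This is a rough sketch, not how the tool itself works (its data come from simulations). It assumes the two cousin relationships are independent when FIR is possible, that the two pathways never overlap when FIR is not possible, and that percentages are on the scale where parent/child = 50%. The function name is hypothetical, and segment-length thresholds are ignored.

```python
def double_cousin_sharing(single_share_pct, fir_possible):
    """Expected (HIR %, FIR %, total IBD %) for a double relationship built
    from two copies of the same single relationship.

    single_share_pct: average sharing for ONE relationship, in percent
                      (3.125 for 2nd cousins).
    fir_possible:     True if both of each person's parents are related
                      to the other person.
    """
    # On a scale where parent/child = 50%, the chance that one pathway makes
    # the pair IBD at a given locus is twice the sharing percentage.
    p = 2 * single_share_pct / 100

    if fir_possible:
        p_fir = p * p          # both pathways IBD at the same locus
        p_hir = 2 * p - p_fir  # at least one pathway IBD
    else:
        p_fir = 0.0
        p_hir = 2 * p          # assumed: the pathways never overlap

    hir_pct = 50 * p_hir       # HIR counts each shared region once
    fir_pct = 50 * p_fir       # FIR adds the second copy
    return hir_pct, fir_pct, hir_pct + fir_pct


with_fir = double_cousin_sharing(3.125, fir_possible=True)
without_fir = double_cousin_sharing(3.125, fir_possible=False)
```

With FIR possible, the double 2C case comes out to about 6.05% HIR plus about 0.2% FIR. Without FIR, the whole 6.25% is HIR.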
How do you know if your double relationship could include FIRs? Here’s the test: If both of your parents are related to the match and you’re related to both of your match’s parents, then you could share FIR with your match. Please note that as double relationships get more distant, it becomes less likely that they’re aligned in exactly the right way to produce FIR.
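The test above can be written as a simple check. This is only an illustrative sketch, and the function and parameter names are hypothetical. Each argument is a pair of true/false answers, one per parent.

```python
def could_share_fir(my_parents_related_to_match, match_parents_related_to_me):
    """Return True if FIR sharing is possible between you and a match.

    my_parents_related_to_match: (bool, bool), whether each of YOUR parents
        is related to the match.
    match_parents_related_to_me: (bool, bool), whether YOU are related to
        each of the MATCH's parents.
    """
    return all(my_parents_related_to_match) and all(match_parents_related_to_me)


# Both sides are related through both parents, so FIR is possible.
both_sides = could_share_fir((True, True), (True, True))

# Only one of my parents is related to the match, so FIR is not possible.
one_parent_only = could_share_fir((True, False), (True, True))
```

A result of True only means that FIR is possible, not that any will be found, especially for more distant double relationships.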
The double relationships that are currently included are as follows: 3/4 siblings (four types), double 1C, 1C + half-aunt/uncle/niece/nephew, 1C + 1C1R, 1C + half-1C, double 1C1R (w/ or w/o FIR), double 2C (w/ or w/o FIR), double 2C1R (w/ or w/o FIR), and double 3C (w/ or w/o FIR).
Do you have a suggestion for a type of double or multiple cousin relationship to add to this tool? If so, please leave a comment here.
Probabilities are included for other (non-double) relationships as far back as 8C1R. The huge advantage of this tool, other than the accuracy of the data, is that it does not lump close relationship types into a single group, because their probability curves are significantly different. For distant relatives, there's much less certainty about the genealogical relationship for your DNA matches. Matches as low as 8 cM are allowed here. While the relative probabilities are accurate for the relationship types shown, one also has to consider that the relationship could be farther back. With unweighted (by population) predictions, the most probable relationship is never 4C or more distant, even for the lowest cMs. Not only are very low cM values difficult to assign to a recent ancestor, but segments of 20 cM or 30 cM may be on pile-up regions and therefore come from very distant ancestors.
The probabilities shown above are relative to the other relationships listed, so they’re only meaningful in comparison with one another.
Totals will not always add up to 100%. When more relationship types are possible, the chance of rounding error increases. For more information about the methodology and discoveries associated with this tool, click here. There's also now a published scientific paper about relationship predictions.
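To see why rounded relative probabilities may not total 100%, here is a small sketch. The likelihood values and relationship labels are made up for illustration, and the function name is hypothetical. It is not the tool's actual calculation.

```python
def relative_probabilities(likelihoods, decimals=1):
    """Scale likelihoods so they sum to 100%, then round each for display.

    likelihoods: dict mapping a relationship label to a non-negative number
                 that is only meaningful relative to the other entries.
    """
    total = sum(likelihoods.values())
    return {rel: round(100 * value / total, decimals)
            for rel, value in likelihoods.items()}


# Three equally likely (made-up) relationship types.
example = relative_probabilities({"type A": 1.0, "type B": 1.0, "type C": 1.0})
displayed_total = sum(example.values())  # a little under 100 after rounding
```

Each entry is displayed as 33.3%, so the displayed total falls just short of 100%. The more relationship types there are, the more rounding steps there are to contribute to this.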
The data used for these predictions came from Ped-sim. In this case, the refined genetic map of Bhérer et al. (2017) was used as well as the crossover interference parameters of Campbell et al. (2015).