The Parameters of the Rotating qpAdm models
Mbuti.DG, CHG_Satsurblia, Mongolia_North_N, ONG.SG, IRQ_PPN, PPN, Russia_AfontovaGora3
Iran_GanjDareh_N, Serbia_IronGates_Mesolithic, WSHG, Turkey_N, EHG, Iran_C_SehGabi, ARM_Aknashen_N, ARM_Masis_Blur_N, Iran_HajjiFiruz_N, CHG_Kotias, Tajikistan_C_Sarazm, Azerbaijan_lowlands_LN, Hun_Vinca_MN, Ukraine_N, Russia_Caucasus_Eneolithic
Working Models for Khvalynsk_Eneolithic
qpAdm Models
Issues with carbon dating
The Middle Don Eneolithic samples from Allentoft et al 2022 preprint
What this suggests is:
Working Models for Steppe_Eneolithic
a. The Khvalynsk samples are older. Eg, Wilkin et al 2021 supplement states (emphasis mine)
These ruminant grazers would not have been subject to the reservoir effect, with dates of 4450–4355 cal BCE from Khvalynsk I and 4448–4362 cal BCE from Khvalynsk II. These two dates provide the best estimate currently available for the true age of the two cemeteries.
Whereas the 3 Steppe_Eneolithic samples are dated in Wang et al 2019 supplement as -
- PG2001: charcoal 4336-4173 cal BCE
- PG2004: 4233-4047 cal BCE
- VJ1001 with 2 dates (unknown where each came from, usually the younger one is without reservoir effect): Dating: 4332-4238 cal BCE & 4229–4065 cal BCE
Each of these three dates is younger than the two secure dates at Khvalynsk I & II.
b. There are no successful models for Steppe_Eneolithic after removing Khvalynsk_En from the list of possible sources. No 4 or even 5-source models with EHG as source instead of Khvalynsk_En provide p-values > pcutoff.
c. Golubaya Krinitsa samples with 25% 'CHG' ancestry as claimed by Allentoft et al are dated to 5300BCE. These seem similar to Khvalynsk samples rather than Steppe_Eneolithic, and therefore the Khvalynsk profile should be assumed to be older.
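The dating comparison in point a can be checked mechanically. The sketch below is only an illustration: it compares the calibrated ranges exactly as quoted above, and it ignores reservoir corrections and the shape of the calibration curve. The helper name `entirely_younger` is made up for this example.

```python
# Calibrated ranges in cal BCE as (older_bound, younger_bound), copied from the text above.
khvalynsk = {
    "Khvalynsk I": (4450, 4355),
    "Khvalynsk II": (4448, 4362),
}
steppe_en = {
    "PG2001": (4336, 4173),
    "PG2004": (4233, 4047),
    "VJ1001 (older date)": (4332, 4238),
    "VJ1001 (younger date)": (4229, 4065),
}


def entirely_younger(sample, reference):
    """True if the sample range begins after the reference range ends.

    In BCE a smaller number is a younger date, so the sample's older bound
    must be smaller than the reference's younger bound.
    """
    return sample[0] < reference[1]


all_younger = all(
    entirely_younger(s, k) for s in steppe_en.values() for k in khvalynsk.values()
)

# Smallest gap, in years, between any Steppe_En range and any Khvalynsk range.
min_gap = min(k[1] - s[0] for s in steppe_en.values() for k in khvalynsk.values())

print(all_younger, min_gap)
```

With the ranges as quoted, none of the Steppe_Eneolithic ranges overlaps either Khvalynsk range. The closest pair is PG2001 and Khvalynsk I.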
Steppe_En 4 source models
In all the passing models, 3 sources are constant - Khvalynsk_En, CHG & Sarazm_En. The 4th source varies but the common aspect is that those 4th sources provide minor Anatolian/Levantine ancestry.
G25 model for Progress_En
The legitimacy of using Sarazm_EN as a source?
Working Models for Yamnaya_Samara
Yamnaya 4-source models
It becomes immediately clear that the only passing models need Steppe_Eneolithic as a source. This suggests that Steppe_En or a population very similar to it is a true source for Yamnaya_Samara. We also see some additional Anatolian heavy sources from the south of Caucasus (7-15% ARM_N or HajjiFiruz_N). The 3rd source required is something with both EHG and WHG ancestries (something similar to Ukraine_N samples). 14-20% ancestry of Yamnaya is derived from such a source which lies on the EHG-Serbia_IronGates cline.
The Uniparental Markers
SC Asian Impact on Neolithization of the Steppe?
Whilst domesticates are integrated into Bug-Dniester subsistence strategies from the end of the VIIth millennium BC in southwestern Ukraine, the earliest reliable isotopic evidence for a reliance on terrestrial domesticates further east, in the Dnieper region, only occurs at the onset of the Eneolithic period, with the Trypillia farming culture, and in the Dnieper River region at the site of Molyukhov Bugor at c. 4000 calBC.
Eastern Caspian Pottery in Western Steppe till Baltic
Early pottery on the territory from the Eastern Caspian Sea and Aral Sea to Denmark reveals a certain typological similarity. It is represented by egg-shaped vessels with an S-shaped profile of the upper part and a pointed bottom.
From our point of view, a certain morphological and technological similarity of the earliest ceramic ware from the Central Asian area to the southwestern shores of the Baltic Sea is not associated with stage-by-stage phenomena. Apparently, it reflects the process of the spread of skills in pottery making over a considerable distance as a result of climatic and economical factors. In our opinion, the transfer of these skills occurred as a result of the interaction of representatives of multicultural groups with a complex hunter-gather economy in border areas and migrations of small groups of the population.
The authors conclude that the pottery found at the sites of Ayakagitma, Dzhebel (Jebel), and Lyavlyakan in Uzbekistan, with the earliest dates around ~6100 BCE, made its way to Elshankaya (Lebyazhinka IV at Samara, followed by the Samara culture where the Samara_HG sample was found). The authors further conclude that this type of pottery spread from east of the Caspian to as far as the shores of the Baltic Sea, and do not rule out small migrations.
Interestingly, note that Lasota-Moskalewska et al (2014) propose Ayakagitma (6000-5400 BCE) as the earliest site of horse domestication, and assert that 30-40% of animal remains belong to the Equidae family, some of them being horse bones, the rest belonging to domesticated livestock. At the very least, it proves that horses were present in this region. The authors were part of the Polish-Uzbek Archaeological Mission.
A much more detailed analysis of the steppe neolithic pottery has been done by FrankN here. He comes to similar conclusions.
39 comments:
There is already a sample from that area and near that time, called ZamanBaba_N. But that is poor quality and Reich lab said it was 'contaminated' when I made them aware because it had lots of 'steppe ancestry'. If this new Turkmenistan sample is published it will be very interesting and useful.
What is also interesting to me is that Iran_N and Steppe seem to form their own clusters, but Sarazm, which seems so far from Iran and has few Iran_N-like settlements nearby, still seems to cluster with Iran_N and other farmers.
Of course the difference between older (EHG) and newer (Steppe) populations in Eastern Europe is definitely coming from the South, rather than from mixing and forming in Europe.
I have not seen a reference to Hotu cave, can you include it in the analysis? It seems that there is an affinity of Hotu cave and the southern Urals, as said in the quotations here: http://new-indology.blogspot.com/2014/10/can-we-finally-identify-real-cradle-of.html
"the geometric microliths and points found in the mesolithic sites of the southern Urals are identical with the inventory of the remains found in Belt Cave, Hotu, Shanidar B, Karim Shahir, Zawi Chemi Shanidar, Jarmo and other sites in southwestern Asia - the area of the origin of domestication during the tenth to eighth millennium bc." "Taking into account that wild sheep are absent from the Urals and the surrounding areas, and that their region of origin was northern Mesopotamia and northern Iran, it can be assumed that stockbreeding was introduced to the Urals from Iran and the southern shores of the Caspian. The introduction of the 'southern' stockbreeding elements may date well back into the Mesolithic, possibly to the date of the appearance of the geometric microliths (ninth to seventh millennia bc)."
Hotu mesolithic sample is of low quality, not fit to be used as source.
That's a pity... BTW, in Leiden during the debate Reich said that a good source for the steppe ancestry is the Caspian Neolithic, he said to the NW of the Caspian. He did not mention the sites, so I don't know exactly what he meant, but an eastern origin would be confirmed. Do you have any idea of the samples and sites involved?
At present, most of Turkmenistan's population is located near the Iranian frontier. If the present population density has a distribution similar to that of the Eneolithic, and there were no borders, the people there should be identical to those of North Iran and the East Caucasus. All of these regions belong to the same geologic formation, the Kopet Dag mountain range.
This region trivially links the Caucasus with BMAC and yields routes for traders and generals, from Mongolians, Turks and Greeks, like Alexander, to move into China (Tocharians) and India. There are many, many small rivers that can support troops and travelers. But they funnel the routes, making it a very much disputed region of the Silk Road.
Indeed, the Silk Road was also the path for armies. You can see that Alexander followed exactly the paths of the Silk Road and conquering them meant conquering the regions in between.
The Kopet Dag is a part of the Alpide Belt, which is responsible for a large portion of the Silk Road. Perhaps this is what yielded the spread of PIE: the need to trade ores to make bronze in these mountain ranges.
I think that if you look at this, it seems reasonable to see it as a conduit for IE languages. https://en.wikipedia.org/wiki/Alpide_belt
"BTW, in Leiden during the debate Reich said that a good source for the steppe ancestry is the Caspian Neolithic, he said to the NW of the Caspian. He did not mention the sites, so I don't know exactly what he meant, but an eastern origin would be confirmed. Do you have any idea of the samples and sites involved"
Nope, no clue.
@daniel
Yes the silk road was important even in the bronze age and earlier.
Added this section:
Eastern Caspian Pottery in Western Steppe till Baltic
Andreev & Vybornov 2021 note,
"Early pottery on the territory from the Eastern Caspian Sea and Aral Sea to Denmark reveals a certain typological similarity. It is represented by egg-shaped vessels with an S-shaped profile of the upper part and a pointed bottom."
They conclude thus
"From our point of view, a certain morphological and technological similarity of the earliest ceramic ware from the Central Asian area to the southwestern shores of the Baltic Sea is not associated with stage-by-stage phenomena. Apparently, it reflects the process of the spread of skills in pottery making over a considerable distance as a result of climatic and economical factors. In our opinion, the transfer of these skills occurred as a result of the interaction of representatives of multicultural groups with a complex hunter-gather economy in border areas and migrations of small groups of the population."
The authors conclude that the pottery found at the sites of Ayakagitma, Dzhebel (Jebel), and Lyavlyakan in Uzbekistan, with the earliest dates around ~6100 BCE, made its way to Elshankaya (Lebyazhinka IV at Samara, followed by the Samara culture where the Samara_HG sample was found). The authors further conclude that this type of pottery spread from east of the Caspian to as far as the shores of the Baltic Sea, and do not rule out small migrations.
Interestingly, note that Lasota-Moskalewska et al (2014) propose Ayakagitma (6000-5400 BCE) as the earliest site of horse domestication, and assert that 30-40% of animal remains belong to the Equidae family, some of them being horse bones, the rest belonging to domesticated livestock. At the very least, it proves that horses were present in this region. The authors were part of the Polish-Uzbek Archaeological Mission.
I remember reading this on adnaera a while back: https://adnaera.com/2019/01/11/how-did-chg-get-into-steppe_emba-part-2-the-pottery-neolithic/
Author shows archaeological evidence of a south and east of caspian influence on the steppe.
Yes, a very good article by FrankN
@Vasistha what I mean is that the Silk Road was what transmitted the language group, not merely that it was important...
@Vasistha It is possible PIE was at a choke point of the Silk Road and became the lingua franca of the whole route and thus spread.
The analyses are very insightful as always! However, I think you should try adding 'Russia_Steppe_Eneolithic' in leftright list aka 'rotating sources' because Steppe_EN profile is contemporary to Khvalynsk and it deserves a spot. I ran the rotating models on ADMIXTOOLS2. Click here to view the results in google sheets.
Turns out, Steppe_EN beats Sarazm. I also ran Steppe_Eneolithic and Sarazm models in the original ADMIXTOOLS.
Khvalynsk_EN w/Steppe_EN (p-value = 0.735387)
Khvalynsk_EN w/Sarazm_C (p-value = 0.006064)
Now let's check gendstat of "Khvalynsk_EN w/Sarazm_C". The only gendstat with a significant Z-score shows that "Russia_Steppe_Eneolithic" is lacking in the model.
gendstat: Russia_Steppe_Eneolithic Ukraine_N 3.014
@Legend
"However, I think you should try adding 'Russia_Steppe_Eneolithic' in leftright list aka 'rotating sources' because Steppe_EN profile is contemporary to Khvalynsk and it deserves a spot."
This is not correct.
We have to select one of the populations to be the ancestor of the other. We cannot model Khvalynsk with Steppe_En as a source while at the same time model Steppe_En with Khvalynsk as a source. Given that both of these are extremely closely related populations, when provided as a choice Khvalynsk will indeed select Steppe_en as a source. This can also be seen via G25
Target: RUS_Khvalynsk_En
Distance: 2.2793% / 0.02279283
50.4 RUS_Karelia_HG
47.4 RUS_Progress_En
2.2 RUS_Samara_HG
0.0 CHN_Tarim_EMBA1
0.0 CHN_Tarim_EMBA2
0.0 GEO_CHG
0.0 IRN_Ganj_Dareh_Historic
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Seh_Gabi_C
0.0 IRN_Seh_Gabi_LN
0.0 IRN_Tepe_Hissar_C
0.0 IRN_TepeHissar_C
0.0 KAZ_Botai
0.0 TJK_Sarazm_En
Target: RUS_Progress_En
Distance: 2.8598% / 0.02859770
64.2 RUS_Khvalynsk_En
18.8 TJK_Sarazm_En
17.0 GEO_CHG
0.0 CHN_Tarim_EMBA1
0.0 CHN_Tarim_EMBA2
0.0 IRN_Ganj_Dareh_Historic
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Seh_Gabi_C
0.0 IRN_Seh_Gabi_LN
0.0 IRN_Tepe_Hissar_C
0.0 IRN_TepeHissar_C
0.0 KAZ_Botai
0.0 RUS_Karelia_HG
0.0 RUS_Samara_HG
An obvious way out of this quandary is to not use either Steppe_En or Khvalynsk_En as a source for the other when figuring out the source of the external ancestry. But the problem here is that Steppe_En cannot be modeled with EHG, and Khvalynsk_En as a source is a necessity (shown in point 2).
These are the 3 reasons why using Steppe_en as a source for Khvalynsk is problematic:
1. The Khvalynsk samples are older. Eg, Wilkin et al 2021 supplement states (emphasis mine)
"These ruminant grazers would not have been subject to the reservoir effect, with dates of 4450–4355 cal BCE from Khvalynsk I and 4448–4362 cal BCE from Khvalynsk II. These two dates provide the best estimate currently available for the true age of the two cemeteries."
Whereas the 3 Steppe_Eneolithic samples are dated in Wang et al 2019 supplement as -
PG2001: charcoal 4336-4173 cal BCE
PG2004: 4233-4047 cal BCE
VJ1001 with 2 dates (unknown where each came from, usually the younger one is without reservoir effect): Dating: 4332-4238 cal BCE & 4229–4065 cal BCE
Each of these three Steppe_En dates is younger than the two secure dates at Khvalynsk I & II.
2. There are no successful models for Steppe_Eneolithic after removing Khvalynsk_En from the list of possible sources. No 4 or even 5-source models with EHG as source instead of Khvalynsk_En provide p-values > pcutoff. The result of all 4 and 5-source models without khvalynsk are pasted in this workbook as extra sheets.
3. Golubaya Krinitsa (from Allentoft et al preprint) is apparently similar to Khvalynsk samples (25% 'CHG' similar to Khvalynsk) and is dated to 5300BCE, much earlier than Steppe_en.
The above issues don't arise when using Sarazm as a source (even though it is younger), because we have already checked that there can only be a one-way flow between Sarazm and steppe populations. This is due to the lack of EHG when modeling Sarazm. This lack of EHG in Sarazm is accepted by qpAdm (as shown in the post) and also by G25.
Target: TJK_Sarazm_En
Distance: 3.1721% / 0.03172062
43.0 IRN_Ganj_Dareh_N
22.0 IRN_Tepe_Hissar_C
16.4 CHN_Tarim_EMBA1
16.0 GEO_CHG
2.6 RUS_Progress_En
0.0 RUS_Karelia_HG
0.0 RUS_Khvalynsk_En
0.0 RUS_Samara_HG
0.0 CHN_Tarim_EMBA2
0.0 IRN_Seh_Gabi_C
0.0 IRN_Seh_Gabi_LN
0.0 IRN_TepeHissar_C
0.0 KAZ_Botai
To put it simply,
In qpAdm we cannot have a reference or source (in this case steppe_en) which has received geneflow from the target (in this case khvalynsk).
However we can indeed use such a population if the geneflow is from source to reference/target (and not from target to source/reference) (Harney et al 2021)
Therefore, modeling khvalynsk_en with steppe_en as a source breaks several rules.
@daniel,
Iranian is at the centre of the Silk road. If you read the most ancient Iranian literature, Shahname, the whole economic and political dominance of that civilization is based on controlling the silk road. Numerous times they mention how they have things from china, india, west asia etc
They say stuff like "The King was brought tribute, the best silk from china, indian swords, middle eastern whatever"
And placing Iranian in the BMAC makes sense because Balto-Slavic and Germanic, and middle East all share Iranian linguistic and cultural things rather than Indo Aryan.
"But the problem here is that Steppe_En cannot be modeled with EHG."
Actually, the problem is in Caucasus-related source. Lazaridis et al. has categorically discussed this.
Examining individuals from the steppe (Fig. 3), we observe that in the post–5000 BCE period, Caucasus-related ancestry is added to the previous Eastern hunter-gatherer population, forming the Eneolithic populations at Khvalynsk (9) and Progress-2 (17); this ancestry persisted in the Steppe Maykop population of the 4th millennium BCE (17). However, all of these populations before ~3000 BCE lack any detectible Anatolian/Levantine–related ancestry, contrasting with all contemporaneous ones from the Southern Arc, which have at least some such ancestry at least since the Neolithic (11). In all later periods in the Southern Arc, Caucasus hunter-gatherer–related ancestry is never found by itself but rather is always admixed, to various degrees, with Anatolian/Levantine ancestry. This implies that the proximal source of the Caucasus-related ancestry in the Eneolithic steppe should be sought in an unsampled group that did not experience Anatolian/Levantine–related gene flow until the Eneolithic. Plausibly, this population existed in the North Caucasus, from which Caucasus hunter-gatherer–related, but not Anatolian/Levantine–related, ancestry could have entered the Eneolithic steppe.
"Each of these three Steppe_En dates is younger than the two secure dates at Khvalynsk I & II."
That's not a problem to be honest. Time of formation of their profile matters, those sample dates don't tell us which one formed first. We can find this out by running DATES.
If we set EHG & Iranian-related (CHG and GanjDareh samples pooled) as source populations and all three Khvalynsk samples as test population, we get a mean time of admixture of ~1100±180 years before Khvalynsk_EN lived. ( output:log )
And with all three Steppe_EN as test population we get a mean time of admixture of ~2820±1090 years before Steppe_EN lived, with same sources. ( output:log )
Above results tell us that Steppe_EN's profile is definitely older than that of Khvalynsk's. Khvalynsk postdates the Samara_HG sample and a similar kind of date is received from DATES. Even if I set Khvalynsk_EN & Iranian_related (both used as sources in your models for Steppe_EN) as source populations for Steppe_EN, I still get a mean time of admixture of ~2570±950 years before Steppe_EN lived.
"There are no successful models for Steppe_Eneolithic after removing Khvalynsk_En from the list of possible sources."
Yeah, we gotta wait for better Caucasus-related samples from North Caucasus as Lazaridis et al. suggests. Sarazm is not the real solution.
"Golubaya Krinitsa (from Allentoft et al preprint) is apparently similar to Khvalynsk samples (25% 'CHG' similar to Khvalynsk) and is dated to 5300BCE, much earlier than Steppe_en."
I've already explained why this is not a problem. Although I don't have access to samples of Golubaya_Krinitsa in qpAdm, I have G25 coordinates of two samples. One of them has very little CHG, around 5-10%.
"In qpAdm we cannot have a reference or source (in this case steppe_en) which has received geneflow from the target (in this case khvalynsk)."
Yeah, I know about this rule but it's not being broken due to aforementioned reasons.
"Actually, the problem is in Caucasus-related source. Lazaridis et al. has categorically discussed this."
No, that is not the problem. The models work with Khvalynsk_En but not with EHG (have already shown this, and rotating result file is provided).
" we get a mean time of admixture of ~1100±180 years...
for Steppe_EN, I still get a mean time of admixture of ~2570±950 years before Steppe_EN lived."
Leaving aside whether the DATES calculation is correct or not:
At 95% confidence, that gives a formation date of 5140 to 5860 BCE for Khvalynsk, a nice tight range.
4920 to 8720BCE for Steppe_en, a very wide almost useless range.
"Above results tell us that Steppe_EN's profile is definitely older than that of Khvalynsk's."
Nothing in this calculation tells you that steppe_en is 'definitely older'. The lower range of steppe_en (4920BCE) is younger than that of Khvalynsk (5140BCE). In fact, the wide std error makes the output useless.
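The two ranges above can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: the sample dates of 4400 BCE (Khvalynsk) and 4250 BCE (Steppe_En) are not given in the comments and are simply the values that reproduce the quoted ranges, 95% is approximated as ±2 standard errors, and the Steppe_En figures used are the 2570±950 estimate. The function name is made up for the example.

```python
def admixture_range_bce(sample_date_bce, years_before, std_err, z=2.0):
    """Return (younger_bound, older_bound) in BCE for mean +/- z * std_err.

    BCE dates grow going back in time, so an admixture event that happened
    `years_before` the sample lived is at sample_date_bce + years_before.
    """
    center = sample_date_bce + years_before
    return center - z * std_err, center + z * std_err


# ASSUMPTION: sample dates back-calculated from the ranges quoted in the thread.
khvalynsk = admixture_range_bce(4400, 1100, 180)
steppe_en = admixture_range_bce(4250, 2570, 950)

# Two intervals overlap when each one starts before the other ends.
overlap = khvalynsk[0] <= steppe_en[1] and steppe_en[0] <= khvalynsk[1]

print(khvalynsk, steppe_en, overlap)
```

Under these assumptions the Khvalynsk interval lies entirely inside the Steppe_En interval, so the intervals alone cannot order the two admixture events.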
"Yeah, we gotta wait for better Caucasus-related samples from North Caucasus as Lazaridis et al. suggests. Sarazm is not the real solution."
That Sarazm is not the real solution is already known because of its date (3600 BCE). The fact is that Khvalynsk and Steppe_En require an IranN + ANE ancestry that can only be provided from the east, and not from the Caucasus. So the answer obviously lies in central and SC Asia.
While we are at the topic of DATES, which imo gives variable results based on sources used, Chintalapati et al has their whole paper based on DATES. https://doi.org/10.7554/eLife.77625
"To understand the timing of the formation of the early Steppe pastoralist-related groups, we applied DATES using pooled EHG-related and pooled Iranian Neolithic farmer-related individuals. Focusing on the groups with the largest sample sizes, Yamnaya Samara (n=10) and Afanasievo (n=19), we inferred the admixture occurred between 40 and 45 generations before the individuals lived, translating to an admixture timing of ~4100 BCE (Supplementary file 1B). We obtained qualitatively similar dates across four Yamnaya and one Afanasievo groups, consistent with the findings that these groups descend from a recent common ancestor (we note for the Ozera samples from Ukraine, the dates were not significant). This is also further supported by the insight that the genetic differentiation across early Steppe pastoralist groups is very low (FST ~ 0.000–0.006) (Supplementary file 2H). Thus, we combined all early Steppe pastoralist individuals in one group to obtain a more precise estimate for the genetic formation of proto-Yamnaya of ~4400–4000 BCE"
Their paper does not give such old admixture dates for the EHG + Iran admixture. Clearly the bulk of the admixture occurred post 4500 BCE. Note that they have been able to reduce the 95% (2 std dev) range to just 400 years, making it more reliable.
The above result (from the people who developed DATES) agrees with what I have written in this post. The first wave of IranN goes into Khvalynsk (and Golubaya Krinitsa). The second wave occurs much later, into Steppe Eneolithic. The average admixture date of the two admixtures thus falls between 4400-4000 BCE.
Finally, it must be noted that Lazaridis et al could not find a single successful model for Steppe_En. I have been searching for successful models for 3 years now, and I claim bluntly that no reasonable rotating model will work without Sarazm ancestry in Steppe_en (from all currently available samples)
Lazaridis state:
"Neither the Steppe Maykop nor the Eneolithic of the piedmont of the North Caucasus (Progress-2) fit the 3-way model. An examination of the outlier f4-statistics of the model (“dscore” lines of qpAdm output) indicates that the model underestimates shared genetic drift with both the Levantine PPN (Z=-5.6) and the Siberian Upper Paleolithic represented by AfontovaGora3 in the outgroup set (Z=-3.4) for the Steppe Maykop population; the same is also true for the Piedmont Eneolithic (Z=-3.8 for AfontovaGora3). This suggests the presence of “Siberian” ancestry in the Eneolithic steppe, as previously observed.(17)"
Ok, so it's clear to them too that some ANE is required. Did this ANE come from the Caucasus to the south, or from the east? Of course the east makes most sense.
Furthermore, if one removes Sarazm from the rotating source list, it doesn't mean that EHG + CHG + ANE will work for Steppe_En. I guarantee that it won't with a credible setup; I have tried many times. Only after 3 years have I finally come to the conclusion that the SC Asian region is the only possible solution to this problem, a region that the authors of Lazaridis et al won't even deign to consider.
@mzp1 and @vasistha I wonder what type of evidence could be used to show a spread of IE languages through the Silk Road. The number 7 and the symbol for the spoked wheel are found in the Indus seals. But the tradition we know from Indus signs, with the typical written symbols, unicorn, etc, begins at 2300 BC, 300 years or so after the beginning of mature Harappa. It is contemporary with the rise of the Akkadians in Sumer. So, perhaps a strengthening of a Semito-IE alliance of sorts in dominating the Silk Road?
What are these things that related Germanic to Iranian but not Indic?
iirc, weren't there samples from Khvalynsk with extra WSHG? Some clade of haplogroup Q was reported too I think.
Yes multiple Q y haplos have been found at khvalynsk
interesting, could they be some siberians who tagged along with a Sarazm like population which settled in the steppe? I don't know about SCA Y haplogroups much.
Central asian, also present in few SC asian samples. J1a2 at Khvalynsk is from SC asia. I think that we will find some sort Sarazm-ANE cline in kazakhstan (at least in south and central) between 4000-3000bce.
> Central asian, also present in few SC asian samples.
oh ok didn't know Q came down that south.
However the haplogroups bring up the issue of R1b and R1a. How did eastern european paternal groups dominate compared to SCA haplogroups. And is R1b-M269 eastern european or not?
R1b-m269 is eastern European yes.
"How did eastern european paternal groups dominate compared to SCA haplogroups."
Lazaridis says here https://twitter.com/iosif_lazaridis/status/1563953743535685637?s=20&t=4nk3r-fZh-qpqe44cC807g
"The simple CHG-EHG model gives a CHG estimate of:
51.9+/- 1.3% (autosomes)
34.2+/- 8.5% (chrX)
In other words, the evidence is (2.1 s.e.) in favor of male CHG bias and _not_ the opposite"
We can assume that between 5000 & 3100 BCE (formation of Yamnaya), R1b somehow took over.
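The "2.1 s.e." figure in the quoted tweet can be reproduced from the two estimates. The sketch below assumes the autosomal and chrX estimates are independent, so that the standard error of their difference is the square root of the sum of squared standard errors. That independence is an assumption of this example, not something stated in the tweet, and the function name is hypothetical.

```python
import math


def z_difference(est_a, se_a, est_b, se_b):
    """Z-score of (est_a - est_b), treating the two estimates as independent."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# CHG estimate in percent: autosomes 51.9 +/- 1.3, chrX 34.2 +/- 8.5.
z = z_difference(51.9, 1.3, 34.2, 8.5)
print(round(z, 1))
```

Almost all of the uncertainty comes from the chrX estimate, which is why a 17.7-point difference amounts to only about two standard errors.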
"The models work with Khvalynsk_En but not with EHG"
But in models with Khvalynsk, they utilize a Southern source with significant Anatolian/Levantine ancestry. This kind of ancestry is also rejected by Lazaridis for Steppe_EN, is it not? I'm confused here.
"4920 to 8720BCE for Steppe_en, a very wide almost useless range."
Why is it useless? Narasimhan et al. gave a similar range for Iran_N + AASI admixture event in Indus_Periphery samples. I can probably narrow the range down by adding more EHG samples...
"The fact is that khvalynsk and steppe_en require an iranN + ANE ancestry that can only be provided from the east, and not from caucasus. So the answer obviously lies in central and SC asia."
"However, what is clear is that ancestry with a clear affinity to Iranian/SC Asian populations entered this region (and not CHG related)."
Khvalynsk_EN = 79% EHG + 21% Sarazm, from your model. This means only 2% CHG input and 15% of Iran_N input into Khvalynsk, correct? This means we should see a greater Khvalynsk_EN-Iran_N affinity than Khvalynsk_EN-CHG affinity, right? But this isn't the case. f4-statistic output never shows a significant Z-score for geneflow/affinity between Khvalynsk_EN and Iran_N, but shows for Khvalynsk_EN and CHG.
Chimp.REF Russia_Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002061 -4.908 38136 39586 703656
Chimp.REF CHG Russia_Khvalynsk_Eneolithic Iran_GanjDareh_N -0.000203 -0.465 39443 39586 703656
Chimp.REF Iran_GanjDareh_N Russia_Khvalynsk_Eneolithic CHG 0.001858 4.266 39443 38136 703656
"Their paper does not give such old admixture dates for the EHG + Iran admixture."
Their paper doesn't even give admixture dates for Steppe_EN and other eneolithic samples, and guess what they don't even include CHG in their Iranian_related pool but samples like Aigyrzhal_BA which are a mix of various ancestries. EHG and CHG have to mix way earlier than 4400-4000 BCE for profiles like Golubaya Krinitsa, Khvalynsk, Middle Don and Steppe_EN to form.
Lazaridis even states in supplementary that by 4500 BCE Yamnaya (not Proto-Yamnaya) is already formed with the latest ancestry arrival being Southern Arc ancestry. So obviously EHG + CHG (proto-Yamnaya/Steppe_EN like) mixture has to be older, right?
"Yamnaya cluster individuals from Russia admixed 63.7±10.6 generations (1,785±297 years) prior to the time they lived using EHG or Eneolithic individuals from Russia as one source and Chalcolithic individuals from Armenia, Azerbaijan, Iran, and Turkey in the Southern Arc as the other; their average radiocarbon date is 2770BCE, placing admixture to the mid-5th millennium BCE."
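The "mid-5th millennium BCE" placement in the quote is simple arithmetic, sketched below with the figures from the quoted passage only. The generation length is not stated in the quote; the code just derives the value implied by the quoted conversion from generations to years.

```python
# Figures copied from the quoted passage.
generations, generations_se = 63.7, 10.6
years, years_se = 1785, 297
mean_sample_date_bce = 2770

# Generation length implied by the quoted conversion (not stated in the quote itself).
years_per_generation = years / generations

# BCE dates grow going back in time, so the admixture date is sample date + years.
center_bce = mean_sample_date_bce + years
one_se_range_bce = (center_bce - years_se, center_bce + years_se)

print(round(years_per_generation, 1), center_bce, one_se_range_bce)
```

Note that the range computed here is ±1 standard error, not a 95% interval.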
"Lazaridis et al could not find a single successful model for Steppe_En."
Yes, this is true but Lazaridis believes the unsampled CHG rich ancestor and a distinct ANE wave can explain Steppe_EN formation better.
"Only after 3 years I have finally come to the conclusion that SC asian region is the only possible solution to this problem, a region that the authors of Lazaridis et al won't even deign to consider."
Fair, I suppose better sampling in both North Caucasus and SC Asian region can help us conclude this issue.
From the f4 results I'm quite certain that the Sarazm model passes due to ANE presence, Iran_N doesn't really mean much for it as it prefers CHG in f4. You can see the same in G25.
Target: RUS_Khvalynsk_En
Distance: 3.4709% / 0.03470901
70.6 EHG
20.8 CHG
7.6 WSHG
1.0 TUR_Marmara_Barcin_N
0.0 IRN_Ganj_Dareh_N
0.0 Levant_PPN
0.0 SRB_Iron_Gates_HG
Target: Russia_Don_Neolithic_Golubaya_Krinitsa_5000BC:I12491
Distance: 2.6899% / 0.02689914
60.2 EHG
21.0 CHG
16.0 SRB_Iron_Gates_HG
2.8 TUR_Marmara_Barcin_N
0.0 IRN_Ganj_Dareh_N
0.0 Levant_PPN
0.0 WSHG
Target: Russia_Don_Neolithic_Golubaya_Krinitsa_5000BC:I12490
Distance: 2.3971% / 0.02397121
59.6 EHG
30.2 SRB_Iron_Gates_HG
5.8 CHG
3.2 WSHG
1.2 TUR_Marmara_Barcin_N
0.0 IRN_Ganj_Dareh_N
0.0 Levant_PPN
CHG's case is also stronger because Samara_HG has some CHG in it probably from North Caucasus CHG-like people.
"Finally, 3 Eneolithic individuals from Khvalynsk II in the Samara region of Russia(9) dated to 5000-4500BCE, a territory in which Yamnaya culture would later appear can be modeled on average as having mostly EHG ancestry, but also 21.5±1.7% CHG ancestry. These individuals postdate an EHG hunter-gatherer from Samara (Lebyazhinka IV)(8) by a thousand years. When we model this hunter-gatherer as a mixture of CHG and the two remaining EHG hunter-gatherers from Karelia(8), the resulting model fits (p=0.08) and assigns a non-significant 3.7±2.4% proportion of CHG ancestry."
This also explains the presence of J1 haplogroups in other EHGs.
The Karelia_HG + CHG model for Samara_HG even works on G25. Here too, CHG is preferred over Iran_N.
Target: RUS_Samara_HG
Distance: 2.7997% / 0.02799681
93.8 RUS_Karelia_HG
6.2 GEO_CHG
0.0 IRN_Ganj_Dareh_N
This f4 statistics run explains things very nicely.
result: Chimp.REF WSHG CHG Russia_Khvalynsk_Eneolithic 0.012597 24.075 37601 30009 602680
result: Chimp.REF CHG WSHG Russia_Khvalynsk_Eneolithic 0.002363 5.140 31433 30009 602680
result: Chimp.REF Russia_Khvalynsk_Eneolithic WSHG CHG -0.010234 -19.068 31433 37601 602680
result: Chimp.REF WSHG Iran_GanjDareh_N Russia_Khvalynsk_Eneolithic 0.013782 29.049 38152 29915 597714
result: Chimp.REF Iran_GanjDareh_N WSHG Russia_Khvalynsk_Eneolithic 0.001259 3.212 30667 29915 597714
result: Chimp.REF Russia_Khvalynsk_Eneolithic WSHG Iran_GanjDareh_N -0.012523 -25.028 30667 38152 597714
result: Chimp.REF Russia_Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002061 -4.908 38136 39586 703656
result: Chimp.REF CHG Russia_Khvalynsk_Eneolithic Iran_GanjDareh_N -0.000203 -0.465 39443 39586 703656
result: Chimp.REF Iran_GanjDareh_N Russia_Khvalynsk_Eneolithic CHG 0.001858 4.266 39443 38136 703656
As you can see, CHG and WSHG/ANE are much more important than Iran_N for Khvalynsk_EN.
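For anyone reading the columns in the output above, here is a minimal sketch of how the numbers relate. It assumes the columns after the four population labels are f4, Z, BABA count, ABBA count and SNP count (the usual qpDstat layout in f4 mode). The function names are made up for illustration. The counts are printed as rounded integers, so the recomputed f4 only matches to about the sixth decimal place.

```python
# Rows copied from the f4 output above (Chimp.REF is the first population in each).
# Assumed column order: pop2, pop3, pop4, f4, Z, BABA, ABBA, SNP count.
ROWS = [
    ("WSHG", "CHG", "Russia_Khvalynsk_Eneolithic", 0.012597, 24.075, 37601, 30009, 602680),
    ("WSHG", "Iran_GanjDareh_N", "Russia_Khvalynsk_Eneolithic", 0.013782, 29.049, 38152, 29915, 597714),
    ("Russia_Khvalynsk_Eneolithic", "CHG", "Iran_GanjDareh_N", -0.002061, -4.908, 38136, 39586, 703656),
    ("CHG", "Russia_Khvalynsk_Eneolithic", "Iran_GanjDareh_N", -0.000203, -0.465, 39443, 39586, 703656),
]


def f4_from_counts(baba, abba, n_snps):
    """f4 as the normalised difference between BABA and ABBA site counts."""
    return (baba - abba) / n_snps


def implied_standard_error(f4, z):
    """Standard error implied by the reported f4 and Z (since Z = f4 / SE)."""
    return abs(f4 / z)


for pop2, pop3, pop4, f4, z, baba, abba, n_snps in ROWS:
    recomputed = f4_from_counts(baba, abba, n_snps)
    se = implied_standard_error(f4, z)
    print(f"f4(Chimp, {pop2}; {pop3}, {pop4}): "
          f"reported {f4:+.6f}, recomputed {recomputed:+.6f}, implied SE {se:.6f}")
```

The sign only says which of the last two populations shares more alleles with the second one. It does not say why, which is the point argued further down in this thread.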
Yeah, well, until I did the analysis for this post I too thought that Khvalynsk had CHG. But sadly, no qpAdm models with CHG pass. Even if you remove Sarazm from the sources, only the models with Seh_Gabi_C pass. This goes to show that G25 cannot be relied upon (this was already known: it's an informal tool, and its coordinates were developed from an unknown, unpublished set of references, so it is prone to bias). So when rebutting me, kindly use qpAdm (rotating model setup), not G25.
The only label which needs excess CHG is Steppe_Eneolithic, which is just north of the Caucasus range and quite close to Georgia, where CHG was found.
As far as the J1 is concerned, the J1 at Khvalynsk is actually J1a2 (J-CTS1026, as per Anthony et al. 2022). The J1 in Karelia is irrelevant, as Karelia_HG doesn't have any Iran-related ancestry, and the Khvalynsk J1a2 is not a descendant of the Karelia J1 either.
The Karelia example also goes to show how Y haplogroups can travel without any detectable autosomal impact.
@Legend
"result: Chimp.REF WSHG CHG Russia_Khvalynsk_Eneolithic 0.012597 24.075 37601 30009 602680"
Stop using F4 like this, it doesn't make any sense. WSHG has a natural high affinity to the EHG in Khvalynsk. After all, EHG is ~50% ANE.
"As you can see, CHG and WSHG/ANE are much more important than Iran_N for Khvalynsk_EN."
Lol. You should really try to understand what those F4 mean. There are multiple reasons for high affinity, mostly to do with deep ancestry. I think now you are trying to use the 'Baffle with bullshit' strategy.
"result: Chimp.REF Russia_Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002061 -4.908 38136 39586 703656"
This shows higher affinity between CHG and Khvalynsk than between IranN and Khvalynsk. Before you go declaring victory, know that
f4(Chimp, EHG, CHG, IranN) is also significantly negative.
This again has to do with deep ancestry in CHG and EHG, not because there is CHG ancestry in EHG.
See this
Mbuti.DG EHG CHG Iran_GanjDareh_N -0.002144 0.000368 -5.834
Mbuti.DG Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002040 0.000381 -5.348
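A quick sketch to make the comparison concrete, assuming the three numeric columns above are f4, standard error and Z. It recomputes Z = f4 / SE for both rows, then compares the two f4 values with each other. That last step is only a rough illustration: it treats the two estimates as independent, which they are not, since they share three of the four populations. A proper test would need a joint block jackknife.

```python
from math import sqrt

# Values copied from the two rows above: (f4, standard error, reported Z).
STATS = {
    "EHG": (-0.002144, 0.000368, -5.834),
    "Khvalynsk_Eneolithic": (-0.002040, 0.000381, -5.348),
}

SIGNIFICANCE_Z = 3.0  # conventional |Z| threshold for f-statistics


def z_score(f4, se):
    """Z-score of an f4 statistic: the estimate divided by its standard error."""
    return f4 / se


for label, (f4, se, reported_z) in STATS.items():
    z = z_score(f4, se)
    verdict = "significant" if abs(z) >= SIGNIFICANCE_Z else "not significant"
    print(f"f4(Mbuti, {label}; CHG, Iran_N): Z = {z:.3f} "
          f"(reported {reported_z}), {verdict}")

# Rough comparison of the two rows (ASSUMPTION: independent errors, see above).
ehg_f4, ehg_se, _ = STATS["EHG"]
khv_f4, khv_se, _ = STATS["Khvalynsk_Eneolithic"]
rough_z_of_difference = (khv_f4 - ehg_f4) / sqrt(ehg_se**2 + khv_se**2)
print(f"Rough Z for the difference between the two rows: {rough_z_of_difference:.2f}")
```

The recomputed Z values differ from the reported ones in the second decimal place because the printed f4 and SE are rounded.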
The Lazaridis et al. Southern Arc paper makes the CHG/IranN point very clear. So please don't quote that paper to claim that only CHG works but IranN doesn't. They don't even try to model with IranN for most of the labels.
"These five sources should not be unduly emphasized beyond their utility as a descriptive convenience because (i) they could be swapped for related ones [e.g., Neolithic Iran captures much of the same deep ancestry as Caucasus hunter-gatherers do"
The only way to prove your case is to model Khvalynsk with CHG in a credible rotating qpAdm setup. qpAdm deals with multiple F4 stats to arrive at a conclusion about the model provided. Do that, don't try to baffle me with BS.
F3 admixture test for Khvalynsk_EN. All 4 sources X=CHG/IranN/SehGabiC/Sarazm give significant -ve F3 stats for F3(EHG,X;Khvalynsk)
Results here. Also added it to blogpost.
Of course, qpAdm narrows it down to SehGabiC and Sarazm. CHG and IranN are rejected in qpAdm.
Oh wait, I hadn't thought of it like this. Why didn't they use Iran_N? Lol, this makes no sense. Iran_N has to be there in the models. Using CHG alone is just stupid.
"WSHG has a natural high affinity to the EHG in Khvalynsk. After all, EHG is ~50% ANE."
Can it not differentiate between EHG and WSHG? Yeah they have ANE but the genetic drift must be massive, no?
"The only way to prove your case is to model Khvalynsk with CHG in a credible rotating qpAdm setup"
The issue is, it runs 30,000 models with that rotating list. Is there any way to fix number of sources to 2/3/4?
"Can it not differentiate between EHG and WSHG? Yeah they have ANE but the genetic drift must be massive, no?"
result: Chimp.REF WSHG CHG Russia_Khvalynsk_Eneolithic 0.012597 24.075 37601 30009 602680
Just look at the massive Z-score of 24. You get it because EHG is ~50% ANE and WSHG is also almost fully descended from ANE. F4(Chimp, WSHG; CHG, Khvalynsk) still shows massive shared drift between WSHG and the EHG in Khvalynsk, even after their non-shared drift.
F4 can't differentiate. That's what qpAdm is for: to balance all the F4 stats so that there is no overshooting in either direction with respect to the references.
"The issue is, it runs 30,000 models with that rotating list. Is there any way to fix number of sources to 2/3/4?"
Just run them with details=false; it's much faster and you get only the p-values. Later you can rerun only the successful models with details=true.
This assumes that the CHG and IranN that mixed with EHG were all J2 or some other non-R1 haplogroup.
Considering that Iranians and Kurds have been found to have many basal R1 clades, with a TMRCA that is very old and predates Sintashta or Yamnaya, it is possible that those CHG and IranN themselves carried some subclades of R1a and R1b.
And it makes sense, considering R1a and R1b are from ANE and both CHG and IranN are themselves 50% ANE.
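On limiting the number of sources: one option is to enumerate only the 2-, 3- and 4-source models yourself before handing them to qpAdm. The sketch below is plain Python and calls no qpAdm tool. It assumes the 15-population rotating list from the post is the pool, and that each unused pool member moves to the right (outgroup) side alongside the fixed outgroups. The function name is invented for illustration.

```python
from itertools import combinations
from math import comb

# ASSUMPTION: the rotating pool is the 15-population list given in the post.
ROTATING_POOL = [
    "Iran_GanjDareh_N", "Serbia_IronGates_Mesolithic", "WSHG", "Turkey_N", "EHG",
    "Iran_C_SehGabi", "ARM_Aknashen_N", "ARM_Masis_Blur_N", "Iran_HajjiFiruz_N",
    "CHG_Kotias", "Tajikistan_C_Sarazm", "Azerbaijan_lowlands_LN", "Hun_Vinca_MN",
    "Ukraine_N", "Russia_Caucasus_Eneolithic",
]


def models_with_k_sources(pool, k):
    """Yield (sources, rotated_outgroups) for every k-source model from the pool."""
    for sources in combinations(pool, k):
        rotated_outgroups = [pop for pop in pool if pop not in sources]
        yield list(sources), rotated_outgroups


# How many models each source count produces, versus every non-empty subset.
per_size = {k: comb(len(ROTATING_POOL), k) for k in (2, 3, 4)}
all_subsets = 2 ** len(ROTATING_POOL) - 1

print(per_size)                # models per source count
print(sum(per_size.values()))  # total for 2, 3 and 4 sources
print(all_subsets)             # every non-empty subset of the pool

# Example: the first 2-source model and the pool members rotated to the right.
first_sources, first_right = next(models_with_k_sources(ROTATING_POOL, 2))
print(first_sources, len(first_right))
```

With this pool, the 2- to 4-source models number under two thousand, while all non-empty subsets come to a figure in the same range as the roughly 30,000 mentioned above.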
Absence of evidence is not evidence of absence
@Freakk
"Considering that Iranians and Kurds have been found to have many basal R1 clades"
Which basal R1 clades in Iranians and Kurds?