Wednesday, September 28, 2022

Exploring the sources of the 'Southern ancestry' in the Steppe



The proximal source of the southern (IranN/CHG) ancestry in the Eneolithic Steppe as well as Yamnaya populations has been a mystery. Chintalapati et al 2022 infer that the admixture between Eastern European Hunter-Gatherers and Iran Neolithic ancestry occurred between 4400 and 4000BCE, using the DATES algorithm developed by the Moorjani Lab.

In this post, powered with newly published samples as well as better tools than before, I will attempt to decipher the true sources of these populations.

Disclaimer: The analysis presented below is based on the currently available published samples from the Pontic Caspian Steppe. The samples from the Middle Don region (Allentoft et al 2022 preprint) are unavailable, and this analysis will be updated once those samples become available.



 



The Parameters of the Rotating qpAdm models


Admixtools2 running on Windows R (RStudio)
F2 statistics computed using the extract_f2() function, with maxmiss = 1 (akin to allsnps: YES) and adjust_pseudohaploid = TRUE.

The references (aka 'right populations' or 'right pops') for the rotating models will consist of:

1. Fixed right pops: These will be always present in the right pops but will not be used as sources.
2. Rotating sources/references: These will be used as sources (a combination of N sources at a time), and when they are not in the source they will be used as references for qpAdm.

For the analysis that follows:

1. Fixed Right pops:  

Mbuti.DG, CHG_Satsurblia, Mongolia_North_N, ONG.SG, IRQ_PPN, PPN, Russia_AfontovaGora3

2. Rotating Sources: 

Iran_GanjDareh_N, Serbia_IronGates_Mesolithic, WSHG, Turkey_N, EHG, Iran_C_SehGabi, ARM_Aknashen_N, ARM_Masis_Blur_N, Iran_HajjiFiruz_N, CHG_Kotias, Tajikistan_C_Sarazm, Azerbaijan_lowlands_LN, Hun_Vinca_MN, Ukraine_N, Russia_Caucasus_Eneolithic 

The above will be used for modelling Khvalynsk_En. For modelling Steppe_Eneolithic, "Khvalynsk_En" will be added as a rotating source/reference. For modelling Yamnaya_Samara, "Steppe_Eneolithic" will also be added further as a rotating source/reference.
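The rotation scheme described above can be sketched in a few lines of Python. This is only an illustration of how the list of models is built, not the Admixtools2 code that was actually run; the function name rotating_models and the dictionary keys are made up for the example.

```python
from itertools import combinations

FIXED_RIGHT = [
    "Mbuti.DG", "CHG_Satsurblia", "Mongolia_North_N", "ONG.SG",
    "IRQ_PPN", "PPN", "Russia_AfontovaGora3",
]

ROTATING = [
    "Iran_GanjDareh_N", "Serbia_IronGates_Mesolithic", "WSHG", "Turkey_N",
    "EHG", "Iran_C_SehGabi", "ARM_Aknashen_N", "ARM_Masis_Blur_N",
    "Iran_HajjiFiruz_N", "CHG_Kotias", "Tajikistan_C_Sarazm",
    "Azerbaijan_lowlands_LN", "Hun_Vinca_MN", "Ukraine_N",
    "Russia_Caucasus_Eneolithic",
]


def rotating_models(target, rotating, fixed_right, n_sources):
    """Yield one model per combination of n_sources rotating populations.

    Rotating populations that are not chosen as sources are appended to
    the fixed right pops and act as references for that model.
    """
    for sources in combinations(rotating, n_sources):
        right = fixed_right + [p for p in rotating if p not in sources]
        yield {"target": target, "left": list(sources), "right": right}


# Khvalynsk_En uses the 15 rotating populations as listed.
khvalynsk_2src = list(rotating_models("Khvalynsk_En", ROTATING, FIXED_RIGHT, 2))

# Steppe_Eneolithic adds Khvalynsk_En to the rotating pool (16 populations).
steppe_pool = ROTATING + ["Khvalynsk_En"]
steppe_4src = list(rotating_models("Steppe_Eneolithic", steppe_pool, FIXED_RIGHT, 4))

print(len(khvalynsk_2src), len(steppe_4src))
```

With 16 populations in the pool, the 4-source rotation comes to 1820 models, which matches the number of Steppe_Eneolithic models reported further down.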

The p-value cutoff is kept at > 0.05. Weights and standard errors are only calculated if p > cutoff AND all weight coefficients are positive.
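As a rough sketch of the pass/fail rule above, assuming each model's result has already been collected into a plain dictionary (the keys "left", "p" and "weights" and both function names are hypothetical, and the numbers are invented, not results from this post):

```python
P_CUTOFF = 0.05


def is_working_model(p_value, weights, p_cutoff=P_CUTOFF):
    """A model passes if p > cutoff and every admixture weight is positive."""
    return p_value > p_cutoff and all(w > 0 for w in weights)


def rank_working_models(results, p_cutoff=P_CUTOFF):
    """Keep the passing models and sort them by p-value, highest first."""
    passing = [r for r in results if is_working_model(r["p"], r["weights"], p_cutoff)]
    return sorted(passing, key=lambda r: r["p"], reverse=True)


# Made-up numbers purely for illustration.
example_results = [
    {"left": ["A", "B"], "p": 0.40, "weights": [0.8, 0.2]},
    {"left": ["A", "C"], "p": 0.30, "weights": [1.1, -0.1]},  # negative weight
    {"left": ["A", "D"], "p": 0.01, "weights": [0.7, 0.3]},   # p below cutoff
    {"left": ["B", "C"], "p": 0.65, "weights": [0.5, 0.5]},
]

for model in rank_working_models(example_results):
    print(model["left"], model["p"])
```

Only the first and last toy models survive: one of the others has a negative weight and the other falls below the p-value cutoff.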

Results of all the rotating models are present in google sheets list format here.


Working Models for Khvalynsk_Eneolithic


Khvalynsk II lies in the Samara Oblast. The previous sample from this region is labelled Samara_HG (from Lebyazhinka IV), and its ancestry is ~100% Eastern Hunter-Gatherer (EHG). The carbon date of this sample is ~5600BCE. The adjusted carbon dates for the Khvalynsk samples lie around 4400BCE.

The significantly negative F3 stats suggest that Khvalynsk_En is an admixture between EHG & any of the 4 CHG/IranN-related populations listed in the output below.

Khvalynsk F3
-ve F3 stats for admixture test F3(EHG, X; Khvalynsk_En)
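For readers unfamiliar with the test, here is a minimal sketch of why a negative F3 points to admixture. It uses the plain definition (the mean over SNPs of (c - a)(c - b), with c the target) on invented allele frequencies. It leaves out the bias correction and block jackknife that real tools apply, so it gives no Z-score. Note that the function takes the target first, whereas the caption above writes it last.

```python
def f3(target, source_a, source_b):
    """Plain f3 estimate: mean over SNPs of (c - a) * (c - b).

    Inputs are equal-length lists of allele frequencies. No bias
    correction and no block jackknife, so no Z-score is produced.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("frequency lists must have the same length")
    products = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


# Toy allele frequencies, invented for the example.
pop_a = [0.10, 0.80, 0.30, 0.60, 0.95]
pop_b = [0.70, 0.20, 0.90, 0.10, 0.40]

# A target built as an 80/20 mixture of pop_a and pop_b.
mixed = [0.8 * a + 0.2 * b for a, b in zip(pop_a, pop_b)]

print(f3(mixed, pop_a, pop_b))   # negative: the target lies between the sources
print(f3(pop_a, mixed, pop_b))   # positive: pop_a is not a mix of the other two
```

In real data, drift in the target after admixture can push F3 back above zero, so a negative value is evidence for admixture while a positive one does not rule it out.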


qpAdm Models


2 source models for Khvalynsk_En show 21% Sarazm-like ancestry
2 source models for Khvalynsk_En


The only acceptable external sources for Khvalynsk_En are Tajik_Sarazm_En (21%) and Iran_SehGabi_C (18%).


Issues with carbon dating


The previous sample from this region is labelled Samara_HG (from Lebyazhinka IV), and its ancestry is ~100% Eastern Hunter-Gatherer (EHG). The carbon date of this sample is ~5600BCE. However, I do not know whether this date has been adjusted for the aquatic reservoir effect, as the related supplementary materials do not mention it.

The reservoir effect arises when humans eat fish and other aquatic foods, whose carbon is depleted in 14C relative to the atmosphere; such a diet also shows up as elevated δ15N values in human bone. The effect makes radiocarbon dates come out older than they really are. To correct for this, herbivore bones or charcoal and aquatic bones from the same context as the humans are carbon-dated, and the calibration is based on the offset between them. The herbivore bones or charcoal give the most accurate date, as they are unaffected by the reservoir effect.

As per Shishlina et al 2018, the reservoir effect at Lebyazhinka V is around 730 years. Lebyazhinka IV (the context in which Samara_HG was found) is the older layer, and logically a similar reservoir effect should apply to it. So it is indeed possible that the sample Samara_HG actually dates to around 5000BCE instead of ~5600BCE. I am unaware of any papers which adjust for the reservoir effect at Lebyazhinka IV; if anyone knows of one, kindly inform me in the comments.
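The back-of-the-envelope arithmetic behind that estimate can be written out as below. This is a crude shortcut that shifts the calendar date directly; a proper correction would subtract the offset from the uncalibrated 14C age and then recalibrate. The function name is made up for the example.

```python
def rough_reservoir_correction(date_bce, offset_years):
    """Shift a BCE date towards the present by the reservoir offset.

    Crude shortcut: a proper correction subtracts the offset from the
    uncalibrated 14C age and then recalibrates.
    """
    return date_bce - offset_years


samara_hg_uncorrected = 5600   # ~5600BCE, as quoted above
lebyazhinka_v_offset = 730     # ~730 years, Shishlina et al 2018

print(rough_reservoir_correction(samara_hg_uncorrected, lebyazhinka_v_offset))
```

This gives roughly 4870BCE, in the same range as the ~5000BCE suggested above.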

Similarly, for the 3 Khvalynsk II samples from Samara Oblast, the metadata gives the date of ~5000-4700BCE. However, post reservoir effect correction, the correct date of the Khvalynsk II samples is ~4400BCE (Wilkin et al 2021).


The Middle Don Eneolithic samples from Allentoft et al 2022 preprint


NEO113 & NEO212 from Golubaya Krinitsa (unpublished) possess ~25% CHG as per the preprint, the rest being EHG. So, these seem very similar to the Khvalynsk II samples. The dates of these samples post reservoir effect correction come to ~5300BCE (Allentoft et al supplement excel Table III). I am not sure if these corrections are based on C:N graphs alone or if actual herbivore bones from the same human context were also carbon-dated to provide the most accurate dates. My doubt comes because the reservoir effect for these samples is ~250 yrs which is much lower than the Progress samples (~700 years) and the Khvalynsk II, Lebyazhinka V samples (400-800 yrs). So it may be that the dates of these samples come to ~5000BCE, however, I am not competent enough to comment further on this matter.

For this analysis, I am assuming that the Golubaya Krinitsa samples have the same sources of ancestry as Khvalynsk II (or that these people are the same ones who migrated north to Khvalynsk). However, when the actual samples are published and later analyzed, this may or may not prove to be true. I will revisit this article once the Allentoft et al samples are published.

What this suggests is:


1. Sarazm_En-like ancestry first entered Golubaya Krinitsa before 5300BCE (maybe later if the reservoir effect wasn't calibrated accurately and was underestimated)

2a. Sarazm_En-like ancestry first entered Khvalynsk between ~5000BCE and ~4400BCE (it is absent in Samara_HG; refer to the above note about the reservoir effect), or

2b. Migrants from elsewhere with EHG + Sarazm_En ancestry entered the Khvalynsk region post 5000BCE.

More data from various periods and regions will clarify when exactly this ancestry entered the region. However, what is clear is that ancestry with a clear affinity to Iranian/SC Asian populations entered this region (and not CHG related). This can also be shown through uniparental markers (discussed later).


Working Models for Steppe_Eneolithic


Steppe_Eneolithic includes two samples from Progress II and one from Vonyuchka in south Russia. They are dated to around 4250BCE, after correcting for ~700yrs of reservoir effect (Wang et al 2019). One of the humans from Progress, PG2001, shows clear signs of (sheep) dairy consumption as per Scott et al 2022.

Note that Khvalynsk has been used to model Steppe_Eneolithic and not vice versa. There are 3 reasons for this.

a. The Khvalynsk samples are older. Eg, Wilkin et al 2021 supplement states (emphasis mine)

These ruminant grazers would not have been subject to the reservoir effect, with dates of 4450–4355 cal BCE from Khvalynsk I and 4448–4362 cal BCE from Khvalynsk II. These two dates provide the best estimate currently available for the true age of the two cemeteries.

 Whereas the 3 Steppe_Eneolithic samples are dated in Wang et al 2019 supplement as -

    1. PG2001: charcoal 4336-4173 cal BCE 
    2. PG2004: 4233-4047 cal BCE
    3. VJ1001 with 2 dates (unknown where each came from, usually the younger one is without reservoir effect): Dating: 4332-4238 cal BCE & 4229–4065 cal BCE

 Each of these three dates is younger than the two secure dates at Khvalynsk I & II.
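The comparison can be checked mechanically. The sketch below copies the calibrated ranges quoted above and tests whether each Steppe_Eneolithic range falls wholly after both Khvalynsk ranges; the helper name entirely_younger is hypothetical.

```python
# Calibrated ranges in cal BCE as (older bound, younger bound), copied from the quotes above.
KHVALYNSK = {
    "Khvalynsk I": (4450, 4355),
    "Khvalynsk II": (4448, 4362),
}
STEPPE_EN = {
    "PG2001": (4336, 4173),
    "PG2004": (4233, 4047),
    "VJ1001 (date 1)": (4332, 4238),
    "VJ1001 (date 2)": (4229, 4065),
}


def entirely_younger(candidate, reference):
    """True if the candidate range falls wholly after the reference range.

    With BCE years, larger numbers are older, so the candidate's older
    bound must be smaller than the reference's younger bound.
    """
    return candidate[0] < reference[1]


for sample, rng in STEPPE_EN.items():
    ok = all(entirely_younger(rng, ref) for ref in KHVALYNSK.values())
    print(sample, "younger than both Khvalynsk dates:", ok)
```

None of the Steppe_Eneolithic ranges overlaps either Khvalynsk range, so the ordering holds even at the extremes of the calibrated intervals.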

 

b. There are no successful models for Steppe_Eneolithic after removing Khvalynsk_En from the list of possible sources. No 4 or even 5-source models with EHG as source instead of Khvalynsk_En provide p-values > pcutoff.

 

c. Golubaya Krinitsa samples with 25% 'CHG' ancestry as claimed by Allentoft et al are dated to 5300BCE. These seem similar to Khvalynsk samples rather than Steppe_Eneolithic, and therefore the Khvalynsk profile should be assumed to be older.


Two- and three-source models do not work for Steppe_Eneolithic. So we present the output for all 1820 4-source models, ordered by (feasible=TRUE) p-values.

Steppe_En 4 source models

In all the passing models, 3 sources are constant - Khvalynsk_En, CHG & Sarazm_En. The 4th source varies but the common aspect is that those 4th sources provide minor Anatolian/Levantine ancestry.

We see that almost half of the ancestry is from Khvalynsk_En, 20% from CHG and 15-20% from a Sarazm-like population. The rest is contributed by a source heavy in Anatolian/Levantine admixture. This is in harmony with the informal G25 models for Progress_En, except that for some reason G25 does not detect the additional minor Anatolian/Levantine admixture.

G25 Progress
G25 model for Progress_En

Is there a Neolithic source which mimics CHG + Sarazm_En ancestry, and which can therefore serve as a single source contributing 35-40% of the ancestry of Steppe_Eneolithic? Such a source hasn't been found south of the Caucasus yet. However, the SC Asian region is very poorly sampled; the oldest samples are from 3600BCE Sarazm. So the possibility of such a population existing in the region is high. Allentoft et al will soon publish a 4700BCE sample from Monjukli Depe (Turkmenistan, Jeitun Culture), which purportedly can be modelled as 60% IranN + 35% CHG + 5% ANE (Allentoft et al supplement). 5500-5000BCE samples from the region may become a perfect source as well for Khvalynsk, Steppe_En and Yamnaya.

The other option is that CHG + Sarazm_En actually represent two separate population movements into the steppe region from 2 different regions. The datapoint which favours this scenario is that Khvalynsk can be simply modelled as EHG + Sarazm without the need for CHG ancestry. A lot more data from different regions and periods (especially from SC Asia and even the central steppe) is required to figure this out.

The legitimacy of using Sarazm_EN as a source?


Sarazm_En represents the oldest samples (~3600BCE) from the SC Asian region. Since they have been used as a source for the (older) steppe populations above, a legitimate question to ask is - What if the affinity seen between Sarazm and Steppe_En is due to gene flow from Steppe into Sarazm rather than the other way round? 

The direction of gene flow is easy to prove. Distal modelling of Sarazm_En shows that the samples need no input from EHG or Khvalynsk_En-related populations (These 2 are part of the references). The same was concluded in the distal modelling done by Narasimhan et al 2019. Since no EHG-rich steppe source is required, this means that Sarazm_En represents a distinct and independent southern population which can legitimately be used as a proxy to represent the 6000-5000BCE SC Asian region while modelling Steppe populations. More samples from SC Asia across periods are required to conclude this chapter of finding the true source for the southern ancestry in the Steppe.

Sarazm 4-Source models


Working Models for Yamnaya_Samara

Finally, the youngest of the 3 steppe populations analyzed in this post is Yamnaya_Samara. The best 3-source model gives only a marginal p-value of 0.04, so I show all 4-source models below.


Yamnaya 4-source models

It becomes immediately clear that the only passing models need Steppe_Eneolithic as a source. This suggests that Steppe_En or a population very similar to it is a true source for Yamnaya_Samara. We also see some additional Anatolian heavy sources from the south of Caucasus (7-15% ARM_N or HajjiFiruz_N). The 3rd source required is something with both EHG and WHG ancestries (something similar to Ukraine_N samples). 14-20% ancestry of Yamnaya is derived from such a source which lies on the EHG-Serbia_IronGates cline.


The Uniparental Markers

At Khvalynsk, the presence of mtDNA H13a2a (found at Tepe Anau, and later in Pakistani and Indian samples), and Y Hg J1-CTS1026 (J1a2; J1a2a/b derived subclades are found all over Eneolithic SC Asia and later) provide some connection between the 2 regions. mtDNA W3a1 (first found at Anau, sister clade W3a2 found at Namazga Tepe, and daughter clade W3a1b in Indus Periphery samples) is found in various steppe populations - Yamnaya (W3a1a), Corded Ware (W3a1c), and various Bell Beaker samples. mtDNA J1b1a1 (first found at Geoksyur) is also found at various Corded Ware-related sites, and at Sintashta.

The presence of SC Asian uniparental markers in the steppe provides additional support to the idea of autosomal gene flow into the steppe.

SC Asian Impact on Neolithization of the Steppe?


The Jeitun culture in Turkmenistan had already domesticated goats, sheep and cattle and practised agriculture by 6000BCE (Taylor et al 2021). Interestingly, the Khvalynsk site shows the oldest unambiguous evidence of domesticated goats and sheep in the region (Anthony et al 2022). Domesticated livestock was possibly introduced into the region from SC Asia via human-mediated migration. More research is required on this topic.



Budd et al 2020 state thus

Whilst domesticates are integrated into Bug-Dniester subsistence strategies from the end of the VIIth millennium BC in southwestern Ukraine, the earliest reliable isotopic evidence for a reliance on terrestrial domesticates further east, in the Dnieper region, only occurs at the onset of the Eneolithic period, with the Trypillia farming culture, and in the Dnieper River region at the site of Molyukhov Bugor at c. 4000 calBC.
So there was no animal domestication present till 4000-3500BCE in the Dnieper region of Ukraine, which includes the sites of Dereivka (later part of Sredny Stog Culture). Animal domestication was only present in SW Ukraine in the Bug-Dniester region. The ancestry of this region since the Mesolithic period is of the Ukr_N type, an admixture between EHG and WHG. This ancestry is different from that of the Volga Samara region, which lacks any WHG ancestry. Therefore, the spread of animal domestication from the west into the Volga region doesn't make much sense. Even the site of Golubaya Krinitsa (which has 2 outliers with IranN ancestry as per Allentoft et al) shows no signs of domesticated animals by 5300BCE [Allentoft et al supplement - fossil sea shells, Unio shells and products from their wings, bone decorations (wild boar fangs, beaver teeth and groundhogs), bone tools have been found]. Moreover, the Golubaya Krinitsa site has been classified as a Mariupol-type site showing similarities with the Dnieper culture [Dereivka related] and not with the Volga sites.


At the same time, there are other sites in the lower Volga region which show the presence of domesticated sheep. These are 4500BCE Kairshak VI and 5000-4700BCE Oroshaemoe (Vybornov et al 2018). These sites are said to be part of the Cis-Caspian cultures which are succeeded by the Khvalinian culture. Some remains of domesticated animals are also seen near upper Volga sites like Ekaterinovka Mys ~4700-4500BCE (Wilkin et al 2021).

The source of domestication at these 4 sites (Khvalynsk, Oroshaemoe, Kairshak, Ekaterinovka Mys) cannot be from the west (the Don or Dnieper regions, as they lack domestication). South & east (or SE) remain viable sources. But qpAdm models show the presence of Sarazm-like ancestry at Khvalynsk (as opposed to CHG), indicating that an SE source (Jeitun or Kelteminar cultures) brought ancestry as well as domestication to the Volga region.



Steppe domestication
Steppe Neolithic/Eneolithic sites - Domestication status of Sheep/Goat




Eastern Caspian Pottery in Western Steppe till Baltic


Andreev & Vybornov 2021 note,

Early pottery on the territory from the Eastern Caspian Sea and Aral Sea to Denmark reveals a certain typological similarity. It is represented by egg-shaped vessels with an S-shaped profile of the upper part and a pointed bottom.

 

They conclude thus

From our point of view, a certain morphological and technological similarity of the earliest ceramic ware from the Central Asian area to the southwestern shores of the Baltic Sea is not associated with stage-by-stage phenomena. Apparently, it reflects the process of the spread of skills in pottery making over a considerable distance as a result of climatic and economical factors. In our opinion, the transfer of these skills occurred as a result of the interaction of representatives of multicultural groups with a complex hunter-gather economy in border areas and migrations of small groups of the population.


The authors conclude that the pottery found at the sites of Ayakagitma, Dzhebel (Jebel), and Lyavlyakan in Uzbekistan, earliest dated to approximately 6100BCE, made its way to Elshankaya (Lebyazhinka IV at Samara, followed by Samara culture where the Samara_HG sample was found). The authors further conclude that this type of pottery spread from east of the Caspian to as far as the shores of the Baltic Sea, and do not rule out small migrations.

Interestingly, note that Lasota-Moskalewska et al (2014) propose Ayakagitma (6000-5400BCE) as the earliest site of horse domestication, and assert that 30-40% of animal remains belong to the Equidae family, some of them being horse bones, the rest belonging to domesticated livestock. At the very least, it proves that horses were present in this region. The authors were part of the Polish-Uzbek Archaeological Mission.

A much more detailed analysis of the steppe neolithic pottery has been done by FrankN here. He comes to similar conclusions.


CONCLUSION

The analysis done in this article suggests either a continuous or multi-wave geneflow from SC Asia into Steppe from 5500-4000BCE. One such option is hypothesized below. This geneflow may have followed the spread of pottery from central Asia and may have played a part in the neolithization of the western steppe. The source of this southern ancestry in the Eneolithic steppe and Yamnaya is important to locate the Proto-Indo-European homeland, as argued by Lazaridis et al 2022.

Migration from SC Asia to Steppe
Hypothesized migration from SC Asia to Steppe before the Eneolithic



Final Ancestry Sources for steppe
Most optimal models for various Steppe populations



REFERENCES


1. Allentoft, M. E., et al. "Population Genomics of Stone Age Eurasia". bioRxiv. Retrieved September 28, 2022, from https://www.biorxiv.org/content/10.1101/2022.05.04.490594v4 

2. Shishlina, N I, et al. “The Lebyazhinka Burial Ground (Middle Volga Region, Russia): New 14C Dates and the Reservoir Effect.” Radiocarbon, vol. 60, no. 2, 2018, pp. 681–690., doi:10.1017/RDC.2017.94.

3. Wilkin, S., Ventresca Miller, A., Fernandes, R. et al. Dairying enabled Early Bronze Age Yamnaya steppe expansions. Nature 598, 629–633 (2021). https://doi.org/10.1038/s41586-021-03798-4

4. Taylor, W.T.T., Pruvost, M., Posth, C. et al. Evidence for early dispersal of domestic sheep into Central Asia. Nat Hum Behav 5, 1169–1179 (2021). https://doi.org/10.1038/s41562-021-01083-y

5. Anthony, D., Khokhlov, A., Agapov, S., Agapov, D., Schulting, R., Olalde, I. & Reich, D. (2022).  The Eneolithic cemetery at Khvalynsk on the Volga River. Praehistorische Zeitschrift, 97(1), 22-67. https://doi.org/10.1515/pz-2022-2034

6. Lazaridis I, Alpaslan-Roodenberg S, Acar A, et al. The genetic history of the Southern Arc: A bridge between West Asia and Europe. Science. 2022;377(6609):eabm4247. doi:10.1126/science.abm4247

7. Lasota-Moskalewska, Alicja et al. "A Problem of the Earliest Horse Domestication. Data from the Neolithic Camp Ayakagytma 'The Site', Uzbekistan, Central Asia." (2014).

8. Andreev, Konstantin Mikhailovich and Vybornov, Alexander Alekseevich. "Ceramic Traditions in the Forest-Steppe Zone of Eastern Europe" Open Archaeology, vol. 7, no. 1, 2021, pp. 705-717. https://doi.org/10.1515/opar-2020-0169

9. N, F. (2019, January 11). How did chg get into Steppe_emba ? part 2 : The pottery neolithic. Ancient DNA Era. Retrieved September 29, 2022, from https://adnaera.com/2019/01/11/how-did-chg-get-into-steppe_emba-part-2-the-pottery-neolithic/ 

10. Budd, C., Potekhina, I., Lillie, M. (2020) Continuation of fishing subsistence in the Ukrainian Neolithic: diet isotope studies at Yasinovatka, Dnieper Rapids Archaeological and Anthropological Sciences, 12(2): 64 https://doi.org/10.1007/s12520-020-01014-4

11. Vybornov, A, et al. “Diet and Chronology of Neolithic-Eneolithic Cultures (from 6500 to 4700 Cal BC) in the Lower Volga Basin.” Radiocarbon, vol. 60, no. 5, 2018, pp. 1597–1610., doi:10.1017/RDC.2018.95.

12. Wang, CC., Reinhold, S., Kalmykov, A. et al. Ancient human genome-wide data from a 3000-year interval in the Caucasus corresponds with eco-geographic regions. Nat Commun 10, 590 (2019). https://doi.org/10.1038/s41467-018-08220-8

13. Scott, A., Reinhold, S., Hermes, T. et al. Emergence and intensification of dairying in the Caucasus and Eurasian steppes. Nat Ecol Evol 6, 813–822 (2022). https://doi.org/10.1038/s41559-022-01701-6

14. Manjusha Chintalapati, Nick Patterson, Priya Moorjani (2022) The spatiotemporal patterns of major human admixture events during the European Holocene eLife 11:e77625

39 comments:

mzp1 said...

There is already a sample from that area and near that time, called ZamanBaba_N. But that is poor quality and Reich lab said it was 'contaminated' when I made them aware because it had lots of 'steppe ancestry'. If this new Turkmenistan sample is published it will be very interesting and useful.

What is also interesting to me is that Iran_N and Steppe seem to form their own clusters, but Sarazm, which seems so far from Iran and has few Iran_N-like settlements around it, still seems to cluster with Iran_N and other farmers.

Of course the difference between older (EHG) and newer (Steppe) populations in Eastern Europe is definitely coming from the South, rather than mixing and forming in Europe.

Giacomo Benedetti said...

I have not seen a reference to Hotu cave, can you include it in the analysis? It seems that there is an affinity of Hotu cave and the southern Urals, as said in the quotations here: http://new-indology.blogspot.com/2014/10/can-we-finally-identify-real-cradle-of.html
"the geometric microliths and points found in the mesolithic sites of the southern Urals are identical with the inventory of the remains found in Belt Cave, Hotu, Shanidar B, Karim Shahir, Zawi Chemi Shanidar, Jarmo and other sites in southwestern Asia - the area of the origin of domestication during the tenth to eighth millennium bc." "Taking into account that wild sheep are absent from the Urals and the surrounding areas, and that their region of origin was northern Mesopotamia and northern Iran, it can be assumed that stockbreeding was introduced to the Urals from Iran and the southern shores of the Caspian. The introduction of the 'southern' stockbreeding elements may date well back into the Mesolithic, possibly to the date of the appearance of the geometric microliths (ninth to seventh millennia bc)."

vAsiSTha said...

Hotu mesolithic sample is of low quality, not fit to be used as source.

Giacomo Benedetti said...

That's a pity... BTW, in Leiden during the debate Reich said that a good source for the steppe ancestry is the Caspian Neolithic, he said to the NW of the Caspian. He did not mention the sites, so I don't know exactly what he meant, but an eastern origin would be confirmed. Do you have any idea of the samples and sites involved?

Daniel de França MTd2 said...

In the present time, most of Turkmenistan's population is located near Iran's frontier. If the present population density has a similar distribution to that of the Eneolithic, and without borders, they should be identical to that of North Iran and East Caucasus. All of these regions belong to the same geologic formation, the Kopet Dag mountain range.

This region trivially links the Caucasus with BMAC and yields routes for traders and generals, from Mongolians, Turks and Greeks, like Alexander, to move into China (Tocharians) and India. There are many, many small rivers that can support troops and travelers. But they funnel the routes, making it a very much disputed region of the silk road.

Indeed, the Silk Road was also the path for armies. You can see that Alexander followed exactly the paths of the Silk Road and conquering them meant conquering the regions in between.

Daniel de França MTd2 said...

The Kopet Dag is a part of the Alpide Belt, which is responsible for a large portion of the Silk Road. Perhaps this is what yielded the spread of PIE: the need to trade ores to make bronze in these mountain ranges.

I think that if you look at this, it seems reasonable to see it as a conduit for IE languages. https://en.wikipedia.org/wiki/Alpide_belt

vAsiSTha said...

"BTW, in Leiden during the debate Reich said that a good source for the steppe ancestry is the Caspian Neolithic, he said to the NW of the Caspian. He did not mention the sites, so I don't know exactly what he meant, but an eastern origin would be confirmed. Do you have any idea of the samples and sites involved"

Nope, no clue.

@daniel

Yes the silk road was important even in the bronze age and earlier.

vAsiSTha said...


Added this section to the post: "Eastern Caspian Pottery in Western Steppe till Baltic" (Andreev & Vybornov 2021 on early pottery, and Lasota-Moskalewska et al 2014 on Ayakagitma).

3rdacc said...

I remember reading this on adnaera a while back: https://adnaera.com/2019/01/11/how-did-chg-get-into-steppe_emba-part-2-the-pottery-neolithic/

Author shows archaeological evidence of a south and east of caspian influence on the steppe.

vAsiSTha said...

Yes, a very good article by FrankN

Daniel de França MTd2 said...

@Vasistha what I mean is that the silk road was what transmitted the language group, not merely just important...

Daniel de França MTd2 said...

@Vasistha It is possible PIE was in a choke point of the silk road and became the lingua franca of the whole route and thus spread.

Legend said...

The analyses are very insightful as always! However, I think you should try adding 'Russia_Steppe_Eneolithic' in leftright list aka 'rotating sources' because Steppe_EN profile is contemporary to Khvalynsk and it deserves a spot. I ran the rotating models on ADMIXTOOLS2. Click here to view the results in google sheets.

Turns out, Steppe_EN beats Sarazm. I also ran Steppe_Eneolithic and Sarazm models in the original ADMIXTOOLS.

Khvalynsk_EN w/Steppe_EN (p-value = 0.735387)
Khvalynsk_EN w/Sarazm_C (p-value = 0.006064)

Now let's check gendstat of "Khvalynsk_EN w/Sarazm_C". The only gendstat with a significant Z-score shows that "Russia_Steppe_Eneolithic" is lacking in the model.

gendstat: Russia_Steppe_Eneolithic Ukraine_N 3.014

vAsiSTha said...

@Legend

"However, I think you should try adding 'Russia_Steppe_Eneolithic' in leftright list aka 'rotating sources' because Steppe_EN profile is contemporary to Khvalynsk and it deserves a spot."

This is not correct.

We have to select one of the populations to be the ancestor of the other. We cannot model Khvalynsk with Steppe_En as a source while at the same time model Steppe_En with Khvalynsk as a source. Given that both of these are extremely closely related populations, when provided as a choice Khvalynsk will indeed select Steppe_en as a source. This can also be seen via G25

Target: RUS_Khvalynsk_En
Distance: 2.2793% / 0.02279283
50.4 RUS_Karelia_HG
47.4 RUS_Progress_En
2.2 RUS_Samara_HG
0.0 CHN_Tarim_EMBA1
0.0 CHN_Tarim_EMBA2
0.0 GEO_CHG
0.0 IRN_Ganj_Dareh_Historic
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Seh_Gabi_C
0.0 IRN_Seh_Gabi_LN
0.0 IRN_Tepe_Hissar_C
0.0 IRN_TepeHissar_C
0.0 KAZ_Botai
0.0 TJK_Sarazm_En

Target: RUS_Progress_En
Distance: 2.8598% / 0.02859770
64.2 RUS_Khvalynsk_En
18.8 TJK_Sarazm_En
17.0 GEO_CHG
0.0 CHN_Tarim_EMBA1
0.0 CHN_Tarim_EMBA2
0.0 IRN_Ganj_Dareh_Historic
0.0 IRN_Ganj_Dareh_N
0.0 IRN_Seh_Gabi_C
0.0 IRN_Seh_Gabi_LN
0.0 IRN_Tepe_Hissar_C
0.0 IRN_TepeHissar_C
0.0 KAZ_Botai
0.0 RUS_Karelia_HG
0.0 RUS_Samara_HG

An obvious method to get rid of this quandary is by not using either Steppe_En or Khvalynsk_En as sources for each other to figure out the source of the external ancestry. But the problem here is that Steppe_En cannot be modeled with EHG, and Khvalynsk_En as a source is a necessity (shown in point 2).

These are the 3 reasons why using Steppe_en as a source for Khvalynsk is problematic:

1. The Khvalynsk samples are older. Eg, Wilkin et al 2021 supplement states (emphasis mine)

"These ruminant grazers would not have been subject to the reservoir effect, with dates of 4450–4355 cal BCE from Khvalynsk I and 4448–4362 cal BCE from Khvalynsk II. These two dates provide the best estimate currently available for the true age of the two cemeteries."

Whereas the 3 Steppe_Eneolithic samples are dated in Wang et al 2019 supplement as -

PG2001: charcoal 4336-4173 cal BCE
PG2004: 4233-4047 cal BCE
VJ1001 with 2 dates (unknown where each came from, usually the younger one is without reservoir effect): Dating: 4332-4238 cal BCE & 4229–4065 cal BCE

Each of these three Steppe_En dates is younger than the two secure dates at Khvalynsk I & II.

2. There are no successful models for Steppe_Eneolithic after removing Khvalynsk_En from the list of possible sources. No 4 or even 5-source models with EHG as source instead of Khvalynsk_En provide p-values > pcutoff. The result of all 4 and 5-source models without khvalynsk are pasted in this workbook as extra sheets.

3. Golubaya Krinitsa (from Allentoft et al preprint) is apparently similar to Khvalynsk samples (25% 'CHG' similar to Khvalynsk) and is dated to 5300BCE, much earlier than Steppe_en.

The above issues don't arise when using Sarazm as a source (even though it is younger), because we have already checked that there can only be a one-way flow between Sarazm and steppe populations. This is due to the lack of EHG while modeling Sarazm. This lack of EHG in Sarazm is accepted by qpAdm (as shown in the post) and also by G25.

Target: TJK_Sarazm_En
Distance: 3.1721% / 0.03172062
43.0 IRN_Ganj_Dareh_N
22.0 IRN_Tepe_Hissar_C
16.4 CHN_Tarim_EMBA1
16.0 GEO_CHG
2.6 RUS_Progress_En
0.0 RUS_Karelia_HG
0.0 RUS_Khvalynsk_En
0.0 RUS_Samara_HG

0.0 CHN_Tarim_EMBA2
0.0 IRN_Seh_Gabi_C
0.0 IRN_Seh_Gabi_LN
0.0 IRN_TepeHissar_C
0.0 KAZ_Botai

vAsiSTha said...

To put it simply,

In qpAdm we cannot have a reference or source (in this case steppe_en) which has received geneflow from the target (in this case khvalynsk).

However we can indeed use such a population if the geneflow is from source to reference/target (and not from target to source/reference) (Harney et al 2021)

Therefore, modeling khvalynsk_en with steppe_en as a source breaks several rules.

mzp1 said...

@daniel,

Iranian is at the centre of the Silk road. If you read the most ancient Iranian literature, Shahname, the whole economic and political dominance of that civilization is based on controlling the silk road. Numerous times they mention how they have things from china, india, west asia etc

They say stuff like "The King was brought tribute, the best silk from china, indian swords, middle eastern whatever"

And placing Iranian in the BMAC makes sense because Balto-Slavic and Germanic, and middle East all share Iranian linguistic and cultural things rather than Indo Aryan.

Legend said...

"But the problem here is that Steppe_En cannot be modeled with EHG."

Actually, the problem is in Caucasus-related source. Lazaridis et al. has categorically discussed this.

Examining individuals from the steppe (Fig. 3), we observe that in the post–5000 BCE period, Caucasus-related ancestry is added to the previous Eastern hunter-gatherer population, forming the Eneolithic populations at Khvalynsk (9) and Progress-2 (17); this ancestry persisted in the Steppe Maykop population of the 4th millennium BCE (17). However, all of these populations before ~3000 BCE lack any detectible Anatolian/Levantine–related ancestry, contrasting with all contemporaneous ones from the Southern Arc, which have at least some such ancestry at least since the Neolithic (11). In all later periods in the Southern Arc, Caucasus hunter-gatherer–related ancestry is never found by itself but rather is always admixed, to various degrees, with Anatolian/Levantine ancestry. This implies that the proximal source of the Caucasus-related ancestry in the Eneolithic steppe should be sought in an unsampled group that did not experience Anatolian/Levantine–related gene flow until the Eneolithic. Plausibly, this population existed in the North Caucasus, from which Caucasus hunter-gatherer–related, but not Anatolian/Levantine–related, ancestry could have entered the Eneolithic steppe.

"Each of these three Steppe_En dates is younger than the two secure dates at Khvalynsk I & II."

That's not a problem to be honest. Time of formation of their profile matters, those sample dates don't tell us which one formed first. We can find this out by running DATES.

If we set EHG & Iranian-related (CHG and GanjDareh samples pooled) as source populations and all three Khvalynsk samples as test population, we get a mean time of admixture of ~1100±180 years before Khvalynsk_EN lived. ( output:log )

And with all three Steppe_EN as test population we get a mean time of admixture of ~2820±1090 years before Steppe_EN lived, with same sources. ( output:log )

Above results tell us that Steppe_EN's profile is definitely older than that of Khvalynsk's. Khvalynsk postdates the Samara_HG sample and a similar kind of date is received from DATES. Even if I set Khvalynsk_EN & Iranian_related (both used as sources in your models for Steppe_EN) as source populations for Steppe_EN, I still get a mean time of admixture of ~2570±950 years before Steppe_EN lived.

"There are no successful models for Steppe_Eneolithic after removing Khvalynsk_En from the list of possible sources."

Yeah, we gotta wait for better Caucasus-related samples from North Caucasus as Lazaridis et al. suggests. Sarazm is not the real solution.

"Golubaya Krinitsa (from Allentoft et al preprint) is apparently similar to Khvalynsk samples (25% 'CHG' similar to Khvalynsk) and is dated to 5300BCE, much earlier than Steppe_en."

I've already explained why this is not a problem. Although I don't have access to samples of Golubaya_Krinitsa in qpAdm, I have G25 coordinates of two samples. One of them has very little CHG, around 5-10%.

"In qpAdm we cannot have a reference or source (in this case steppe_en) which has received geneflow from the target (in this case khvalynsk)."

Yeah, I know about this rule but it's not being broken due to aforementioned reasons.

vAsiSTha said...

"Actually, the problem is in Caucasus-related source. Lazaridis et al. has categorically discussed this."

No, that is not the problem. The models work with Khvalynsk_En but not with EHG (have already shown this, and rotating result file is provided).

" we get a mean time of admixture of ~1100±180 years...
for Steppe_EN, I still get a mean time of admixture of ~2570±950 years before Steppe_EN lived."

Leaving aside whether the DATES calculation is correct or not:

At 95% confidence, that gives a formation date of 5140 to 5860BCE for Khvalynsk, a nice tight range.

4920 to 8720BCE for Steppe_en, a very wide almost useless range.

"Above results tell us that Steppe_EN's profile is definitely older than that of Khvalynsk's."

Nothing in this calculation tells you that steppe_en is 'definitely older'. The younger bound of the steppe_en range (4920BCE) is younger than that of Khvalynsk (5140BCE). In fact, the wide std error makes the output useless.
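The 95% ranges quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming a mean ± 2 standard error interval and sample dates of about 4400 BCE for Khvalynsk_En and 4250 BCE for Steppe_En. Those two sample dates are assumptions back-calculated from the quoted ranges, not values stated in the thread, and the Steppe_En range matches the ~2570±950 estimate rather than the ~2820±1090 one.

```python
def admixture_range_bce(sample_date_bce, mean_years, se_years, n_se=2):
    """Return (youngest, oldest) admixture date in BCE for mean +/- n_se * SE.

    DATES reports time before the sample lived, so the offset is added
    to the sample date when working in BCE.
    """
    youngest = sample_date_bce + (mean_years - n_se * se_years)
    oldest = sample_date_bce + (mean_years + n_se * se_years)
    return youngest, oldest


# Sample dates (4400 and 4250 BCE) are assumed, see the note above.
khvalynsk = admixture_range_bce(4400, 1100, 180)
steppe_en = admixture_range_bce(4250, 2570, 950)

# Two intervals overlap if each one starts before the other one ends.
overlap = steppe_en[0] <= khvalynsk[1] and khvalynsk[0] <= steppe_en[1]

print(khvalynsk)  # (5140, 5860)
print(steppe_en)  # (4920, 8720)
print(overlap)    # True
```

Because the two intervals overlap, these numbers alone cannot order the two admixture events.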

"Yeah, we gotta wait for better Caucasus-related samples from North Caucasus as Lazaridis et al. suggests. Sarazm is not the real solution."

That Sarazm is not the real solution is already known because of its date (3600 BCE). The fact is that khvalynsk and steppe_en require an iranN + ANE ancestry that can only be provided from the east, and not from caucasus. So the answer obviously lies in central and SC asia.

vAsiSTha said...

While we are on the topic of DATES, which imo gives variable results based on the sources used, Chintalapati et al have their whole paper based on DATES. https://doi.org/10.7554/eLife.77625

"To understand the timing of the formation of the early Steppe pastoralist-related groups, we applied DATES using pooled EHG-related and pooled Iranian Neolithic farmer-related individuals. Focusing on the groups with the largest sample sizes, Yamnaya Samara (n=10) and Afanasievo (n=19), we inferred the admixture occurred between 40 and 45 generations before the individuals lived, translating to an admixture timing of ~4100 BCE (Supplementary file 1B). We obtained qualitatively similar dates across four Yamnaya and one Afanasievo groups, consistent with the findings that these groups descend from a recent common ancestor (we note for the Ozera samples from Ukraine, the dates were not significant). This is also further supported by the insight that the genetic differentiation across early Steppe pastoralist groups is very low (FST ~ 0.000–0.006) (Supplementary file 2H). Thus, we combined all early Steppe pastoralist individuals in one group to obtain a more precise estimate for the genetic formation of proto-Yamnaya of ~4400–4000 BCE"

Their paper does not give such old admixture dates for the EHG + Iran admixture. Clearly the bulk of the admixture occurred post 4500BCE. Note that they have been able to reduce the 95% (2 std dev) range to just 400yrs, making it more reliable.

The above result (from the people who developed DATES) agrees with what I have written in this post. The 1st wave of IranN goes into Khvalynsk (and Golubaya Krinitsa). The second wave occurs much later, into Steppe Eneolithic. The average admixture date of the two admixtures thus falls between 4400-4000BCE.

vAsiSTha said...

Finally, it must be noted that Lazaridis et al could not find a single successful model for Steppe_En. I have been searching for successful models for 3 years now, and I claim bluntly that no reasonable rotating model will work without Sarazm ancestry in Steppe_en (from all currently available samples).

Lazaridis et al state:

"Neither the Steppe Maykop nor the Eneolithic of the piedmont of the North Caucasus (Progress-2) fit the 3-way model. An examination of the outlier f4-statistics of the model (“dscore” lines of qpAdm output) indicates that the model underestimates shared genetic drift with both the Levantine PPN (Z=-5.6) and the Siberian Upper Paleolithic represented by AfontovaGora3 in the outgroup set (Z=-3.4) for the Steppe Maykop population; the same is also true for the Piedmont Eneolithic (Z=-3.8 for AfontovaGora3). This suggests the presence of “Siberian” ancestry in the Eneolithic steppe, as previously observed.(17)"

Ok, so it's clear to them too that some ANE is required. Did this ANE come from the Caucasus to the south, or from the east? Of course the east makes most sense.

Furthermore, if one removes Sarazm from the rotating source list, it doesn't mean that EHG + CHG + ANE will work for steppe_en. I guarantee that it won't with a credible setup; I have tried many times. Only after 3 years I have finally come to the conclusion that SC asian region is the only possible solution to this problem, a region that the authors of Lazaridis et al won't even deign to consider.

Daniel de França MTd2 said...

@mzp1 and @vasistha I wonder what type of evidence could be used to support a spread of IE languages through the silk road. The number 7 and the symbol for the spoked wheel are found in the Indus seals. But the tradition we know from Indus signs, with the typical written symbols, unicorn, etc, begins at 2300BC, 300 years or so after the beginning of mature Harappa. It is contemporary with the rise of the Akkadians in Sumer. So, perhaps a strengthening of a Semito-IE alliance of sorts in dominating the silk road?

What are these things that relate Germanic to Iranian but not Indic?

3rdacc said...

iirc, weren't there samples from Khvalynsk with extra WSHG? Some clade of haplogroup Q was reported too I think.

vAsiSTha said...

Yes multiple Q y haplos have been found at khvalynsk

3rdacc said...

interesting, could they be some siberians who tagged along with a Sarazm like population which settled in the steppe? I don't know about SCA Y haplogroups much.

vAsiSTha said...

Central asian, also present in few SC asian samples. J1a2 at Khvalynsk is from SC asia. I think that we will find some sort of Sarazm-ANE cline in kazakhstan (at least in south and central) between 4000-3000bce.

3rdacc said...

> Central asian, also present in few SC asian samples.

oh ok didn't know Q came down that south.

However the haplogroups bring up the issue of R1b and R1a. How did eastern european paternal groups dominate compared to SCA haplogroups. And is R1b-M269 eastern european or not?

vAsiSTha said...

R1b-m269 is eastern European yes.

vAsiSTha said...

"How did eastern european paternal groups dominate compared to SCA haplogroups."

Lazaridis says here https://twitter.com/iosif_lazaridis/status/1563953743535685637?s=20&t=4nk3r-fZh-qpqe44cC807g

"The simple CHG-EHG model gives a CHG estimate of:

51.9+/- 1.3% (autosomes)
34.2+/- 8.5% (chrX)

In other words, the evidence is (2.1 s.e.) in favor of male CHG bias and _not_ the opposite"

So we can assume that between 5000 & 3100BCE (formation of yamnaya), R1b somehow took over.
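The "2.1 s.e." figure in the quoted tweet can be checked by hand. This is a minimal sketch, assuming the autosomal and chrX estimates are treated as independent so that their standard errors combine in quadrature; that assumption is not stated in the tweet, but it does reproduce the quoted figure.

```python
import math


def z_difference(est_a, se_a, est_b, se_b):
    """Z-score for the difference of two estimates, assuming independence."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# CHG estimates in percent: autosomes 51.9 +/- 1.3, chrX 34.2 +/- 8.5
z = z_difference(51.9, 1.3, 34.2, 8.5)
print(round(z, 1))  # 2.1
```

A lower CHG share on chrX than on the autosomes is what the tweet reads as male-biased CHG admixture.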

Legend said...

"The models work with Khvalynsk_En but not with EHG"

But in models with Khvalynsk, they utilize a Southern source with significant Anatolian/Levantine ancestry. This kind of ancestry is also rejected by Lazaridis for Steppe_EN, is it not? I'm confused here.

"4920 to 8720BCE for Steppe_en, a very wide almost useless range."

Why is it useless? Narasimhan et al. gave a similar range for Iran_N + AASI admixture event in Indus_Periphery samples. I can probably narrow the range down by adding more EHG samples...

"The fact is that khvalynsk and steppe_en require an iranN + ANE ancestry that can only be provided from the east, and not from caucasus. So the answer obviously lies in central and SC asia."

"However, what is clear is that ancestry with a clear affinity to Iranian/SC Asian populations entered this region (and not CHG related)."

Khvalynsk_EN = 79% EHG + 21% Sarazm, from your model. This means only 2% CHG input and 15% of Iran_N input into Khvalynsk, correct? This means we should see a greater Khvalynsk_EN-Iran_N affinity than Khvalynsk_EN-CHG affinity, right? But this isn't the case. f4-statistic output never shows a significant Z-score for geneflow/affinity between Khvalynsk_EN and Iran_N, but shows for Khvalynsk_EN and CHG.

Chimp.REF Russia_Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002061 -4.908 38136 39586 703656
Chimp.REF CHG Russia_Khvalynsk_Eneolithic Iran_GanjDareh_N -0.000203 -0.465 39443 39586 703656
Chimp.REF Iran_GanjDareh_N Russia_Khvalynsk_Eneolithic CHG 0.001858 4.266 39443 38136 703656

"Their paper does not give such old admixture dates for the EHG + Iran admixture."

Their paper doesn't even give admixture dates for Steppe_EN and other eneolithic samples, and guess what they don't even include CHG in their Iranian_related pool but samples like Aigyrzhal_BA which are a mix of various ancestries. EHG and CHG have to mix way earlier than 4400-4000 BCE for profiles like Golubaya Krinitsa, Khvalynsk, Middle Don and Steppe_EN to form.

Lazaridis even states in supplementary that by 4500 BCE Yamnaya (not Proto-Yamnaya) is already formed with the latest ancestry arrival being Southern Arc ancestry. So obviously EHG + CHG (proto-Yamnaya/Steppe_EN like) mixture has to be older, right?

"Yamnaya cluster individuals from Russia admixed 63.7±10.6 generations (1,785±297 years) prior to the time they lived using EHG or Eneolithic individuals from Russia as one source and Chalcolithic individuals from Armenia, Azerbaijan, Iran, and Turkey in the Southern Arc as the other; their average radiocarbon date is 2770BCE, placing admixture to the mid-5th millennium BCE."

"Lazaridis et al could not find a single successful model for Steppe_En."

Yes, this is true but Lazaridis believes the unsampled CHG rich ancestor and a distinct ANE wave can explain Steppe_EN formation better.

"Only after 3 years I have finally come to the conclusion that SC asian region is the only possible solution to this problem, a region that the authors of Lazaridis et al won't even deign to consider."

Fair, I suppose better sampling in both North Caucasus and SC Asian region can help us conclude this issue.
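For reference, the generations-to-calendar-date step in the Lazaridis et al quote above (63.7±10.6 generations, average radiocarbon date 2770BCE) can be sketched as follows. The 28 years per generation is an assumption: it is the value implied by 63.7 generations being reported as about 1,785 years, not a figure given in the thread.

```python
YEARS_PER_GENERATION = 28  # assumed; implied by 63.7 generations ~ 1,785 years


def generations_to_bce(mean_gen, se_gen, sample_date_bce,
                       years_per_gen=YEARS_PER_GENERATION):
    """Convert an admixture time in generations before the samples lived
    into (admixture date in BCE, mean years, standard error in years)."""
    mean_years = mean_gen * years_per_gen
    se_years = se_gen * years_per_gen
    return sample_date_bce + mean_years, mean_years, se_years


date_bce, mean_years, se_years = generations_to_bce(63.7, 10.6, 2770)
print(round(mean_years), round(se_years), round(date_bce))
```

The result lands within a couple of years of the quoted 1,785±297 years and puts the mean admixture date in the mid-5th millennium BCE, as the quote says.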

Legend said...

From the f4 results I'm quite certain that the Sarazm model passes due to ANE presence, Iran_N doesn't really mean much for it as it prefers CHG in f4. You can see the same in G25.

Target: RUS_Khvalynsk_En
Distance: 3.4709% / 0.03470901
70.6 EHG
20.8 CHG
7.6 WSHG
1.0 TUR_Marmara_Barcin_N
0.0 IRN_Ganj_Dareh_N
0.0 Levant_PPN
0.0 SRB_Iron_Gates_HG

Target: Russia_Don_Neolithic_Golubaya_Krinitsa_5000BC:I12491
Distance: 2.6899% / 0.02689914
60.2 EHG
21.0 CHG
16.0 SRB_Iron_Gates_HG
2.8 TUR_Marmara_Barcin_N
0.0 IRN_Ganj_Dareh_N
0.0 Levant_PPN
0.0 WSHG

Target: Russia_Don_Neolithic_Golubaya_Krinitsa_5000BC:I12490
Distance: 2.3971% / 0.02397121
59.6 EHG
30.2 SRB_Iron_Gates_HG
5.8 CHG
3.2 WSHG
1.2 TUR_Marmara_Barcin_N
0.0 IRN_Ganj_Dareh_N
0.0 Levant_PPN

CHG's case is also stronger because Samara_HG has some CHG in it probably from North Caucasus CHG-like people.

"Finally, 3 Eneolithic individuals from Khvalynsk II in the Samara region of Russia(9) dated to 5000-4500BCE, a territory in which Yamnaya culture would later appear can be modeled on average as having mostly EHG ancestry, but also 21.5±1.7% CHG ancestry. These individuals postdate an EHG hunter-gatherer from Samara (Lebyazhinka IV)(8) by a thousand years. When we model this hunter-gatherer as a mixture of CHG and the two remaining EHG hunter-gatherers from Karelia(8), the resulting model fits (p=0.08) and assigns a non-significant 3.7±2.4% proportion of CHG ancestry."

This also explains the presence of J1 haplogroups in other EHGs.

Legend said...

The Karelia_HG + CHG model for Samara_HG even works on G25. Here too, CHG is preferred over Iran_N.

Target: RUS_Samara_HG
Distance: 2.7997% / 0.02799681
93.8 RUS_Karelia_HG
6.2 GEO_CHG
0.0 IRN_Ganj_Dareh_N

Legend said...

This f4 statistics run explains things very nicely.

result: Chimp.REF WSHG CHG Russia_Khvalynsk_Eneolithic 0.012597 24.075 37601 30009 602680
result: Chimp.REF CHG WSHG Russia_Khvalynsk_Eneolithic 0.002363 5.140 31433 30009 602680
result: Chimp.REF Russia_Khvalynsk_Eneolithic WSHG CHG -0.010234 -19.068 31433 37601 602680
result: Chimp.REF WSHG Iran_GanjDareh_N Russia_Khvalynsk_Eneolithic 0.013782 29.049 38152 29915 597714
result: Chimp.REF Iran_GanjDareh_N WSHG Russia_Khvalynsk_Eneolithic 0.001259 3.212 30667 29915 597714
result: Chimp.REF Russia_Khvalynsk_Eneolithic WSHG Iran_GanjDareh_N -0.012523 -25.028 30667 38152 597714
result: Chimp.REF Russia_Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002061 -4.908 38136 39586 703656
result: Chimp.REF CHG Russia_Khvalynsk_Eneolithic Iran_GanjDareh_N -0.000203 -0.465 39443 39586 703656
result: Chimp.REF Iran_GanjDareh_N Russia_Khvalynsk_Eneolithic CHG 0.001858 4.266 39443 38136 703656

As you can see, CHG and WSHG/ANE are much more important than Iran_N for Khvalynsk_EN.
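For readers checking these rows: they look like qpDstat-style f4 output, where the columns after the four population names are f4, Z, BABA count, ABBA count and number of SNPs. Assuming that layout (it is not stated in the thread), f4 is simply (BABA - ABBA) / nsnps, and the three orderings of one quartet are linearly dependent. A minimal sketch using the Khvalynsk / CHG / Iran_GanjDareh_N rows:

```python
# (pop1, pop2, pop3, pop4, f4, Z, BABA, ABBA, nsnps), copied from the rows above
rows = [
    ("Chimp.REF", "Russia_Khvalynsk_Eneolithic", "CHG", "Iran_GanjDareh_N",
     -0.002061, -4.908, 38136, 39586, 703656),
    ("Chimp.REF", "CHG", "Russia_Khvalynsk_Eneolithic", "Iran_GanjDareh_N",
     -0.000203, -0.465, 39443, 39586, 703656),
    ("Chimp.REF", "Iran_GanjDareh_N", "Russia_Khvalynsk_Eneolithic", "CHG",
     0.001858, 4.266, 39443, 38136, 703656),
]


def f4_from_counts(baba, abba, nsnps):
    """f4 as the normalised excess of BABA over ABBA site patterns."""
    return (baba - abba) / nsnps


for row in rows:
    reported = row[4]
    recomputed = f4_from_counts(row[6], row[7], row[8])
    print(row[1], row[2], row[3], reported, round(recomputed, 6))

# Identity for one quartet: f4(O,B;A,C) - f4(O,A;B,C) = f4(O,C;A,B)
residual = rows[1][4] - rows[0][4] - rows[2][4]
print(abs(residual) < 1e-6)
```

The recomputed values agree with the reported ones up to rounding of the printed counts. Since the third row is determined by the first two, the three rows carry only two independent pieces of information.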

vAsiSTha said...

Yeah well, I myself thought that Khvalynsk had CHG until I did the analysis for this post. But sadly, no qpAdm models with CHG pass. Even if you remove Sarazm from the sources, only the models with Seh_Gabi_C pass. This goes to show that G25 cannot be relied upon (this was already known; it's an informal tool, and its coordinates were developed from an unknown and unpublished set of references, which makes it prone to bias). So when rebutting me, kindly use qpAdm (rotating model setup), not G25.

The only label which needs excess CHG is Steppe_eneolithic which is just north of the Caucasus range and is quite close to the country of Georgia where CHG was found.

As far as the J1 is concerned, the J1 at Khvalynsk is actually J1a2 (J-CTS1026 as per Anthony et al 2022). The J1 in Karelia is irrelevant, as Karelia_HG doesn't have any Iran-related ancestry, and the Khvalynsk J1a2 is not a descendant of the Karelia J1 either.

The Karelia example also goes to show how Y haplogroups can travel without any detectable autosomal impact.

vAsiSTha said...


@Legend

"result: Chimp.REF WSHG CHG Russia_Khvalynsk_Eneolithic 0.012597 24.075 37601 30009 602680"

Stop using F4 like this, it doesn't make any sense. WSHG has natural high affinity to the EHG in khvalynsk. after all EHG is ~50% ANE.

"As you can see, CHG and WSHG/ANE are much more important than Iran_N for Khvalynsk_EN."

Lol. You should really try to understand what those F4 mean. There are multiple reasons for high affinity, mostly to do with deep ancestry. I think now you are trying to use the 'Baffle with bullshit' strategy.

"result: Chimp.REF Russia_Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002061 -4.908 38136 39586 703656"

This shows higher affinity between CHG and Khvalynsk than between IranN & Khvalynsk. Before you go declaring your victory, know that

f4(Chimp, EHG, CHG, IranN) is also significantly negative.
This again has to do with deep ancestry in CHG and EHG, not because there is CHG ancestry in EHG.

See this
Mbuti.DG EHG CHG Iran_GanjDareh_N -0.002144 0.000368 -5.834
Mbuti.DG Khvalynsk_Eneolithic CHG Iran_GanjDareh_N -0.002040 0.000381 -5.348


The Lazaridis Southern Arc paper makes the CHG/IranN point very clear. So please don't quote that paper to claim that only CHG works but IranN doesn't. They don't even try to model with IranN for most of the labels.

"These five sources should not be unduly emphasized beyond their utility as a descriptive convenience because (i) they could be swapped for related ones [e.g., Neolithic Iran captures much of the same deep ancestry as Caucasus hunter-gatherers do"

The only way to prove your case is to model Khvalynsk with CHG in a credible rotating qpAdm setup. qpAdm deals with multiple F4 stats to arrive at a conclusion about the model provided. Do that, don't try to baffle me with BS.

vAsiSTha said...


F3 admixture test for Khvalynsk_EN. All 4 sources X=CHG/IranN/SehGabiC/Sarazm give significant -ve F3 stats for F3(EHG,X;Khvalynsk)

Results here. Also added it to blogpost.

Of course, qpAdm narrows it down to SehGabiC and Sarazm. CHG and IranN are rejected in qpAdm.

Legend said...

Ohh wait, I didn't think it like this. Why didn't they use Iran_N? Lol this makes no sense. Iran_N has to be there in models. Using CHG alone is just stupid.

"WSHG has natural high affinity to the EHG in khvalynsk. after all EHG is ~50% ANE."

Can it not differentiate between EHG and WSHG? Yeah they have ANE but the genetic drift must be massive, no?

"The only way to prove your case is to model Khvalynsk with CHG in a credible rotating qpAdm setup"

The issue is, it runs 30,000 models with that rotating list. Is there any way to fix number of sources to 2/3/4?

vAsiSTha said...

"Can it not differentiate between EHG and WSHG? Yeah they have ANE but the genetic drift must be massive, no?"

result: Chimp.REF WSHG CHG Russia_Khvalynsk_Eneolithic 0.012597 24.075 37601 30009 602680

Just look at the massive Z-score of 24. You get it because EHG is formed from 50% ANE & WSHG is also almost fully descended from ANE. F4(chimp, wshg, chg, khvalynsk) still shows massive shared drift between wshg and EHG, even after non shared drift.

F4 can't differentiate; that's what qpAdm is for: to balance all F4 stats so that there is no overshooting in either direction with respect to the references.

"The issue is, it runs 30,000 models with that rotating list. Is there any way to fix number of sources to 2/3/4?"

Just run them with details=false, much faster, get only the p-values. Later you can run the successful models only with details =true.
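On fixing the number of sources: the ~30,000 figure is about what you get from every non-empty subset of the 15 rotating populations (2^15 - 1 = 32,767), so restricting to 2, 3 or 4 sources cuts the list drastically. This is a minimal sketch of building such a restricted model list; it only enumerates left/right population sets from the lists in the post and does not call ADMIXTOOLS. The function name `rotating_models` is hypothetical.

```python
from itertools import combinations
from math import comb

FIXED_RIGHT = ["Mbuti.DG", "CHG_Satsurblia", "Mongolia_North_N", "ONG.SG",
               "IRQ_PPN", "PPN", "Russia_AfontovaGora3"]

ROTATING = ["Iran_GanjDareh_N", "Serbia_IronGates_Mesolithic", "WSHG",
            "Turkey_N", "EHG", "Iran_C_SehGabi", "ARM_Aknashen_N",
            "ARM_Masis_Blur_N", "Iran_HajjiFiruz_N", "CHG_Kotias",
            "Tajikistan_C_Sarazm", "Azerbaijan_lowlands_LN", "Hun_Vinca_MN",
            "Ukraine_N", "Russia_Caucasus_Eneolithic"]


def rotating_models(rotating, fixed_right, n_sources):
    """Yield (left, right) for every model with exactly n_sources sources.

    Rotating populations not used as sources join the right populations.
    """
    for left in combinations(rotating, n_sources):
        right = fixed_right + [p for p in rotating if p not in left]
        yield list(left), right


counts = {k: comb(len(ROTATING), k) for k in (2, 3, 4)}
print(counts)                   # {2: 105, 3: 455, 4: 1365}
print(2 ** len(ROTATING) - 1)   # 32767

first_left, first_right = next(rotating_models(ROTATING, FIXED_RIGHT, 2))
print(first_left, len(first_right))
```

That is 1,925 models in total for 2 to 4 sources, which can then be run for p-values only and re-run in detail for the passing ones.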

Freakk said...

This assumes that the chg and iranN that mixed with ehg were all J2 or some other non R1 haplogroup.

Considering that we have found that Iranians and Kurds have many basal r1 clades and their tmrca is very old, predating Sintashta or Yamnaya, it could be possible that those chg and iranN themselves had some subclades of r1a and r1b.
And it makes sense considering r1a and r1b are from ANE and both chg and iranN are themselves 50% ANE.
Absence of evidence is not evidence of absence

vAsiSTha said...

@Freakk

"Considering that we have found that Iranians and Kurds have many basal r1 clades"

Which basal R1 clades in iranians and Kurds?