Sunday, December 18, 2022

Steppe Ancestry definitely arrived in India post 1000 BCE - The Final Blow


Karna
कर्ण (karNa), by @zdrava



In my first post on this topic - 'The True source of steppe ancestry in modern Indians' - I laid out the genetic claim that the best source for steppe ancestry in Indians seems to be a sample from the Yaz II Iron Age culture site at Takhirbai 3, Turkmenistan (short name of the sample - TKM_IA), dated to approximately 850 BCE. Based on other archaeological, anthropological, epigraphic and literary evidence which also supported this claim, I proposed that only a post-1000 BCE movement of these 'proto-Śāka' people could explain the steppe ancestry present in modern Indians. The main reason to reject the view that steppe folks came to the core Vedic region (of Punjab, Haryana and East UP) around 1500 BCE, and gave Indians the Vedic culture and Indo-European languages, is the fact that absolutely no archaeological evidence exists to support such a big claim. This difference is important - a post-1000 BCE movement of Iranian-speaking proto-Śāka into India cannot have brought the Vedic culture and language. Nor does it match the evidence from the Rig Veda, which describes a pre-Iron Age life. Do read that article to understand all the evidence in support of my claim.


The finding of TKM_IA as the steppe source for Indians is an improvement upon the conclusion made by Narasimhan et al, 2019 - that the Central Steppe bronze age people are the source of Steppe ancestry in Indians. Since it is not likely that such an ancestry reached Indians in an undiluted form (there was a large presence of people with BMAC ancestry between the steppe and India), TKM_IA being a 57/43 mixture of Steppe Andronovo/BMAC ancestry (Guarino-Vignon et al, 2022) makes a lot more sense as the proximal mediator of that ancestry into Indians, especially in the absence of compelling archaeological evidence of Andronovo culture reaching India.

The article received a rebuttal from a Twitter commentator @The_Equationist, who claimed that one high steppe % outlier sample from ~920 BCE Loebanr (aka Loebanr_o) among ~100 Swat valley, Pakistan Iron age samples represented the people who were the true steppe source in Indians. He also claimed that such ancestry was already present in Indians deep inside the Vedic heartland of Punjab/Haryana by 1000 BCE and this woman was a migrant from deeper inside India into Swat. I wrote the second article 'The true source of the steppe ancestry in modern Indians (continued)' as a response to that supposed rebuttal. While it is not a good habit to model people using ancient outlier samples, Loebanr_o indeed is a better fit for modern Indians. However, I found that Loebanr_o itself had 50% ancestry from TKM_IA, and presented us with the most proximal link of the steppe ancestry source in Indians via 

Steppe_MLBA (1900 BCE) > TKM_IA (~1100-900 BCE) > Loebanr_o (~1000BCE) > Modern Indians. 

I also presented the case as to why Loebanr_o represents one of the earliest movements of such people into India from the north.

Not willing to give up, @The_Equationist has posted a rebuttal to my second post, doubling down on the stance that 920 BCE Loebanr_o was a migrant from an already steppe-rich Punjab/Haryana. He believes that a steppe-rich people established the Vedic culture in Punjab/Haryana ~1500 BCE. What follows from here is his claim and my final rebuttal. I write these posts (and not merely rebut on Twitter itself), because

1. I want this content and research not to get lost, to be ready for quick reference; and

2. I want my hypothesis to be bulletproof, and any rebuttal is welcome because I don't get many of those even after advertising for them to the opposing camp.


Let's get to the meat of the matter


His Twitter response is available here. The meat of his claim is pasted below.


Ummon's rebuttal





To simplify it for readers, he is saying that the tool DATES finds the admixture between two ancestry components (in this case, Steppe and a local Indian source) in Loebanr_o to have happened by the 13th century BCE at the latest. So, he claims that rich steppe admixture was already present in India by 1300 BCE and that such a migrant came to, died and got buried at the Loebanr site in Swat valley around 920 BCE. I note that no proof is provided for the claim that the source of this Loebanr_o ancestry is the Punjab/Haryana region. In my previous post, I noted that Loebanr_o has mtDNA Hg T1a1, which means that she has direct maternal descent from the steppe. T1a1 is a steppe marker, absent in India/Iran/BMAC/SC Asia in the bronze age but present in appreciable frequency at Sintashta ~1900 BCE. Also, her autosomal ancestry can be modelled simply as Loebanr_IA + TKM_IA (see my previous article). Both these components are from the north of Punjab/Haryana. The extra AASI (native Indian deep ancestry very distantly related to the Onge Andamanese) present in modern Haryanvi, Punjabi and all Indians is absent in Loebanr_o. The only supporting evidence Ummon has is his own pet hypothesis that steppe ancestry must be present in Haryana by 1500 BCE. In my previous post, I asked him to provide archaeological evidence for such a claim, which he has not yet provided.



Now, coming to the problems in his DATES analysis:



1. DATES is not capable of finding the admixture timing in case of multiple admixture events in the target. It assumes a one-shot admixture. From the supplement -

"We note that we currently do not model multiple admixture events in DATES and this in principle leads to confounding of the admixture timing. To mitigate the effects of the confounding, we focus on the most recent admixture event only, for which we have the highest power and minimal confounding."
Loebanr_o is admixed between Loebanr_IA and TKM_IA. Loebanr_IA itself has a minor (15-20%) steppe admixture dating to ~1650 BCE.

So, if we provide SouthIndian + Steppe as sources in DATES for Loebanr_o as the target, the result will be biased towards dates older than the actual timing of the most recent admixture (Loebanr_IA + TKM_IA) in Loebanr_o, because the older steppe admixture already carried by Loebanr_IA confounds the estimate. This has to be kept in mind. The same problem does not apply to SPGT because we assume no existing steppe ancestry in Swat (also confirmed by the Aligrama_IA samples from Swat; all 3 have minimal to no steppe).

"Although the three individuals are possibly from three of the 35 protohistoric graves of Aligrama excavated in autumn 1981, these individuals are not grouped with the individuals from the other Swat Protohistoric Grave sites that are assigned the SPGT analysis label, because we empirically observe that they have less Steppe pastoralist-related ancestry than the SPGT." (Narasimhan et al 2019 supplement)

2. He has again provided absolutely no details of his DATES parameters and samples used. In this thread, he mentioned that if asked he will provide it. I am not going to ask for it. Any half-serious work should provide all the details needed to replicate the results, especially after I pointed out the same issue in his previous attempt at rebuttal. 

3. He provides an extremely noisy scatter plot as the DATES output which does not seem statistically significant by any stretch of the imagination.

4. It is not possible anyway to produce a statistically significant DATES output for a single target sample. You can also see it in the noisy scatter plot on which an exponential decay function has been force-fitted by DATES. There just isn't enough power to accept the result as valid. In the next section, I will show you what a statistically significant DATES decay curve looks like.

5. He uses only 1 std error to calculate the range of dates for the admixture. That only provides 68% confidence. The academic standard is 2 std errors for 95% confidence (giving a wider range but higher confidence of lying in that range), as used by Narasimhan et al, 2019 and Chintalapati et al, 2022. This point is about statistics, not genetics: Mean ± 1 std error gives a 68% confidence range and Mean ± 2 std errors gives a 95% confidence range, which means that the actual value lies in that range with 95% probability.
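The 68% and 95% figures in point 5 can be checked in a few lines. This is a minimal sketch that assumes the estimate is approximately normally distributed around its mean; the function name `normal_coverage` is my own and not part of DATES.

```python
from statistics import NormalDist


def normal_coverage(k: float) -> float:
    """Probability mass of a normal distribution within k standard errors of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2):
    # Assumption: the sampling distribution of the estimate is roughly normal.
    print(f"mean ± {k} std error covers {normal_coverage(k):.1%}")
```

Under that assumption the two intervals cover about 68% and 95% respectively, which is why a range built from 1 std error is much easier to miss than one built from 2.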




CALIBRATING DATES



The theory behind DATES is provided in the supplementary PDF of Narasimhan et al, 2019 and in Chintalapati et al 2022. I will try to explain it without math, in layman's terms. The older the admixture in the target, the smaller the blocks of each ancestry scattered across the genome. Recent admixture leaves larger chunks of the source populations' genomes intact, but fewer of them. This means that, for a recent admixture, two SNPs separated by a genetic distance of d centimorgans on a chromosome should be more strongly correlated than if the admixture were much older. In the older case, the source population genomes would have been split into a larger number of smaller chunks with each passing generation.

In math terms, "We can compute a covariance coefficient of 𝑥(𝑖) and 𝑥(𝑗), where (i,j) are SNPs separated by genetic distance d. We expect this to decay exponentially with genetic distance, with a decay rate depending on the time since admixture (expressed as the number of generations since admixture)." (Narasimhan et al 2019 supplement. Read it to understand the math in detail). 
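To make the decay idea concrete, here is a toy sketch in Python. It is not the DATES implementation: it generates an idealised, noise-free curve of the form A·exp(−n·d), with d in Morgans and n the number of generations since admixture, and recovers n from the slope of the log of the curve. The function names and the amplitude value are illustrative assumptions.

```python
import math


def simulate_decay(n_gens, amplitude, distances_cm):
    """Idealised, noise-free covariance curve: A * exp(-n * d), d converted to Morgans."""
    return [amplitude * math.exp(-n_gens * d / 100.0) for d in distances_cm]


def fit_generations(distances_cm, covariances):
    """Least-squares slope of log(covariance) against distance in Morgans.

    The negative of the slope estimates the generations since admixture.
    """
    xs = [d / 100.0 for d in distances_cm]
    ys = [math.log(c) for c in covariances]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


distances = [0.5 * i for i in range(1, 41)]      # 0.5 cM to 20 cM
recent = simulate_decay(10, 0.02, distances)     # hypothetical recent admixture
older = simulate_decay(60, 0.02, distances)      # hypothetical older admixture

print(round(fit_generations(distances, recent), 2))
print(round(fit_generations(distances, older), 2))
```

The older curve falls off much faster with distance, which is the signal the method relies on. Real data are noisy, so a log-linear fit like this would not be adequate in practice; the point is only to show how the decay rate maps to generations.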

To test DATES, I ran the tool on the 80 Swat Valley Iron Age samples as the target. They are labelled as SPGT (same as in Narasimhan et al 2019).

Source 1a is chosen as Steppe_MLBA. This label contains 149 samples from various steppe-related bronze age sites. They represent the steppe ancestry.

Source 1b is chosen as Yamnaya. This contains 64 samples from various Yamnaya-related sites. They represent an earlier, EBA version of steppe ancestry.

Source 2 is labelled as SouthIndian.SG. It contains 216 modern samples from south India with very low steppe - including Telugu from the UK, Sri Lankan Tamils from the UK, Tamil Vellalar & Pulliyar.

Source 1a + Source 2 is the same setup as that of Narasimhan et al 2019. They use these 2 broad sources to find the date of admixture between Steppe and local ancestry in (the ancestors of) SPGT. Additionally, I will also test Source 1b + Source 2. It should give similar results, as SPGT (and modern Indians as well) are ultimately an admixture between these 2 broad components.


SPGT DATES Results




SPGT Dates
DATES results for SPGT with a) Steppe_MLBA and b) Yamnaya


Summarized Results for SPGT



As done by Narasimhan et al 2019, the number of generations is multiplied by 28 years per generation to convert it into actual numerical dates. The average date of the SPGT cluster is taken as 919 BCE as also done by Narasimhan et al, 2019. To convert the output into actual 95% confidence interval dates, the formula used is

Older bound: [Avg Sample date - 28 x (Mean Gens + 2 x Std Error)]
Younger Bound: [Avg Sample date - 28 x (Mean Gens - 2 x Std Error)]
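The two bounds above can be written as a small helper. This is a sketch under stated assumptions: years are signed so that BCE is negative (which is what makes "sample date minus years" move further into the past), and the input of 20 ± 2 generations is a hypothetical placeholder, not a result from any run in this post.

```python
YEARS_PER_GENERATION = 28  # generation time used in the post


def admixture_date_ci(sample_year, mean_gens, std_err_gens):
    """Return (older_bound, mean_date, younger_bound) as signed years; BCE is negative."""
    older = sample_year - YEARS_PER_GENERATION * (mean_gens + 2 * std_err_gens)
    mean = sample_year - YEARS_PER_GENERATION * mean_gens
    younger = sample_year - YEARS_PER_GENERATION * (mean_gens - 2 * std_err_gens)
    return older, mean, younger


def fmt(year):
    """Format a signed year as BCE/CE."""
    return f"{abs(round(year))} {'BCE' if year < 0 else 'CE'}"


# Hypothetical DATES output (20 ± 2 generations) for a target dated to 919 BCE.
older, mean, younger = admixture_date_ci(-919, 20.0, 2.0)
print(fmt(older), "|", fmt(mean), "|", fmt(younger))
```

With these placeholder inputs the range runs from 1591 BCE to 1367 BCE around a mean of 1479 BCE.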

 

SPGT summary
Summarized Output for SPGT


The results of both runs match the results of Narasimhan et al 2019. Therefore, we shall use the same setup for further runs on other targets.


Note a few things:

1. The noise is extremely low, and the decay curve is a proper fit rather than a force fit within the noise. This is also reflected by the highly significant Z-score of ~12. Z-score is calculated as Mean/Std Error.

2. This is made possible by the high number of samples in all 3 labels. The main objective while using DATES is always to reduce the std errors. Even a low std error like 2.4 generations ends up in a 270-year range for 95% CI.

3. Compare the randomly scattered curve for Loebanr_o to these elegant, almost noiseless, decay curves. This is how we get statistical significance.
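The 270-year figure in note 2 follows directly from the 28-year generation time. A minimal sketch (the helper names are mine; the 100-year target is just an illustration):

```python
YEARS_PER_GENERATION = 28  # generation time used in the post


def ci95_width_years(std_err_gens):
    """Total width of the mean ± 2 std error range, converted to years."""
    return 4 * std_err_gens * YEARS_PER_GENERATION


def std_err_for_width(width_years):
    """Std error (in generations) needed to reach a given 95% range width."""
    return width_years / (4 * YEARS_PER_GENERATION)


print(round(ci95_width_years(2.4), 1))     # std error of 2.4 generations
print(round(std_err_for_width(100), 2))    # std error needed for a 100-year range
```

A std error of 2.4 generations gives a range of about 269 years, and narrowing the range to a century would need a std error below one generation, which is why large sample sizes in every label matter so much.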


SPGT_o DATES results



We now come to the heart of the matter. SPGT_o is the label for Loebanr_o.
The results are below.

SPGT_o curve
Decay curves for SPGT_o. SouthIndian.SG with a) Steppe_MLBA b) Yamnaya


Additionally, I also test with SPGT as the local source.

SPGTo dates with spgt
Decay curve for SPGT_o with a) Steppe_MLBA and b) Yamnaya



Summarized Results of DATES for SPGT_o


SPGT_o Summary DATES
Summary of DATES results for SPGT_o


A few notes:

1. The Z-scores are low, and the results are not statistically significant.

2. The decay scatter plot is very noisy, as reflected in the low Z-scores. This is expected when the target is a single sample.

3. Regardless, the younger bound of the 95% CI range itself is very close to the sample date (1006, 850, 951, and 830 BCE). This does not support Ummon's claim of the 13th century BCE being the latest date of admixture.

4. The DATES result given by Ummon (21.9 ± 8.8 generations, which has a Z-score of 2.49, ie. very low significance) was not replicated, although my parameters and samples yielded almost identical results for SPGT as Narasimhan et al. He has not made his sample list and parameters public, so that is not my problem.
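For reference, here is what Ummon's reported figure of 21.9 ± 8.8 generations looks like at 1 and at 2 std errors. This is a sketch that assumes a sample date of 920 BCE and 28 years per generation, the values used elsewhere in this post; his own settings are not public, so these are my assumptions and not his calculation.

```python
YEARS_PER_GENERATION = 28   # assumption: same generation time as the rest of the post
SAMPLE_YEAR = -920          # assumption: Loebanr_o at ~920 BCE; BCE is negative


def date_range(mean_gens, std_err_gens, k):
    """Return (older, younger) signed years for a mean ± k std error range."""
    older = SAMPLE_YEAR - YEARS_PER_GENERATION * (mean_gens + k * std_err_gens)
    younger = SAMPLE_YEAR - YEARS_PER_GENERATION * (mean_gens - k * std_err_gens)
    return older, younger


mean_gens, std_err_gens = 21.9, 8.8
print("Z-score:", round(mean_gens / std_err_gens, 2))
for k in (1, 2):
    older, younger = date_range(mean_gens, std_err_gens, k)
    print(f"±{k} std error: {round(-older)} BCE to {round(-younger)} BCE")
```

Under these assumptions the 1 std error range ends around 1287 BCE, consistent with a "13th century BCE at the latest" reading, while the 2 std error range extends to around 1040 BCE, much closer to the sample date.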


The Final Blow



I have edited out tweet 13 because it was not relevant to what I wanted to address. You may check out his tweet thread yourself, the link is given at the beginning of this post.

Here he doubles down on the high steppe presence in Haryana/Punjab by ~1500 BCE at the latest. He believes that these Haryanvis migrated to Swat, with Loebanr_o being one example.

Ummon thread


Since he has not given any archaeological evidence to support his claim, he must have support from genetics, right? No, he does not.


Narasimhan ALDER
Narasimhan et al 2019 show a late admixture date for most Punjab groups.



The published data shows a late admixture date for communities from Punjab. The 95% CI admixture date is after 450 BCE. Narasimhan et al used ALDER, a tool with a similar purpose to DATES, but one that allows 3 sources to be fed in instead of 2 as in DATES. The drawback is that the output given here is an average of the 2 admixture events between 3 sources - Steppe, IndusPeriphery and Onge. So an old admixture date between IndusPeriphery and Onge may pull the ALDER output towards showing an older date.

The west Eurasian admixture date for Rors (from Haryana, one of the highest steppe groups in India and Pakistan) as determined by Pathak et al, 2018 was 1500 years ago, ie ~500 CE, although the Z-scores were only moderate, in the range of 4-6, as per Table S18 of Pathak et al.

Pathak et al use ALDER and also show a very recent admixture date for Gujjars, 15 ± 3 gens ago, but a very old date for Kamboj, 120 ± 20 gens. One run also shows 80 ± 20 gens for Kamboj.

Something seems very off for the Gujjar result, and my own results from the next section don't match the Gujjar results of this paper.

DATES to study admixture timing in ancestors of modern Indians


I now have the opportunity to use DATES on the 44 samples from Pathak et al, 2018. Of these, 15 each are Ror and Gujjar, and 14 are Kamboj. I will also use DATES on all 44 together as the label 'NWIndian'.

I will also use the model to find admixture dates for 18 UP Brahmins, 10 are from Mondal et al 2016 (UBR.SG in Harvard dataset v54.1) and 8 from Metspalu et al 2011. I will also attempt it on 2 Tamil_Brahmins from Metspalu et al 2011.

Finally, I will use DATES on all these 64 samples under a common label 'IndianAll'.

Ror Kamboj Gujjar dates
Dates decay curve for Ror, Gujjar, Kamboj with Yamnaya & Steppe_MLBA


Tamil & UP Brahmin DATES
DATES decay curves for TN & UP Brahmins - with Steppe_MLBA and Yamnaya



Dated decay for NWindians and AllIndians
DATES decay curves for 44 NW Indians and 62 All Indians


DATES results summary for modern Indians


DATES summary for modern Indians
The reliable results indicate a post-1000BCE admixture of steppe ancestry


Addendum:


I further tested multiple other labels. The summarized outputs of those are presented below.

DATES on modern indians
Other targets also show late steppe admixture





Notes:

1. The noisiest and least significant results are for Tamil Brahmins. Without a doubt, this is because there are only 2 samples, so the power is low. Even so, it still has a slightly better Z-score of >3 than the SPGT_o models (Z-scores around 2 or below 2).

2. All the tests with 'Very High' Z-scores date the admixture event between ~1000 BCE and 0 CE. This is perfectly in line with all the articles that I have written so far. Rest assured, I had never run this test on modern Indians before and did not expect these to line up so perfectly with my views.

3. The presence of high steppe ancestry, and maybe even low ancestry, in NW India before 1000 BCE can be squarely rejected.

4. The dates also match the general finding of Moorjani et al 2013:
"Second, the date estimates are typically more recent in Indo-Europeans (average of 72 generations) compared to Dravidians (108 generations)"
This paper also finds the following admixture dates:

1. UP Kshatriyas1 at 78 ± 9 gens

2. UP Kshatriyas2 at 76 ± 10 gens

3. UP Brahmins1 at 86 ± 7 gens

4. UP Brahmins2 at 65 ± 9 gens

These dates are completely in line with mine, both the means and std errors are in a similar range. Priya Moorjani is the head of Moorjani Labs at UC Berkeley which has created these admixture dating tools such as DATES, ASCEND and Rolloff. So, it's heartening to see my conclusions matching hers.

 

5. The mean date of steppe ancestry entry into ancestors of modern Indians is around 600 BCE, and they were probably Iranian-speaking Śāka initially. During Achaemenid rule? Interestingly, Olivieri and Iori (2021) find a break in the SPGT culture post 800 BCE at Barikot and Aligrama and find 'local production of Iranic shapes':

"The decline of this regional identity occurred in Swat during the so-called ‘period VIII’. Firstly, ‘period VIII’ can be now more confidently dated between 800 and 600 BCE."
"The question of the early ‘Indic influence’ at Charsadda I was postulated mostly on the basis of the presence of the ‘dishes with incurved sides’ in grey ware and the ‘carinated bowls’, shallow bowls with carinated sides and S-shaped rim. Both forms, on the basis of the data from Swat (‘period VIII’), can be considered rather local forms, or local production of Iranic shapes, but certainly not Indic."
Pottery also vanishes between 700 and 500 BCE, and Iranic elements appear:

"While almost no pottery evidence was recovered from the layers dated between the 7th and 6th century BCE, the rich ceramic material ascribable to the occupation phase dated from the early 5th to 4th century BCE represents a very distinctive ceramic assemblage which features elements of both Iranic and Indic origin. Interestingly, both fabric and temper suggest that the new pottery assemblage was largely produced in Swat. That means that shapes and functions might have been imported from outside Swat, but not the vessels (see Olivieri 2018b)."
In my first post on this topic, I have given other evidence - the appearance of Yaz culture materials at Bannu between 900-600 BCE, and Panini's work (~500 BCE) mentioning towns with Iranian suffixes in Punjab.

Post this, the Achaemenid, Greek and Śāka rule in NW India is well known to students of history.



Some Caveats



There are some caveats which have been noted in Moorjani et al 2013, and I shall list them again here to provide the full picture. In my personal opinion, they need not be taken too seriously.

1. There may have been two steppe inflows deeper into India, with the second inflow dominating the signal and pushing the DATES output younger. The counter to this point is that all NW Indians and UP Brahmins get the same admixture date, about 85 gens ago within the std errors. So different pulses don't seem to be related to caste status, and these groups seem to be from the same 'stock', so to say.

2. Steppe groups may have been present in India without admixing with locals until the mid-Iron age, which would explain the younger admixture dates. This seems like a reach, and there is no archaeological evidence of this presence.




QUICK RESPONSE TO THE REST OF UMMON'S POINTS



3) - Yes, a superior intermediary source. But in 2019 after Narasimhan et al, Central_Steppe_MLBA was considered to be the direct intermediary. Though not wrong completely, that was not the direct intermediary. My previous post narrowed the proximal source down to TKM_IA, still a non-Indian sample. Then, with his help, we narrowed it down further to a population similar to Loebanr_o, at the doorstep of India!

In essence, we probably have figured out the whole chain of transmission of steppe ancestry into India

Steppe_MLBA > TKM_IA 'like' > Loebanr_o 'like' > Modern Indians

4) - No, my theory does not hinge on Loebanr_o being a child of a TKM_IA 'like' mother and a Loebanr_IA 'like' father, although that scenario has a high likelihood of being true. Even if that weren't true, and the steppe admixture in this woman was from 1300 BCE, that still wouldn't place the source in Punjab/Haryana. My theory hinges on the fact that the steppe intermediary in Indians is only seen in the record post 950 BCE, which matches the archaeological evidence of contact with the Iranians starting 900 BCE and continuing on till the Achaemenid, Greek, and Śāka periods.

She was a near-perfect 50/50 admixture of Loebanr_IA and TKM_IA. Such a profile is hardly possible if this ancestry persisted for a long time in India. At the very least, her ancestors did not stay at Loebanr for generations, because the ancestry would have been diluted away and become like Loebanr_IA rather quickly (5 gens). Bear in mind that she was 1 of 101 samples; the other 100 were not steppe/TKM_IA enriched like her. I have already proved that steppe ancestry is a late entry into India, so the deep Indian origin is impossible, and also proved that his DATES result is noisy and not significant. Given these facts, the only possible solution is that she was a child of a 1st/2nd/3rd generation migrant from the north. I favour 1st because of the 50/50 admixture coefficients.
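The "5 gens" dilution can be illustrated with a simple halving model. This is a back-of-the-envelope sketch, assuming a single fully TKM_IA-like migrant whose descendants pair only with unadmixed Loebanr_IA-like locals in every generation; real ancestry proportions vary around these expected values.

```python
from fractions import Fraction


def migrant_ancestry(generations_since_migrant):
    """Expected migrant-derived ancestry after repeated mating with unadmixed locals.

    Assumption: generation 0 is the migrant (100%), and every later partner
    carries none of the migrant's ancestry, so the share halves each generation.
    """
    return Fraction(1, 2 ** generations_since_migrant)


for g in range(6):
    share = migrant_ancestry(g)
    print(f"generation {g}: {share} of ancestry from the migrant")
```

Under this model a 50/50 profile corresponds to the first generation after the migrant, and by the fifth generation the expected share is 1/32, roughly 3%.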


There is another reason to believe that Loebanr_o was not born of the same steppe impulse that gave birth to the Loebanr_IA cluster. In the PCA below (used in the previous post), the pink cline intersects the BMAC - Sintashta cline at a point which is higher in BMAC ancestry (maybe 70/30 BMAC/Sintashta). However, Loebanr_o falls on a completely different cline: the Loebanr_IA - Loebanr_o - TKM_IA cline. This intersects the BMAC - Sintashta cline at TKM_IA, ie ~43/57 BMAC/Steppe. The higher steppe/BMAC ratio is a clear indication of a later source.


PCA
PCA of Swat samples




Ummon thread 4


17) to 19) are not correct. There is no issue with using modern populations as references, provided that the target does not contribute gene flow to the modern references. Both the Irula.DG and Roopkund_Pallanlike labels (sample IDs I6942 and I6946) harbour zero to a minuscule amount of steppe ancestry, which is why they were chosen as local pre-steppe sources in the first place. So Loebanr_o is not a source of their ancestry. Even gene flow from sources to references is acceptable (Harney et al 2021).

Nevertheless, it is easy to prove that Loebanr_o = Loebanr_IA + TKM_IA is a valid model. Here is the qpAdm result after removing Irula and Roopkund_Pallanlike from the right populations. The p-value improves to 0.778.


qpAdm model for Loebanr_o



CONCLUSION



The archaeological evidence, along with admixture dating of modern Indians, strongly favours a post-1000 BCE movement of steppe-rich people into India. Literary and epigraphic evidence also points to increasing contact with the Iranians after 900 BCE, followed by Achaemenid rule and Indo-Śāka kings. Given the dearth of the steppe-related Y haplogroup R1a-Z2124 in eastern/central India, it is unlikely that this was heavily male-mediated. R1a-Z2124 is present in a comparatively higher proportion in NW India than in the rest of India; however, it is not clear if this increase was due to later Islamic Turkic invasions (Pashtuns have a high % of R1a-Z2124). The Loebanr_o woman likely represents the first wave of such Iranians into India. It is for this reason that extremely few such samples are seen in the data. These Iranian migrations cannot be a source of Vedic language and culture in the Indian subcontinent.


MORE READING



There exists other evidence as to why steppe ancestry could not have carried Indo-Iranian languages.

1. Sintashta-related steppe ancestry is not present in Western Iran when it was supposed to have become Iranian speaking by 1000 BCE. Neither is that ancestry present in Syria/Iraq around 1600 BCE when the Mitanni/Kassites were invoking Vedic Gods and assuming Sanskritic names. Read.

2. Sintashta steppe-related R1a-Z2124 is present only in ~5% of modern Indians. Steppe R1a's impact on India is overstated. Indian R1a-L657 has not been found in any ancient sample from the steppe bronze age, and nor is it present in any appreciable frequency outside the Indian subcontinent. Read 'R1a Explained'
 
3. Part 1, Part 2 and this part prove that steppe ancestry entered the ancestors of modern Indians post-1000 BCE. This rules out steppe ancestry as a vector of Indo-Aryan languages into India.

Then what explains the connection of Indo-Iranian languages with European, Anatolian and Armenian languages?

The region from Armenia to South/Central Asia around ~5000 BCE is turning out to be the best contender for the earliest Proto-Indo-European (PIE) region, although narrowing the region down needs more work. Read about the southern sources of the steppe ancestry which eventually spread IE languages to Europe. These southern sources likely carried IE to the steppe initially around 4500 BCE.




SUPPLEMENTARY FOLDER, TOOLS, METHODS



Results for all DATES models, parameter files, Sample list, and one qpAdm result file are available in this folder.

The executable and source code for DATES will be available on GitHub: https://github.com/MoorjaniLab/DATES_v4010 

The executable and source code for Admixtools 1 (for qpAdm) will be available on 

1240k dataset of ancient and modern individuals is available for download at

Tamil Brahmin and UP Brahmin samples are available at evolbio.ut.ee (Metspalu et al 2011). 504k SNPs were merged into the Harvard 1240k SNP dataset

NW Indian (Ror, Kamboj and Gujjar) are available at evolbio.ut.ee (Pathak et al 2018). 311k SNPs were merged into the Harvard 1240k SNP dataset.

Only the Harvard v54.1 1240k SNPs from the above two datasets were extracted (overlapping SNPs) using plink 2.0 and the 2 datasets were merged into the Harvard v54.1 dataset using plink 1.9.

Plink is available for download here https://zzz.bwh.harvard.edu/plink/plink2.shtml



REFERENCES



Ashish, K. (2022). R1a Explained. The Archaeogenetics Blog. https://a-genetics.blogspot.com/2022/10/r1a-explained.html

Ashish, K. (2022). The true source of the steppe ancestry in modern Indians. The Archaeogenetics Blog. https://a-genetics.blogspot.com/2022/12/steppe-source-in-indians.html

Ashish, K. (2022). The true source of the Steppe ancestry in modern Indians (continued). The Archaeogenetics Blog. https://a-genetics.blogspot.com/2022/12/steppe-source-indians-part2.html

Chintalapati, Manjusha, Nick Patterson, and Priya Moorjani. "The spatiotemporal patterns of major human admixture events during the European Holocene." eLife 11 (2022): e77625.

Harney, Eadaoin, et al. "Assessing the performance of qpAdm: a statistical tool for studying population admixture." Genetics 217.4 (2021): iyaa045.

Metspalu, Mait, et al. "Shared and unique components of human population structure and genome-wide signals of positive selection in South Asia." The American Journal of Human Genetics 89.6 (2011): 731-744.

Mondal, Mayukh, et al. "Genomic analysis of Andamanese provides insights into ancient human migration into Asia and adaptation." Nature Genetics 48.9 (2016): 1066-1070.

Moorjani, Priya, et al. "Genetic evidence for recent population mixture in India." The American Journal of Human Genetics 93.3 (2013): 422-438.

Narasimhan, Vagheesh M., et al. "The formation of human populations in South and Central Asia." Science 365.6457 (2019): eaat7487.

Olivieri, Luca M., and Elisa Iori. "Patterns of Early Urbanisation in Swat: A Reassessment of the Data from the Recent Excavations at Barikot." Ancient Pakistan 32 (2021): 33-55.

Pathak, Ajai K., et al. "The genetic ancestry of modern Indus valley populations from northwest India." The American Journal of Human Genetics 103.6 (2018): 918-929.




29 comments:

vAsiSTha said...


Lol. I have my personal troll and hater, first to comment on everything. I live rent free in your head.

We do have Saka related Z2124, we do see Saka toponyms on land (recorded by Panini). We clearly have TKM_IA related ancestry.


"Also how come there is virtually 0 East Iranic substratum in the Vedas , but the Vedas still has a massive shared vocabulary with Avestan ?"

Because the Vedas and Avesta are much older than the post 1000bce Saka entry, duh.

You have nothing, prove my DATES models wrong if you can. Bye bye.

vAsiSTha said...

Mean: 0 std error 787 years. What is this? Ahahahahah

Z-score of 0. Ahahahaha

Orpheus said...

@vAsiSTha regarding steppe in Greece circa 2500BCE and IE. If the steppe source of these samples is indeed steppe EMBA-related and prefers a Yamnaya-related source over a CWC-related one (MLNBA, europe_LNBA etc) then they could have been proto-Greek speakers or Graeco-Phrygian/proto-Graeco-Phrygian (that later developed into Greek). I doubt they were balkanic or proto-balkanic speakers without any splitting in the language. They are pretty early though, which means the initial ~100% Yamnaya-related people and the language they spoke are even older. If the IE tree in Olander 2022 gets dated in the future, there's a chance that they would be too early for Greek or Graeco-Phrygian even if they were Yamnaya-related. That being said they explain the Greek-speaking Mycenaeans better than the alternate scenario of some Minoans/Helladics presumably importing steppe-rich wives.

This would also completely destroy the theory that Mycenaeans had anything to do with Catacomb though, since they would be a local development, absorbing the steppe from 2000-2500BCE which was already diluted. The near eastern (anatolian?) geneflow would have arrived later on, postdating the arrival of steppe ancestry.

It would be interesting to see some qpAdm models not just on the EBA samples but also on Mycenaeans using these samples, and comparing them to the models from Clemente et al 2021 (supplementals). They would also have to be more or less close to the distal models of Lazaridis et al 2022 for Mycenaeans when it comes to their components.

vAsiSTha said...

Greece is very confusing to me now.
Another problem is that this catacomb ancestry is also seen in Georgia bronze/iron age (i forget). But Georgia doesn't have IE language, it's also much closer to catacomb.

Giacomo Benedetti said...

I think even if they are in Georgia, they could be Armenians, later replaced by Kartvelian speakers, or mixing with Kartvelian speakers. About Shakas, I think 600 BCE is too early, they are not mentioned in Indian sources at that time, they must have been Kambojas, already mentioned in Yaska's Nirukta, and you have also analyzed Kamboj people with ancient admixture. https://en.wikipedia.org/wiki/Kambojas

vAsiSTha said...

Hmm. From wiki

"The ancient Kambojas were likely of Indo-Iranian origin.[77] They are sometimes specifically described as Indo-Aryans[15][78][55] and sometimes as having both Indian and Iranian affinities.[79][80][81] The Kambojas are also described as a royal clan of the Sakas.[82]"

Saka affinity would be there. Similarly, Massegetae would not be Saka for Greeks, but maybe they were for Indians.

vAsiSTha said...

"I think even if they are in Georgia, they could be Armenians, later replaced by Kartvelian speakers, or mixing with Kartvelian speakers. " Possible.

Orpheus said...

Oh btw there are more reasons Sintashta or any CWC-related ancestry is unable to be I-Ir.

In Olander 2022 we see I-Ir splitting early, before Sintashta even existed. On top of that, it has no connection to CWC-related languages. This is pretty much a done deal in my book. The I-Ir source is either a Yamnaya-related one (this also explains the Greco-Aryan hypothesis, even if the similarities are not enough to form an actual branch) or something unrelated to steppe ancestry altogether, presumably south of the Yamnaya, who could be only one group of PIE speakers among other PIE groups with no steppe ancestry, after PIE split from PIA (Indo-Anatolian).

vAsiSTha said...

Lol I don't trust their judgement. They say IIr has no direct relation with Balto-Slavic (Kummel, Pronk). They also say IIr has archaisms which would make an early split possible (Kummel). They also say that IIr did not undergo the common neolithization of vocabulary like all Euro languages did, under the influence of Trypillia (Kroonen et al).

But when you propose that Sintashta was not IIr because of all these reasons, they have 0 conviction. The herd mentality is too strong.

Freakk said...

Go back to 4chan and cry about soultreans and hyperboreans there.

Daily remainer,even lower caste indians have more ANE(hyperborean) Ancestry than europeans with most steppe Ancestry.


You will never be a hyperborean or Aryan.
Keep crying

Freakk said...

Reminder*

Freakk said...

I think Indo-Iranian split from the same common source as Indo-Anatolian did (Iran_N/CHG without EHG ancestry).
Then it travelled from Northwest Iran/East Anatolia to the Afghanistan, Central Asia and Pakistan region.

This would explain the old dating of the Vedas and many other issues.
And it's most likely that Indo-Iranian developed in Shahr-i Sokhta, BMAC and other nearby areas.

Giacomo Benedetti said...

You have cited from wiki: "The Kambojas are also described as a royal clan of the Sakas." The source is this:

"French Indianist Alfred Foucher said that the Kohistan, a mountainous area near Kabul might be the land of the Kambojas, of which we know very little, except that they were more Iranian than Indian and raised fine horses. It seems from some inscriptions that they were a royal clan of the Sakas better known under the Greek name of Scyths.49

Historians tend to believe Kambojas were in fact an Iranian tribe. (Old Iranian and old Sanskrit are very close languages. All these people called themselves Aryan, from which comes the name Iran). Panini, the Indian genius of grammar, observed50 that the word kamboja meant at the same time the tribe and its king. Later historians identified the same word in the name of several great Persian kings, Cambyse (Greek version) or Kambujiya (in Persian)." The inscriptions mentioned are certainly late, of the Indo-Scythian period (150 BCE-400 CE). Sakas arrived in Bactria and South Asia only in the 2nd cent. BCE, pushed by the Yuezhi (https://en.wikipedia.org/wiki/Saka#Migrations). Kambojas instead are attested in Nirukta and Vamsha Brahmana (https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=VEI&page=1-138), in Pali canon (https://www.palikanon.com/english/pali_names/ka/kamboja.htm), and in Ashoka edicts. In epics and Puranas, Kambojas are often mentioned with Sakas and Yavanas, but distinguished.

Now, Ka(m)bujiya was the Persian name of Cambyses, the father of Cyrus the Great, the founder of the Persian empire. So, we can suppose that at least some Persians had a Kamboja origin, which could imply a steppe admixture, if we consider Kambojas an Iranic people coming from Central Asia during the first mill. BC, but before the Sakas.

vAsiSTha said...

"And do you really think a bunch of Iron Age Sakas explains the Steppe ancestry in India, then please explain why after 1500 BC there was a change in language and customs and culture in Northern India, "

What change in language lol? Do you have some book from that time period?
What change in culture? Cite sources

vAsiSTha said...

@giacomo

Right, that would make Kamboja a good choice as a vector of this ancestry in Indians. Interestingly, if you see the DATES result for Kamboj, they have slightly older dates for the steppe admixture as compared to Ror and Gujjar.

But to defend the words that I have used, Saka or proto-Saka for them, the ancestry that they brought was certainly the one which became ubiquitous in Sogdians and later Kangju people, and now continues in the Yagnobi.

Maybe the Greeks would not have used the word Scythians for the Sogdians, but at least the Indians of the time considered the people from that region as Saka.

mzp1 said...

@Benedetti,

Why do we need anyone to come from Steppe/Central Asia?

Iranian plateau and South Central Asia (Tajikistan, Nuristan etc) share a common geography and hence the Iran N component and also ancient IIr nomads like Kamboja and Sarmatians moved around this region.

The Rigveda mentions the Sarayu as west of the Indus:

"The river is mentioned three times in the Rigveda. The banks of the Sarayu are the location of the slaying of two Aryas at the hands of Indra in RV 4.30.18. It is listed together with western tributaries to the Indus: Rasā, Anitabha, Kubha, Krumu and the Sindhu itself as obstacles crossed by the Maruts in RV 5.53.9. In this verse, Purisini appears as its epithet. At this stage of the earlier Rigveda, it apparently was a river west of the Indus system that corresponds to Iranian Harayu (Avestan acc. Harōiium, Old Persian Haraiva, modern Harē or Harī), the Hari River. It is invoked together with Sindhu and Sarasvati (two of the most prominent Rigvedic rivers) in the late hymn RV 10.64."

https://www.vyasaonline.com/encyclopedia/sarayu/

Sarayu may also be cognate of Sarmatian.

Then the Aryans were further West than just India, maybe around Helmand. From there the early Aryans migrated to UP and also BMAC region.

In later times we have people like Kambojas as a remainder of these Central Asian/Iranian plateau nomads.

Giacomo Benedetti said...

@vasistha

It is also interesting that a Kāmboja Aupamanyava was in a Brahmanical Vamsha, it suggests that Kambojas could also be accepted as Brahmins.
Kambojas in the epics and Pali canon are strictly associated with Yavanas, often in compound, and even their way of shaving the head appeared similar, since they are both described as muṇḍa. Sakas instead were known as half-shaven (https://www.sacred-texts.com/hin/vp/vp095.htm#fr_986).
It is interesting that Cambyses I was born around 600 BCE, so the time of the spread of Kambojas could be that, and it is also a possible date of Yaska's Nirukta.

Giacomo Benedetti said...

@mzp1

I accept that Iranic speakers were originally from Iran and SC Asia, and nomads were a mix of those southern Iranians and northern steppe people, who also moved to the south from the 2nd mill. BCE. So, I suggest that Kambojas were a central Asian tribe with steppe admixture who were during the 1st mill. BCE in Afghanistan and then moved to South Asia and at the same time to the west where we found their name among the Persians. BTW, when do you place the arrival of Aryans in India?

As to the Sarayu river, in 5.53.9 it is possible that first are mentioned the westernmost rivers, then the easternmost Sarayu, with "a reference to the north-east monsoon as well as to the usual monsoon from the west", as Zimmer said (https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=VEI&page=2-433), since it speaks of the Maruts, the wind gods.
In 4.30.18, the two Aryas Arṇa and Citraratha can be Indo-Aryan kings pushed beyond the Sarayu: a Citraratha is in the eastern Anava genealogy, the dynasty of Anga beyond the Sarayu.

mzp1 said...

@Giacomo,

Well, it's tough for me to try and explain what I think is happening, but here is my effort.

PIE is very very ancient. Like pre-Neolithic. The most important aspect is that it is pre-Neolithic.

Before we see the West Asian and Anatolian Neolithic, before this culture existed, all these people were nomads of the IE type. IE Cows, Wheels, Horses, metallurgy, fire, hymn songs.

It's easy to understand if you know that Cow domestication pre-dates the Neolithic. Cows are domesticated around the Nile and Indus/Gujarat region before we see any sign of the Neolithic. This was done by IE Nomads prior to them becoming settled Farmers.

That means the whole region between Gujarat and Nile was occupied by IE Cow-Herders lets say around 15K BC.

Over time there are advancements in Animal and plant domestication. This allows those nomadic tribes to settle into farming.


The first place this happens is Iran, India and West Asia. In Iran over time populations become more and more settled and political power shifts from nomads to farmers. The nomads get pushed to the fringes of power and get pushed out to Central Asia, and remain in places where farming is less tenable like Sindh.

This takes us to around 5,000BC. Dasa are the farmers and Aryans are the nomads based around Sindh. Dasa farmers are in India and Iran.

From the hymns above where Sarayu is in the West, Book 4 Hymn 30
"14 Thou, Indra, also smotest down Kulitara's son Śambara,
The Dāsa, from the lofty hill."


Dasas are always associated with Hills and Mountains. Aryans are low-land river people.

IE is in two branches. Mountain Zeus Worship and low-land Indra/Thor worship. The former are the farmers ie Greeks, Armenians, (iranians??) and the latter are the most recent nomads.

Pops are going from Nomadic low-land to High-land farmers. The farmers were the most ancient nomads; the most recent nomads have not yet transitioned to farming, and they are genetically closer to Indian tribals due to tribal DNA moving North from Gujarat via people like Kutchi nomads and mixing with Sindh pops like Ror, who then mix with Nomads further North into Steppe. There is no gene flow from Indian tribals to farmers; it is tribals to nomads to farmers. Hence Steppe DNA is closer to Onge than any West Asian farmer.

Thus in the Rigveda, Dasa are always associated with mountains and hills. In west Iran the Neolithic farming states were centred around hill-forts built on hills and mountains. Thus the Vedic Aryans talk of destroying those hill-forts where the Non-Aryans lived. These Dasa farming hill-forts could be in Iran and even Western and Northern India, cos it's the same technology and way of life.

So these tribes like Sarmatians and Kambojas, they are just remnants of Iranian Plateau and Central Asian nomads, who are distinct and further South than Steppe EBA/MBA. Iran was full of these people 15K BC before farming became the main political power.

mzp1 said...

In short, the original IE peoples were nomads of India and Iran. When farming became possible due to advances in Animal and Plant domestication, these regions transitioned to settled farming, i.e. Civilization, where possible. Those regions where farming was less tenable, i.e. dry, steppe, flat-lands like Sindh and Northern Central Asia, were the last stronghold of this ancient IE nomadic culture.

The Rigveda is the last remnant of this ancient culture. These people exist in Southern parts of Punjab, Sindh and Eastern Iran. That region is a source of populations and places like Uttar Pradesh are a sink. Hence those nomads have to move from Sindh to Uttar Pradesh, as Iran and Northern Punjab are already full of ancient Dasa farmers.

So the last stronghold of the nomads in the South, is from eastern Iran ie Helmand region to Rajasthan. This is where the hymns of the rigveda place it. People are moving all over this region. It is just like one culture of Aryan Vedic nomads.

When this group finally takes up farming, it creates the Brahmanical system of Hinduism and the Zoroastrian system in BMAC. BMAC/Zoroastrianism and IVC/Hinduism both start at the same time and seem to be related to Aryan domination of South Central Asia. The Nomadic Aryans used their military and trading power, and their connections between Iran, South Asia and Central Asia, to create a new regional power that centralised regional power from the root, which was the Aryan Nomads. They were geographically more connected than anyone else: coming from Eastern Afghanistan and Northern India, they could connect everyone better.

Abhi said...

If you have read Vedveer Arya’s work, the Shaka era began in 583 BCE, as opposed to 78 CE. This could explain Proto-Scythian ancestry in India at the time.

Piyush said...

@mzp1

"PIE is very very ancient.The most important aspect is that it is pre-Neolithic.
Before we see the West Asian and Anatolian Neolithic, before this culture existed, all these people were nomads of the IE type. IE Cows, Wheels, Horses, metallurgy, fire, hymn songs."


But doesn't reconstructed PIE contain terms for agriculture? How can PIE be pre-Neolithic when it's supposed to have agricultural terms? Regarding metallurgy, again, evidence for metallurgy is much later, ~4000-3500 BCE, so why are you placing metallurgy before agriculture?


"That means the whole region between Gujarat and Nile was occupied by IE Cow-Herders lets say around 15K BC"

Cow domestication pre-LGM? Any evidence for that?


"Dasas are always associated with Hills and Mountains. Aryans are low-land river people."

That contradicts your farmer-Dasa and nomad-Aryan description, no? One would expect Dasas to be farming in the lowland plains.

"Thus in the Rigveda, dasa are always associated with mountains and hills. In west Iran the Neolithic farming states were centred around hill-forts built on hills and mountains. "

But as per the textual analysis of some of the earliest mandalas of the RgVeda by Talageri, the hymns were composed in the plains of Punjab & Haryana. So where did these highland, western-Iran-inhabiting Dasyus come from in the RgVeda?


"The Rigveda is the last remnant of this ancient culture. These people exist in Southern parts of Punjab, Sindh and Eastern Iran.
Hence those nomads have to move from Sindh to Uttar Pradesh as Iran and Northern Punjab is already full of ancient dasa farmers.
That region is a source of populations and places like Uttar Pradesh is a sink."


One would expect pastoralist nomads to move along river paths, but you seem to have them bypassing the Indus river network and instead crossing the Thar desert and the Aravalli hills to reach UP! What a bizarre and deadly path to take for those nomad pastoralists! This doesn't make sense. Also, the Gangetic plains region had perennial rivers, more abundant fauna and thus perhaps a higher population density (of hunter-gatherers) than, say, Sindh.


" last stronghold of the nomads in the South, is from eastern Iran ie Helmand region to Rajasthan. This is where the hymns of the rigveda place it."

IIRC, Talageri, based on textual analysis of the RgVeda, shows that the earliest mandalas were placed in the Punjab and Haryana plains. It's in the later mandalas that we see them moving to the eastern Afghan region.


Sorry if this seems harsh, but your overall theory seems very incoherent and doesn't make sense.

Piyush said...

@Ashish, Twitter user arya_amsha has modelled Loebanr_IA without BMAC and Katelai_IA with barely 2% BMAC here: https://vritrahan.blogspot.com/2022/12/the-swat-protohistoric-samples.html

It seems that he is using Indus_Gonur as the IVCp source (remember there were 3 IVCp samples found in the Gonur region). It seems that, using these, one doesn't need BMAC for these Swat IA clusters. What do you think?

Fluorophore123 said...

Ashish, Ummon Karpe/The Equationist has posted another rebuttal on Brown Pundits. I am curious to see your response. I have linked it below:

https://www.brownpundits.com/2023/01/11/showing-an-early-entry-of-steppe-ancestry-into-india/

Chaddest Maximus said...

Could these female Greek slaves during the Mauryan period have contributed any steppe? I even remember reading somewhere that Chanakya discouraged marrying non-Arya women. I thought you might find this interesting:
https://muse.jhu.edu/article/757117