abstract
Across West Asia, DNA has been sequenced from hundreds of ancient human remains spanning a wide range of time and space. In this study we use these ancient DNA signatures to shed light on the dispersal of Indo-European languages in West Asia.
Our latest genetics study paints a more complex population history for Western Iran than hitherto documented. Using formal statistical methods for bioinformatics analysis, we find evidence of a multi-phase Indo-Europeanization process spanning hundreds of years and commencing as early as the 2nd millennium BCE, as follows:
- 1st Indo-Europeanization phase; 4500 to 5000 years ago groups of proto-Indo-Europeans with male DNA haplogroup R-Z2103 (R1b1a1b1b) rapidly expanded over the steppes of Kazakhstan and Russia. Between 3000 & 3500 YBP Yamnaya related males invaded the Caucasus and started replacing the males with DNA haplogroups G, J, and L, which had hitherto dominated Armenia and Northwest Iran (Fig 11). Many of the haplogroup R1b1a1b1b 3000 year old ancient remains from Armenia and NW Iran were positive for the R-Z2106 -> R-M12149 mutation. The 2800 year old remains from Tepe-Hasanlu, Iran were predominantly haplogroup R1b1a1b1b, whereas the 500 year old and older males from the same region were predominantly haplogroups G, J, and L, and none had haplogroup R1b1a1b1b. This phase led to the Indo-European language currently spoken by Armenians.
- 2nd Indo-Europeanization phase; waves of Yamnaya related groups carrying male haplogroup R1b descended through the Caucasus and present-day Armenia, followed by late Iron-Age Indo-Iranians carrying haplogroup R1a, who included Parthians, and culminating with mixed Indo-Iranian-Turkic peoples under the Seljuk Sultanate.
We also find genetic evidence of a 2nd millennium BCE migration to Western Iran from the Bactria Margiana Archeological Complex (BMAC) preceding the Indo-European invasion accompanied by male haplogroup R1b from the Caucasus and present-day Armenia. These events would shape the material culture, language, and genetics of the pre-R1a haplogroup Central Asian Indo-Iranians. We see evidence of this in the genomes of 2800 year old skeletons excavated from the region of Mannea at Tepe-Hasanlu in northwestern Iran.
Later, starting around the year 1050 CE, various Turkic and Mongol tribes invaded present-day Iran, Iraq, and Turkey and ruled this area for the next 800 years. Via the demic diffusion model and hybridization with the locals, and via the Kurdification and Turkification process, these Turkic peoples left their legacy by transforming the genetic sub-structure of present-day West Asian populations such as Armenians, Kurds, Persians, Turks, and others.
From modest beginnings from the Oghuz State, Seljuk’s son, Musa and his nephews Tughril and Chaghri would later establish one of the most successful and long lasting Perso-Turkic Sultanates (Figs 1-2), which spanned over the ancient Sassanian domains, in Iran and Iraq, and included Anatolia, Syria, as well as parts of Central Asia and modern Afghanistan.


introduction
ARMENIA, NW IRAN, AND E TURKEY DURING THE BRONZE AGE
Using the framework of qpAdm, we identified three key populations that had a very large impact on the demography of today’s Armenians, Persians, Kurds, and Turks. They are:
- 3500 year-old Mannean samples from NW Iran identified in Lazaridis et al2 as Iran_DinkhaTepe_BIA_A;
- 2800 year-old Uartian samples from E Turkey identified in Lazaridis et al2 as Turkey_E_Van_Urartian;
- 4000 year-old Minoan samples from Greece identified in Lazaridis et al2 as Greece_Minoan_Lassithi (primarily for Turkish populations)
Our analysis further indicates that Uartians were almost a 50/50 mix of 3500 year-old Manneans from NW Iran and 4500 year-old Kura-Araxes culture from present-day Armenia. As per Lazaridis et al2 the Kura-Araxes population was primarily descended from Caucasus Hunter Gatherer (~75%) and Anatolian Neolithic Farmer related populations (~25%).
THE INDO-IRANIANS
Around 4000 years ago major changes were occurring in Central Asia that would later shape the culture, language, genetics and religion of Indic and Iranic populations from Central to West Asia. In this region encompassing the ancient Bactria-Margiana Archeological Complex (BMAC), the Indus-Valley (IVC) and Ariana, the Proto-Indo-Iranian Andronovo Steppe peoples (sometimes referred to as the “Aryans”) would merge with neighboring BMAC and IVC peoples to give birth to some of the ancestors of today’s Iranic and Indic peoples.
Around 3500 years ago the Iranian tribes emerged in the region of northwest Iran. These tribes expanded their control over large areas of western Iran. Iranian elements became significant in these regions around 2900 years ago.
Whereas the Andronovo and BMAC ancestors of today’s Iranics had a prolonged interaction, the same cannot be said for the ancestors of today’s Indics. For some reason the ancestors of present-day Indics had a significantly shorter period of interaction with the Oxus BMAC civilization. Instead, they split off from their early Indo-Iranian cousins and migrated towards the IVC region of Seven Rivers (Sanskrit: Sapta Sindhu, Pashto: Owa Sindhuna, Kurdi: Haft Sindh, Farsi: Haft Darya).
There the Rig-Vedic religion would diverge from its sister Iranic Zoroastrian religion. During that time, the Iranic Avesta and Indic Rig-Veda had a great overlap of religious deities and the Indic Sanskrit and Avestan Iranic languages were quite similar.
Our formal genetic analysis results point to a gradual transformation of the genetic substructure of western Iranians commencing about 3500 years ago by waves of invasive Indo-Iranians and Oghuz Turkic language speakers from the region around the Bactria-Margiana Archeological Complex (BMAC) in Central Asia. Over 3000 years these invaders genetically, culturally, and linguistically transformed the Pre-Indo-Iranian Assyrian and Uartian sphere of influence, culminating in the genetic substructure of present-day Kurds, Azeris, Armenians, Persians, Turks and other ethnic groups of the region.
The results indicate that this prolonged interaction produced early Indo-Iranians who fall genetically on a BMAC-Andronovo cline. At one end is the 2900 year old sample known as TKM-IA (49% – 51% BMAC – Andronovo, p-value 0.18); at the other are individuals with a more diluted Steppe ancestry (60% – 40% BMAC – Andronovo) who hybridized with the ancient Assyrians or Manneans of western Iran to create a majority of the ancestors of today’s Kurds.
Analysis results point to early Indo-Iranians forming as a result of admixture between the Middle-Late Bronze Age Andronovo Eurasian Steppe herders and their BMAC neighbors. These early Iranians can be represented by the 2900 year old genome known as TKM-IA which had a Y-DNA of R1a-Z93. Our formal analysis indicates that TKM-IA is a 51% – 49% mix of BMAC (represented by the samples UZB_Sappali_Tepe_BA) and Andronovo, respectively.
These results are consistent with the other scientific studies indicating a prolonged interaction between the Andronovo Steppe peoples with their neighbors of the BMAC. This interaction would ultimately give rise to the Indo-Iranian ancestors of today’s multiple ethnic groups of Iran and Afghanistan, who included the Medes, Parthians, Scythians, Saka, Sarmatians and others, which we collectively call the “Indo-Iranians”.
The Indo-Iranians would in time migrate out of their homeland region around the BMAC and ancient “Ariana” and invade large areas in Western Iran and West Asia. In Western Iran the earlier waves of Indo-Iranians such as the Medes would put an end to Assyrian rule, and hybridize with existing pre-Indo-Iranian peoples such as Manneans and descendants of the Chalcolithic Zagrosian Farmers to introduce the predecessor languages1 to today’s languages spoken in the region (Figs 3 – 5), as well as the Zoroastrian religion. We see evidence of these admixture events in the 2800 year old genomes recovered from Tepe-Hasanlu in Western Iran, where almost 50% of the individuals carry up to 6% Eastern Hunter Gatherer (EHG) admixture2, which was hitherto missing in the 3500 year old genomes from the Zagros region.
These admixture events would also introduce Indo-Iranian Y-DNA haplogroups R1a-Z93 & Z94 to all of Iran, India, and Afghanistan.



By utilizing the hundreds of ancient samples from the Anatolia, Armenia, Mesopotamia, and Persia regions, published in 2022 by Lazaridis, et al2, and our extensive analysis utilizing a framework of the formal statistical software qpWave and qpAdm we were able to determine with a high degree of certainty the genetic substructure of the early Indo-Iranian tribes known as the Medes and Parthians, as well as determine their genetic contributions to Iranic ethnic groups such as the Kurds and Persians.
THE TURKS
Around 1500 years ago Turkic speaking peoples were gradually streaming into most of Central Asia from their original homeland in the Altai mountains and gradually displaced or assimilated the Iranic Saka Scythians who had ruled Central Asia for the previous 1000 years.
Turkic Karakhanids and mixed Perso-Turkic Ghaznavids occupied Khorasan and Transoxiana upon the collapse of the Persianate Samanid dynasty 1000 years ago. To the north around present-day Turkmenistan and bordering the Ghaznavids and Karakhanids was the Oghuz Yabgu State.
Starting around the year 1050 CE and for the next 800 or so years the region encompassing present-day Iran and Kurdistan would be ruled by various Turkic dynasties and Mongolic dynasties such as the Ilkhanids and Timurids (Figs 6 & 7). Thus it is not surprising to see a genetic contribution of those groups in present-day Turks and Kurds.


From modest beginnings from the Oghuz State, Seljuk’s son, Musa and his nephews Tughril and Chaghri would later establish one of the most successful and long lasting Perso-Turkic Sultanates, which spanned over the ancient Sassanian domains, in Iran and Iraq, and included Anatolia, Syria, as well as parts of Central Asia and modern Afghanistan.
The success of the Seljuks can be attributed to fair treatment of the people they ruled over. Highly Persianized in culture with Persian as their second language, the Seljuks also played an important role in the development of the Turko-Persian tradition, even exporting Persian culture to Anatolia. Seljuks are remembered as great patrons of Persian culture, art, literature, and language.

The settlement of Turkic tribes in the NW of Iran and in Anatolia, for the strategic military purpose of fending off invasions from neighboring states, led to the progressive Turkicization of those areas.
Shortly thereafter, around 1071 CE, the Seljuks under Sultan Alp Arslan, along with 10,000 Kurdish horsemen in the Seljuk cavalry, defeated the Byzantines in Eastern Anatolia in the decisive battle of Manzikert.
The region of Iraq was under the control of the Seljuk Empire from 1055 to 1135, since the Oghuz Turk Tughril Beg had expelled the Shiite Buyid dynasty.
By 1081, Turkish tribes controlled most of the Anatolian plateau. The Anatolian Seljuk Sultanate is also known as the “Sultanate of Rum.” Through a process of Elite Dominance by the Seljuks and Ottomans the Turkish language was adopted by the various ethnicities inhabiting Anatolia.
Around 1150 Seljuk Sultan Ahmad Sanjar united the Kurdish principalities into the province of “Kurdistan” which included the provinces of Sinjar, Shahrazur, Hamadan, Dinawar, and Kermanshah. Thereafter, the various tribes falling under the Kurdish umbrella migrated westward from Persia to present-day Iraq and Turkey. This is consistent with the historical and genetic evidence8, where we fail to find any ancient samples from Anatolia and Iraq that form clades with present-day Kurds, pointing to no large earlier settlements of Kurds in those areas.
discussion & methodology
Using the Reich Lab 1240K Affymetrix genotyped datasets, we co-analyzed our study populations of Europeans and West & South Asians (STUDY) against target Turkic and Iranic populations (TARGET). Using the formal statistical programs qpWave and qpAdm from the Admixtools suite, along with a robust set of 24 diverse ancient Eurasian reference populations (pright) to distinguish between slightly differing streams of ancestry, we analyze the following:
- Genetic similarity or cladeliness between the Study & Target populations using qpWave results calibrated for the differences in sequencing and genotyping methods used on the individual genomes;
- Genetic modeling using qpAdm
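To give a feel for what the qpAdm modeling step estimates, the sketch below fits a two-way mixture weight by ordinary least squares, using the idea that a target's f4 statistics should be a weighted average of its sources' f4 statistics. It is a simplified illustration and not the Admixtools implementation: the real program weights the fit by the covariance of the statistics and reports a p-value. The function name and the f4 vectors are hypothetical and made up for the example.

```python
def two_way_weight(target, source_a, source_b):
    """Least-squares weight w such that target ~ w*source_a + (1 - w)*source_b."""
    diff = [a - b for a, b in zip(source_a, source_b)]
    resid = [t - b for t, b in zip(target, source_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("sources have identical f4 vectors; weight is undefined")
    return sum(r * d for r, d in zip(resid, diff)) / denom

# Hypothetical f4 vectors (one value per pair of reference populations).
f4_source_a = [0.0120, -0.0045, 0.0031, 0.0078]
f4_source_b = [0.0020, 0.0060, -0.0012, 0.0015]
# Build a synthetic target that is exactly 51% source A and 49% source B.
f4_target = [0.51 * a + 0.49 * b for a, b in zip(f4_source_a, f4_source_b)]

w = two_way_weight(f4_target, f4_source_a, f4_source_b)
print(f"source A: {w:.2%}, source B: {1 - w:.2%}")
```

With real data the target never lies exactly on the line between the sources, and the size of the leftover residual is what the model's p-value summarizes.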
Our Kayseri Turkish, Southwestern Iranian Persian, and Armenian STUDY samples were originally published as part of the Simons Genome Diversity Project. The Kurdish samples are from Iraqi Kurds. Considering that Kurds cover a very large geographical region from Turkmenistan west to Anatolia, considerable intra-population genetic diversity is to be expected amongst them, and thus the results from our Kurdish samples may not necessarily be representative of Kurds from Eastern Iran, Turkmenistan or Anatolia.
Similarly, our Kayseri Turkish samples may not be representative of Turks in Eastern and Western Turkey. Our study is limited by the samples available in the 1240K genotyped Reichlab dataset. Should additional 1240K genotyped Turkish, Kurdish, Persian, Armenian or Assyrian samples become available in the future, we will amend this study.
Kurdification & turkification
A complicating factor with the Kurdish samples is the “Kurdification” of many ethnic Armenians after converting to Islam9, while many ethnic Bulgarians10, Circassians11, Chechens12, Ingush12, and Ossetians were Kurdified after fleeing to the region and subsequently assimilating to the Kurdish culture and language.
Numerous Turk & Turkmen tribes have been Kurdified, and various Kurdish tribes Turkified during the past few centuries in Iraq, Iran, and Turkey. A few examples include the following:
- Karapapakhs; In West Azerbaijan, many Karapapakhs were Kurdified3.
- Küresünni Turks; In the southwest of Khoy, there are Kurdicized groups of Küresünni Turks4.
- Tilku Tribe; A group of Kurdicized Tilku Turks live around Santeh and Zagheh of Saqqez County5.
- Qaramanlu and Silsepuranlu Kurd tribes in Eastern Iran are of Turkoman origins13.
Many Khorasani Kurds are bilingual in Khorasani Turkic, mainly due to intermarriages with Khorasani Turks and Turkmen and there are numerous mixed Kurd-Turk-Turkmen tribes both in Iraq and in Iran (Fig 8).
A 2013 study estimates that in Turkey there are 2,708,000 marriages between Turks and Kurds/Zaza14. If we assume that 50% of the 15 million Kurdish citizens of Turkey are of marriage age, namely over 18 years old, this implies that about 1/3 of the adult Kurds in Turkey are married to Turks. A 2002 report showed that most mixed marriages happened in large cities and in areas where the spouses' own group formed a minority. In most mixed marriages, the men were Kurds while the women were Turks. Turks with lower level education were more open to marrying Kurds, while Kurds with higher level education were more open to marrying Turks15.
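The "about 1/3" figure can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the paragraph above and adds one labeled assumption: that each mixed marriage involves exactly one Kurdish spouse.

```python
mixed_marriages = 2_708_000      # Turk-Kurd/Zaza marriages (2013 estimate quoted above)
kurdish_citizens = 15_000_000    # Kurdish citizens of Turkey (figure used in the text)
adult_fraction = 0.50            # share assumed to be over 18 (assumption from the text)

adult_kurds = kurdish_citizens * adult_fraction
# Assumption: each mixed marriage involves exactly one Kurdish spouse.
share_married_to_turks = mixed_marriages / adult_kurds
print(f"{share_married_to_turks:.1%} of adult Kurds")
```

Under these assumptions the share comes out at roughly 36%, slightly above one third. The result is sensitive to the assumed adult fraction, so it should be read as an order-of-magnitude estimate.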
In light of the aforementioned, and considering that the ancestors of today’s Kurds, Turks, and Turkmen have geographically overlapped each other, and hybridized with each other ever since Turks assimilated Saka in Central Asia 2000 years ago, it is hardly surprising that Kurds, Turks, and Turkmen, all share “Iranic” and “Turkic” ancestors, and there are large overlaps in phenotypes amongst the three (Fig 10). The formal genetics analysis in this study is very much consistent with this reality.












Fig 10 – Large diversity of phenotypes among Kurds reflecting geography and admixtures with neighboring ethnic groups
Genetic similarity / cladeliness using qpWave
qpWave in cladeliness mode can be used to test whether an extant West Asian STUDY population forms a clade with an Iranic or Turkic TARGET population, both placed in pleft, by computing f4 statistics on the pleft populations against the reference populations in pright. In this mode the program generates a p-value for the “null hypothesis” that STUDY and TARGET form a clade with each other relative to pright; the higher the p-value, the more compatible the data are with a clade. In the extreme case that STUDY and TARGET are indistinguishable from each other, a p-value of 1 and a chisq value of 0 are generated. By contrast, if STUDY and TARGET are outgroups to each other, a p-value of 0 and a chisq of infinity are generated.
Utilizing the Harvard Reichlab “Southern Arc” dataset2, along with a diverse set of 24 Eurasian reference populations shown below (pright) to differentiate subtle differences in shared drift, we perform the formal qpWave6 and qpAdm6 analysis. We use qpWave with 2 populations in pleft to either accept or reject a null hypothesis, the hypothesis being that one extant STUDY population forms a very tight clade with one ancient TARGET population. We use the widely accepted p-value threshold of 0.05: when the p-value falls below 0.05 we reject the null hypothesis (no difference between the observed and expected values), since deviations that large would arise by chance less than 5% of the time if STUDY and TARGET truly formed a clade; otherwise we do not reject it.
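The link between the chisq value and the p-value can be sketched in a few lines. The code below computes the upper-tail probability of a chi-square distribution using the standard closed-form series for integer degrees of freedom, then applies the 0.05 threshold. The degrees of freedom are an assumption on our part: we take the usual rank-0 count of (left − 1) × (right − 1), which gives 23 for 2 pleft and 24 pright populations. The function name and the example chisq values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*pdf(chi) * sum of chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * pdf * total

n_left, n_right = 2, 24                # STUDY + TARGET in pleft; 24 pright populations
df = (n_left - 1) * (n_right - 1)      # assumed rank-0 degrees of freedom: 23
for chisq in (10.0, 35.2, 60.0):       # hypothetical chisq values
    p = chi2_sf(chisq, df)
    verdict = "clade not rejected" if p >= 0.05 else "clade rejected"
    print(f"chisq={chisq:5.1f}  p={p:.4f}  {verdict}")
```

A small chisq maps to a p-value near 1 and a very large chisq to a p-value near 0, which matches the two extreme cases described above.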
Our STUDY & TARGET populations (1240K SNPs) include:
Abkhasian | Kazakh | Huns-Kyrgizstan – 1500 BP |
Armenian | Komi-Izhma | Alan-1400 BP |
Balochi | Kurd 2-Iraq | IRAN-DinkhaTepe-A-3500 BP |
Bashkir | Kurd-Iraq | IRAN-Hasanlu-IA-2800 BP |
Bulgarian | Ossetian-North | Karakhanid-Kazakhstan-1000 BP |
Buryat | Pashtun-PAK | Sarmatian-Kazakhstan-2100 BP |
Estonian | Punjabi | Turk-Kazakhstan-1300 BP |
Georgian | Saami | Wusun-Kazakhstan-2200 BP |
Hungarian | Sardinian | Ghaznavid-Pakistan-820 BP |
Persian-SW | Tajik | Kushan-Tajikistan-2200 BP |
Jew_Iraq | Tatar-Volga | |
Jordanian | Turk-Kayseri | |
 | Turkmen-Turkmenistan | |
Reference populations (pright):
Morocco_Iberomaurusian | Serbia_IronGates_Mesolithic | Russia_Karasuk_oRISE.SG | Russia_Afanasievo.DG |
Russia_MA1_HG.SG | Russia_Kolyma_M.SG | China_YR_LN | Russia_KolymaRiver_LN.SG |
Kenya_EarlyPastoralN | Russia_Shamanka_Eneol.SG | Mongolia_LBA_Khovsgol_6 | Russia_Buryatia_Xiongnu |
TUR_Marmara_Barcın_N | Russia_DevilsCave_N.SG | Mongolia_EIA_Sagly_4 | Russia_Siberia_Lena_EBA |
Russia_HG_Tyumen | Mongolia_EBA_Chemurchek_2 | Russia_MLBA_Krasnoyarsk | Iran_HajjiFiruz_C |
Georgia_Kotias.SG | Ukraine_N | Papuan.SDG | Russia_HG_Karelia |
CALIBRATION OF GENOMES TO ACCOUNT FOR DIFFERENCES IN SEQUENCING & GENOTYPING
The Reich Lab 1240K dataset contains genomes which were sequenced and/or genotyped using different techniques. In our experience this can lead to artificially increased allele sharing between samples which were sequenced/genotyped using the same method versus those done using a different method. Although this does not affect qpAdm results significantly, since a population can be modeled using sources that were all sequenced/genotyped using the same method, it presents an issue with qpWave analysis: a STUDY sample can have significantly different allele sharing, and thus a different p-value/chisq value, when paired with a TARGET sample sequenced/genotyped using one technique versus when paired with a TARGET sample sequenced/genotyped using a different technique.
Since our qpWave cladeliness shared drift analysis is based on p-values/chisq values, this can lead to erroneous conclusions. For example, we have seen p-values significantly higher when 2 samples, A.DG and B.DG, are paired together in pleft versus when samples A.DG and B.SG are paired together in pleft, in spite of B.DG and B.SG representing the same population.
To address this situation and level the playing field with regard to how the individual samples in a dataset were sequenced or genotyped, we introduce a calibration technique. We pair each of our STUDY & TARGET samples, in qpWave cladeliness mode, with an ancient and an extant population that are outgroups to all of them (OUTGROUP A & B) and tabulate the resulting chisq values. We then obtain a multiplier M for each of our STUDY & TARGET populations as the mean of its two outgroup chisq values:
M = [ chisq(STUDY, OUTGROUP A) + chisq(STUDY, OUTGROUP B) ] / 2
We then calculate the cladeliness or genetic similarity (C) for each STUDY-TARGET pair as follows:
C = M / chisq(STUDY, TARGET)
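The calibration can be written out as a short sketch. The function names and chisq values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from this study.

```python
def multiplier(chisq_outgroup_a, chisq_outgroup_b):
    """M: mean chisq of a population paired with OUTGROUP A and with OUTGROUP B."""
    return (chisq_outgroup_a + chisq_outgroup_b) / 2

def cladeliness(m, chisq_study_target):
    """C = M / chisq(STUDY, TARGET); larger values mean a tighter pairing."""
    if chisq_study_target <= 0:
        raise ValueError("chisq must be positive")
    return m / chisq_study_target

# Hypothetical chisq values from qpWave runs in cladeliness mode.
m_study = multiplier(420.0, 380.0)   # 400.0
c = cladeliness(m_study, 50.0)       # 8.0
print(m_study, c)
```

Dividing by the STUDY-TARGET chisq means a pair that is hard to tell apart (low chisq) scores high, while the multiplier rescales each sample by how far it sits from the shared outgroups under its own sequencing or genotyping method.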
RESULTS
Early Indo-Europeanization of NW Iran and Armenia 2800-3200 years BP
Our DNA analysis indicates that in the late 2nd millennium BCE the NW region of Iran and Armenia, around Mannia and Uartu, was experiencing significant changes and was subjected to the 1st waves of Indo-European invasions via the Caucasus. This is consistent with findings in Lazaridis et al2 where Poltavka/Yamnaya Y-DNA haplogroup R-M12149 and EHG admixture start appearing in the 3100 year old Armenian samples (Figs 3, 5, & 11). These invasions by Yamnaya related peoples would introduce the older Indo-European language currently spoken by Armenians.
Our analyses using the qpWave and qpAdm frameworks further indicate that NW Iran would later be subjected to additional waves of Indo-Iranian invasions from the Indo-Iranian homeland in Central Asia, by populations which were a BMAC-Andronovo-MLBA mixture with some East Asian ancestry. These later invasions would superimpose the Iranian languages spoken in western Iran upon the aforementioned older Indo-European languages currently spoken by Armenians.
Our DNA analyses further indicate that, post Hasanlu-IA (2800 YBP), a significant signature of DNA related to Turkic Seljuks, Ghaznavids, and Oghuzes appears not only in our Turkish samples but also in our Kurdish samples, and to a lesser extent in our SW Iranian and Armenian samples. This is consistent with NW Iran and the Central Asian homeland of Indo-Iranians being in contact with Turkics for a period exceeding 1000 years.
We see genetic evidence of this in the 2800 year old genomes from Tepe Hasanlu which were genetically modeled in the Lazaridis et al Supplement2 (we summarize this table in Fig 11) with up to 6% Mesolithic Eastern Hunter Gatherer (EHG) admixture, hitherto not present in Persia. Accompanying this increase in EHG admixture we observe the introduction of Y-DNA haplogroup R1b1a1b1b2 (Fig 11) which is also hitherto missing in Persia. On closer examination of Fig 11 we note that the oldest samples with Y-DNA R1b1a1b1b are the 4800-5000 year old Russian Afanasievo and Yamnaya samples. In fact, the Afanasievo, Poltavka, and Yamnaya older samples are positive for the R-M12149 subclade which we also see in two of the younger 2800 year old Iranian Hasanlu-IA samples.
Y-DNA R1b1a1b1b along with EHG admixture appears to have entered Iran via Armenia, since the 3000+ year old Armenian samples are positive for both (Fig 11). Interestingly, almost all of the 10,000 year old Iranian Zagrosian Ganj-Dareh samples are positive for Y-DNA R2 (Fig 11). This Y-DNA haplogroup presently has its highest frequency in South Asia and is not significant in Iran, which points to a migration of ancient Iranian Zagrosian farmers from Iran to South Asia that introduced both Iranian Neolithic Farmer ancestry and Y-DNA R2 to South Asia. Thus, 10,000 year old Iranian Zagrosian farmers are ancestral to both Indians and Iranics. We speculate that the migration of Ancient Zagrosian Farmers to South Asia may have introduced Dravidian languages to the sub-continent.

Armenians, Kurds, Persians, and Turks are descended from Iron-Age Zagrosians and Uartians
Our extensive DNA analysis with qpAdm using the aforementioned robust set of pright reference populations to distinguish between closely related ancient West Asian populations revealed 2 closely related populations as the founding West Asian populations for contemporary Armenians, Kurds, Persians, and Turks. They are:
- 2800 year-old Uartians from Eastern Anatolia
- 2800-3500 year-old Zagrosian herders from Western Iran
We obtained robust genetics models for Armenians, Turks, Kurds and Persians when Uartians were used as the base West Asian ancestral population as follows:
- For Armenians; 2-way models using Uartians + a Central Asian population (p-values up to 0.12)
- For Kurds; 3-way models using Uartians + BMAC + a Central Asian population (p-values up to 0.57)
- For Kayseri Turks; 3-way models using Uartians + 4000 year-old Minoans from Greece (representing the pre-Turkic invasion Anatolian population) + a Central Asian population (p-values as high as 0.79)
- For southwest Persians; 4-way models using Uartians + BMAC + AASI represented by Jarawans + Central Asian (p-values up to 0.06)
Genetic composition of the Uartians
Our DNA analysis indicates that Uartians were a 2-way mix of 3500 year-old Manneans from the Zagros Mountain area of NW Iran, and 4500 year-old Kura Araxes people. Kura-Araxes themselves in turn appear to be a 3-way mix of Paleolithic Caucasus Hunter Gatherers and Anatolian Farmer, along with minor Neolithic Levantine admixture as shown in Fig 12.
A logical explanation consistent with genetics, linguistic, and historical accounts of Uartians would be as follows:
- Sometime around 3000 years ago a population descended from the Kura-Araxes culture hybridized with Manneans in NW Iran to form Uartians

First phase Yamnaya related Indo-Europeans infiltrate NW Iran via the Caucasus
Our DNA analysis indicates that commencing around 2800 years ago Yamnaya descendants started infiltrating NW Iran via the Caucasus route, kick-starting the 1st phase of Indo-Europeanization of the Caucasus and NW Iran. These individuals introduced Yamnaya/Poltavka related paternal haplogroup R1b-M12149 into NW Iran, as seen in Fig 11, which is based on the Lazaridis et al2 supplement. As per our analysis this population hybridized with the Pre-Indo-European Zagrosians to form an early Indo-European population represented by the Hasanlu Tepe genomes shown in Fig 13. Indo-Iranian Parthians, Saka, and related groups would later invade this area from Central Asia, bringing the 2nd phase of Indo-Iranianization and paternal haplogroups R1a-Z93/Z94/Z95 to western Iran. This paternal haplogroup appears to be most concentrated in the Indo-Iranian Kurds.

Saka related & Turkics invade West Asia
Commencing about 2000 years ago we see genetic evidence of invasion of Saka Indo-Iranian related groups and later Turkic speakers into West Asia. Around 1500 years ago Turkic speaking peoples were gradually streaming into most of Central Asia from their original homeland in the Altai mountains and gradually displaced or assimilated the Iranic Saka Scythians who had ruled Central Asia for the previous 1000 years.
Turkic Karakhanids and mixed Perso-Turkic Ghaznavids occupied Khorasan and Transoxiana upon the collapse of the Persianate Samanid dynasty 1000 years ago. To the north around present-day Turkmenistan and bordering the Ghaznavids and Karakhanids was the Oghuz Yabgu State.
Starting around the year 1050 CE and for the next 800 or so years the region encompassing present-day Iran and Kurdistan would be ruled by various Turkic dynasties and Mongolic dynasties such as the Ilkhanids and Timurids (Figs 6 & 7). Thus it is not surprising to see a genetic contribution of those groups in present-day Turks and Kurds.
To assess the genetic impact of these groups in our study populations of Kurds, Turks, Armenians, and Persians, we perform qpAdm and qpWave analysis using the aforementioned pright reference populations.
Unsurprisingly we see the highest percentage of Turkic ancestry in our Kayseri Turk samples, closely followed by our Kurd samples, with a significantly lower contribution to our Armenian and SW Persian samples, as shown in Figs 14 & 16 using the 1000 year-old Karakhanid & 1300 year-old Kazakhstan Turk samples to represent Turkics. The passing threshold for these models is a p-value of 0.05, with higher p-values indicating a better-fitting genetic model.
As a sanity check we also attempt to model our study populations with present-day Turkmenistan Turkmen, as shown in Fig 17. Here again, we see the most robust genetic models with the highest contribution for our present-day Kurds and Turks, and significantly weaker models with lower genetic contribution for Armenians and SW Persians.

Also unsurprisingly, Iranics, especially Kurds formed the strongest models when using 1800 year-old Sarmatian DNA from Russia as shown in Fig 15, with Armenians and Turks forming the weakest models. Somewhat surprisingly, Kurds formed the most robust models when DNA from the 800 year-old Ghaznavid samples was used as a source as shown in Fig 18.
The Ghaznavid dynasty was a Persianate Muslim dynasty of Turkic mamluk origin which ruled the Empire of Ghazni from 977 to 1186, which at its greatest extent, extended from the Oxus to the Indus Valley. The dynasty was founded by Sabuktigin, a Turkic slave who rose to power, upon his succession to the rule of Ghazna after the death of his father-in-law, Alp Tigin, who was an ex-general of the Samanid Empire from Balkh.
Sabuktigin’s son, Mahmud of Ghazni, expanded the Ghaznavid Empire to the Amu Darya, the Indus River and the Indian Ocean in the east and to Rey and Hamadan in the west. Under the reign of Masud I, the Ghaznavid dynasty began losing control over its western territories to the Seljuk Empire after the Battle of Dandanaqan in 1040, resulting in a restriction of its holdings to modern-day Afghanistan, Pakistan and Northern India.




REFERENCES
1. Paul Heggarty et al., Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages. Science 381, eabg0818 (2023). DOI:10.1126/science.abg0818
2. Iosif Lazaridis et al., The genetic history of the Southern Arc: A bridge between West Asia and Europe. Science 377, eabm4247 (2022). DOI:10.1126/science.abm4247. Supplement. Tables
3. Turkic Peoples Of The World, Margaret Bainbridge, 2013, pp. 149
4. The most important Kurdish tribes in that region are …, Korahsunni Kurdicized Turks, southwest of Ḵoy. https://www.iranicaonline.org/articles/kurdish-tribes
5. Iranicaonline: Tilakuʾi (Kurdicized Turks, around Sonnata and Zāḡa). https://www.iranicaonline.org/articles/kurdish-tribes
6. Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, Genschoreck T, Webster T, Reich D. Ancient admixture in human history. Genetics. 2012 Nov;192(3):1065-93. doi: 10.1534/genetics.112.145037. Epub 2012 Sep 7. PMID: 22960212; PMCID: PMC3522152.
7. Mehrdad Izady, Gulf 2000 Project, https://gulf2000.columbia.edu/maps.shtml
8. Dilawer Khan, The peopling of Anatolia over the past 2000 years, https://eurasiandna.com/the-peopling-of-anatolia-post-over-the-past-2000-years
9. Outcasting Armenians: Tanzimat of the Provinces, Talin Suciyan, Path to Open, 2023, pp. 84
10. Harmen van der Wilt. The Genocide Convention: The Legacy of 60 Years. p. 147.
11. Yeldar Barış Kalkan (2006). Çerkes halkı ve sorunları: Çerkes tarih, kültür, coğrafya ve siyasetine sınıfsal yaklaşım. p. 175.
12. Caucasian battlefields: A History of the Wars on the Turco-Caucasian Border, 1828–1921. Cambridge University Press. 2011-02-17. p. 104. ISBN 978-1-108-01335-2.
13. Ivanov, Vladimir (February 1926). “Notes on the Ethnology of Khurasan”. The Geographical Journal. 67 (2): 143–158. Bibcode:1926GeogJ..67..143I. doi:10.2307/1783140. JSTOR 1783140.
14. Kurdish Life in Contemporary Turkey: Migration, Gender and Ethnic Identity, Anna Grabolle Celiker, p. 160, I.B.Tauris, 2013
15. European Sociological Review. Vol. 18, No. 4, (December 2002), pp. 417