

Reconstructing the genetic history of Italians: new insights from a male (Y-chromosome) perspective

Article  in  Annals of Human Biology · January 2018


DOI: 10.1080/03014460.2017.1409801



Annals of Human Biology

ISSN: 0301-4460 (Print) 1464-5033 (Online) Journal homepage: http://www.tandfonline.com/loi/iahb20


To cite this article: Viola Grugni, Alessandro Raveane, Francesca Mattioli, Vincenza Battaglia,
Cinzia Sala, Daniela Toniolo, Luca Ferretti, Rita Gardella, Alessandro Achilli, Anna Olivieri, Antonio
Torroni, Giuseppe Passarino & Ornella Semino (2018) Reconstructing the genetic history of
Italians: new insights from a male (Y-chromosome) perspective, Annals of Human Biology, 45:1,
44-56, DOI: 10.1080/03014460.2017.1409801

To link to this article: https://doi.org/10.1080/03014460.2017.1409801

Published online: 30 Jan 2018.

ANNALS OF HUMAN BIOLOGY, 2018
VOL. 45, NO. 1, 44–56
https://doi.org/10.1080/03014460.2017.1409801

RESEARCH PAPER

Reconstructing the genetic history of Italians: new insights from a male (Y-chromosome) perspective

Viola Grugni (a), Alessandro Raveane (a), Francesca Mattioli (a), Vincenza Battaglia (a), Cinzia Sala (b), Daniela Toniolo (b), Luca Ferretti (a), Rita Gardella (c), Alessandro Achilli (a), Anna Olivieri (a), Antonio Torroni (a), Giuseppe Passarino (d) and Ornella Semino (a)

(a) Dipartimento di Biologia e Biotecnologie “L. Spallanzani”, Università di Pavia, Pavia, Italy
(b) Divisione di Genetica e Biologia Cellulare, Istituto Scientifico San Raffaele, Milano, Italy
(c) Dipartimento di Medicina Molecolare e Traslazionale, Università di Brescia, Brescia, Italy
(d) Dipartimento di Biologia, Ecologia e Scienze della Terra, Università della Calabria, Arcavacata di Rende, Cosenza, Italy

ABSTRACT

Background: Due to its central and strategic position in Europe and in the Mediterranean Basin, the Italian Peninsula played a pivotal role in the first peopling of the European continent and has been a crossroad of peoples and cultures since then.
Aim: This study aims to gain more information on the genetic structure of modern Italian populations and to shed light on the migration/expansion events that led to their formation.
Subjects and methods: High resolution Y-chromosome variation analysis in 817 unrelated males from 10 informative areas of Italy was performed. Haplogroup frequencies and microsatellite haplotypes were used, together with available data from the literature, to evaluate Mediterranean and European inputs and date their arrivals.
Results: Fifty-three distinct Y-chromosome lineages were identified. Their distribution is in general agreement with geography, southern populations being more differentiated than northern ones.
Conclusions: A complex genetic structure reflecting the multifaceted peopling pattern of the Peninsula emerged: southern populations show high similarity with those from the Middle East and Southern Balkans, while those from Northern Italy are close to populations of North-Western Europe and the Northern Balkans. Interestingly, the population of Volterra, an ancient town of Etruscan origin in Tuscany, displays a unique Y-chromosomal genetic structure.

ARTICLE HISTORY
Received 31 July 2017
Revised 17 November 2017
Accepted 20 November 2017

KEYWORDS
Y-chromosome variation; haplogroups; genetic history of Italy; Italian population

CONTACT Ornella Semino ornella.semino@unipv.it Dipartimento di Biologia e Biotecnologie “L. Spallanzani”, Università di Pavia, Pavia, Italy
Subject classification codes: BG.24 Evolution and Population Genetics.
Supplemental data for this article can be accessed here.
© 2017 Informa UK Limited, trading as Taylor & Francis Group

Introduction

Due to its central and strategic position in Europe and in the Mediterranean Sea, the Italian Peninsula played a pivotal role in the first peopling of the continent and was, for millennia, a crossroads of peoples, cultures, trades and languages (for details, see Supplementary File S1).

Anatomically modern humans have inhabited the Italian territory since the Upper Palaeolithic, as attested by archaeological and anthropological remains in caves like the Grotta del Cavallo (Horse Cave) in Apulia, among which the oldest (dated 45–43 thousand years ago - kya) European skeletal remains were found (Benazzi et al., 2011). During the Last Glacial Maximum (LGM), 25–19 kya, when large parts of Europe were covered with thick ice sheets, Italy and the Balkans were partly forested and probably acted as refuge for Europeans escaping from the North. Traces of Neolithic migratory events are spread all over Italy, with substantial differences (especially between North and South) in the ceramics industry and archaeological remains. Signs of post-Neolithic settlements are present in different areas of the Italian territory, for example, the Nuraghe stones in Sardinia, dated to the early Bronze Age, and the Villanovan culture, mainly spread in the Tuscan-Emilian Apennine area during the Iron Age.

A better knowledge of the genetic structure of the modern Italian population in the wider context of the European continent and surrounding areas is, therefore, an essential element to further reconstruct its genetic history and the impact of ancient and more recent migration events.

Genetic variation of Italy

The first attempts to interpret the pattern of genetic variation of the Italian Peninsula were made with classical polymorphisms, including the ABO blood group. Piazza et al. (1988) studied the frequencies of 34 ‘independent’ alleles at seven loci (ABO, MNS, KELL, RH, HP, HLA-A and HLA-B) and their Principal Component (PC) analyses and synthetic maps of the first three PCs are generally considered the foundations for the Italian ‘genetic history’ reconstruction. These initial results revealed a very distinct pattern for Sardinia, which is located far away from all other Italian regions and represents an outlier in the European genetic landscape. When Sardinia was excluded from the PC analysis, the first principal component
synthetic map, accounting for 27% of the total variation, highlighted a North to South gradient, with similarities between Northern Italy and Central Europe, in contrast to the affinities between Central and Southern Italy with Greece and other Mediterranean populations.

Later on, numerous studies introduced uniparental markers to investigate the genetic structure of modern Italian populations from either the maternal or paternal side (Barbujani et al., 1995; Capelli et al., 2007; Destro Bisol et al., 2008; Di Giacomo et al., 2003) and, in some cases, comparatively (Boattini et al., 2013; Brisighelli et al., 2012a, 2012b). Overall, these studies suggested that the genetic structure of modern Italy reflects, at least in part, the ethnic stratification of pre-Roman times.

From the male perspective, most of the Italian Y-chromosome gene pool can be related to five main haplogroups: R1b-M269, J2-M172, I-M170, G-M201 and E1b-M78. R1b is more frequent in Northern Italy, while E1b, G and J2 harbour higher frequencies in the South, suggesting a greater affinity to West Europe for the North and to South-East and South-Central Europe for the South. Indeed, in the European context, the Italian Y-chromosome variation (Capelli et al., 2007) fits the South/East-North/West cline described in previous studies, which was ascribed to the genetic admixture between incoming Near Eastern farmers and pre-existing Mesolithic hunter-gatherers (Rosser et al., 2000; Semino et al., 1996, 2000, 2004). On the other hand, the identification of an Anatolian (Asia Minor) input in most of the Italian samples underlined that pre-Neolithic populations were not completely replaced (Capelli et al., 2007).

Unlike the majority of authors, Boattini et al. (2013) interpreted the Italian distribution of Y-chromosome genetic diversity as non-clinal, but structured into three main areas according to latitude: North-West Italy (NWI), South-East Italy (SEI) and Sardinia. The outlier position of Sardinia was attributed to the extremely high frequency of I2-M26, a sub-branch of haplogroup I also observed in the Iberian Peninsula and Great Britain, but virtually absent in continental Italy, a finding clearly indicating a major founder event in the Mediterranean island. The common genetic background of Southern Italy and the Adriatic coast and the discontinuity with Northern Italy and Tuscany were explained as due to a “belt” area, where haplogroup frequencies changed more rapidly than in other territories, possibly dating back to the Neolithic when two independent and parallel diffusion processes occurred along the Adriatic and Tyrrhenian coastlines.

Our study provides novel Y-chromosome data, at a high-resolution level, and extends comparative analyses to obtain new insights concerning the genetic history of Italy and the ancestral sources of the Italian gene pool.

Materials and methods

The sample

A total of 817 Italian individuals were newly-analysed in the present study: 290 from Northern Italy (85 from the Borbera Valley, an isolated valley of the Ligurian Apennine, at the border between Piedmont and Liguria; 48 from the territories of Voghera and Tortona, two towns at the border between Pavia and Alessandria provinces; 157 from the Bergamo province, of which 78 from isolated valleys and 79 from the plain area of the province), 113 from Central Italy (Volterra, a town in the core area of ancient Etruria), 350 from Southern Italy (82 from Grecìa Salentina, 102 Apulians, 93 Calabrians from the Ionian Coast, 73 Calabrians from the Tyrrhenian Coast) and 64 from Sicily. The sampling areas are shown in Figure 1; geographic and historical information on the areas is provided in Supplementary File S1.

Sampled individuals were unrelated males with the paternal grandfather born in the collection area and/or carrying a monophyletic surname with a geographical distribution within the sampling area as previously described (Zei et al., 2003).

Ethics statement

This research has been approved by the Ethics Committee for Clinical Experimentation of the University of Pavia, Board minutes of 5 October 2010. Geographical and ethnological information such as ethnicity, language and genealogy were ascertained by interview of blood donors after obtaining their written informed consent.

DNA extraction

All DNAs were obtained from blood samples according to standard phenol/chloroform extraction procedures, followed by ethanol precipitation.

SNP genotyping

A total of 58 Y-chromosome biallelic markers were analysed in a hierarchical way (Supplementary Table S1). They were genotyped by AFLP, RFLP, DHPLC and direct sequencing after PCR amplification of pertinent fragments.

The nomenclature used for haplogroup labelling is in agreement with YCC conventions (The Y Chromosome Consortium, 2002) and subsequent updates (Karafet et al., 2008; King et al., 2011; Myres et al., 2011; Underhill et al., 2010; van Oven et al., 2013).

STR genotyping

For each population, a sub-set of Y-chromosomes belonging to the most representative haplogroups was analysed for the microsatellite loci DYS19, DYS389I/II, DYS390, DYS391, DYS392 and DYS393 (Kayser et al., 1997) in multiplex reactions according to STRBase information (www.cstl.nist.gov/biotech/strbase/y20prim.htm). The resulting STR haplotypes, together with those available in the literature (Supplementary Table S2), were used to investigate variation and origins of haplogroups.

Statistical analyses

Y-chromosome haplotype and haplogroup variability of the newly-analysed populations have been considered in the
wider Italian and European/Mediterranean context, together with extended reference data available from the literature (see Supplementary Table S3).

Figure 1. Geographical locations of the 10 analysed Italian population samples.

Haplogroup diversity was computed using the standard method of Nei (1987). Comparison between groups was performed using the Chi Square Test of independence (XLStat). Genetic structure was examined through the Analysis of Molecular Variance (AMOVA) (Excoffier et al., 1992) using the Arlequin software Ver 3.5 and adopting different geographic grouping criteria. Principal Components Analysis (PCA) on haplogroup frequencies (Supplementary Tables S4 and S5) was performed with XLStat, an Excel add-in, disregarding frequencies below 5%. Median-Joining (MJ) networks (Bandelt et al., 1999) were constructed using the Network 4.6.0.0 program (Fluxus Engineering, http://www.fluxus.engineering.com), after data processing with the reduced-median method (Bandelt et al., 1995) and weighting of STR loci proportionally to the inverse of the repeat variance. Geographical representations of the haplogroup frequency and mean STR variance distributions were obtained with Surfer 6.0 (Golden Software) following the Kriging procedure, as previously described (Battaglia et al., 2009). Maps of microsatellite variances were obtained after pooling data for locations with less than five entries and assigning the resulting values to the centroid of the pooled locations. Haplogroup ages were evaluated on STR variation using the method proposed by Zhivotovsky et al. (2004) and modified according to Sengupta et al. (2006). A microsatellite evolutionary effective mutation rate of 6.9 × 10⁻⁴ per generation (25 years) was used (Zhivotovsky et al., 2004) since it is suitable in situations where the elapsed time frame is >1000 years or 40 generations (Zhivotovsky et al., 2006), like the pre-historic time depths explored in this study. However, it is worth mentioning that ambiguities related to past episodes of population history (e.g. size fluctuations, bottlenecks, etc.) create inherent uncertainties in the calibration of the Y-STR molecular clock, thus age estimates of microsatellite variation should be considered with caution.

Results and discussion

Y-chromosome haplogroups in Italian populations

A total of 53 distinct lineages were identified; their frequency distributions in the examined Italian populations are reported in Supplementary Table S6 and illustrated according to their phylogenetic relationships in Figure 2.

Haplogroup R is the most frequent (50.1%) with its two main branches, R1a (4.7%) and R1b (45.3%), the latter mainly accounted for by R1b-U152 (49.5% of the total R1b); R2 was
not observed. Next is haplogroup J (19.2%), mostly observed as J2 (17.6%), and third is haplogroup E, as E1b (14.6%), mostly represented by its ‘Balkan’ sub-clade E1b-V13. The other main haplogroups show frequencies lower than 10%: haplogroup G (8.4%) and haplogroup I (4.8%).

Figure 2. Haplogroup frequencies, as percentages, in the 10 analysed Italian population samples. BGV: Bergamo Valleys; BGP: Bergamo Plain; TO-VO: Tortona-Voghera; VB: Borbera Valley; VOL: Volterra; AP: Apulia; GS: Grecìa Salentina; CAL-I: Ionian Calabria; CAL-T: Tyrrhenian Calabria; SIC: Sicily.

The frequency distributions of the three main haplogroups (R, J and E) and those of their sub-clades in Italian populations are illustrated in Figure 3. The black sectors in the primary pies are proportional to the frequency of the main haplogroup in the different populations, whereas the coloured sectors in the secondary pies are proportional to the frequencies of the different sub-clades within the relative main haplogroup. These haplogroups, which account for more than 80% of our Italian samples, are also highly represented in Europe (Chiaroni et al., 2009). Frequency and variance distribution maps (Supplementary Figures S1 and S2) were obtained for their most informative sub-haplogroups. Networks of the associated STR-haplotypes were also constructed (Supplementary Figures S1–S3) and relative coalescent ages were estimated for each population/area (Supplementary Tables S7 and S8).

Northern Italy (Bergamo Valleys and plain, Tortona-Voghera and Borbera Valley) is characterised by an extremely high incidence of the R1b haplogroup (69.0%) when compared to all the other main haplogroups, whose frequencies do not reach 10%. This haplogroup, which characterises a wide portion of the gene pool of the examined populations, shows a decreasing frequency pattern from North to South Italy, where it shows its lowest incidence (27.5%). This pattern is almost entirely ascribable to R1b-U152, the most represented R1b sub-lineage, whereas no frequency gradients were detected for the other sub-lineages. R1b-S116(xU152, M529) is unevenly represented across the Italian populations (Figure 3, dusty rose sector in secondary pies). It shows the highest frequencies in two isolated areas of Northern Italy: Borbera Valley (12.9%) and Bergamo Valleys (17.9%). The frequency peak is particularly noticeable in Bergamo Valleys in comparison to the neighbouring plain area (17.9% vs 3.8%, respectively, p < .01). Global frequency and variance distributions of haplogroups R1b-U152 (not shown) and R1b-S116 (Supplementary Figure S1) are coherent with a North and West European origin, respectively. Network analyses
reveal high internal complexities, especially for R1b-U152 (Supplementary Figure S3), due to unequal contributions of different sub-lineages, as previously noted by Valverde et al. (2016) in Spain for R1b-S116(xU152, M529) and by Boattini et al. (2013) and the FamilyTree dataset (http://www.davidkfaux.org/R1b1c10_Resources.pdf) for R1b-U152. However, both networks are characterised by a main demographic expansion centred in North-West and Central-North Europe, which strongly affected Northern Italy. Thus, taking into account that the highest reported incidence of R1b-S116(xU152, M529) is in Iberia (Adams et al., 2008; Myres et al., 2011), its high frequency in the relatively isolated populations of the Bergamo and Borbera Valleys could represent the outcome of ancient gene flow from that area, possibly magnified by genetic drift. On the other hand, R1b-M412, so far described only in Turkey, Iran, Cyprus and Crete (Myres et al., 2011; Voskarides et al., 2016), is observed in all the four Southern Italian samples, all from the ancient Magna Graecia area, but only sporadically in population groups from Northern Italy. The R1b-M412 Y chromosomes could, therefore, represent the legacy of an Eastern Mediterranean input associated with the early Hellenic colonisation, and/or the more recent Byzantine domination. This scenario is supported by the high frequency of R1b-M412 in the Griko-speaking community of Grecìa Salentina (13.4%), where haplogroup R1b-M412 probably reflects ancient colonisation events from Greek-speaking islands rather than continental Greece.

Figure 3. Frequencies of the main Y-chromosome haplogroups E1b, J2 and R1b and their sub-clades in the 10 analysed Italian population samples. Black sectors in the primary pies are proportional to the frequency of the main haplogroup in each population. Coloured sectors in the secondary pies are proportional to the frequencies of sub-haplogroups within the relative main haplogroup.

The R1a haplogroup is observed along the entire Peninsula. With the exception of the Tortona-Voghera sample, it displays lower frequencies in the North and in the Centre in comparison with the southern populations, especially those of the Ionian Coast (8.6% in Ionian Calabria, 5.9% in Apulia and 15.8% in Grecìa Salentina). R1a-M17 represents an important component of the modern gene pool of Greece, where it reaches its highest frequencies (16.3% and 22.0%, in mainland Greece and in Thracia, respectively) (Battaglia et al., 2009; Heraclides et al., 2017). Taking into account that it is found virtually only as R1a-M17(xM458) in both the Southern Italian samples and in mainland Greece, it is likely that R1a-M17 is a signature of the Southern Balkan (mainland Greece) influence into Southern Italy. Thus, differently to haplogroup R1b-M412, R1a-M17 seems a hallmark of a significant male seaborne input from Balkan populations towards the eastern coast of Southern Italy.

Unlike R1b, haplogroup J frequency increases from North (8.3%) towards Central (13.3%) and Southern Italy, where it reaches the highest value (28.5%). The distribution of haplogroup J1 is restricted to South Italy; this haplogroup, which arose in the southern part of the Middle East (Malaspina et al., 2000; Nebel et al., 2002; Semino et al., 2004), characterises Near Eastern and North African Arabic-speaking populations (Al-Zahery et al., 2011; Chiaroni et al., 2009; Heraclides et al., 2017; Tofanelli et al., 2009). Its presence in the southern part of the Italian Peninsula indicates gene flow from these populations. In contrast, haplogroup J2, which most likely arose in the northern part of the Middle East (King et al., 2008; Malaspina et al., 2000; Nebel et al., 2002; Semino et al., 2004) and spread in association to Neolithic and post-Neolithic migrations, is present all over the Italian territory, although with relevant differences in the distributions, probably due to more recent migratory events. J2a is most represented in southern populations, where its J2a-M530 and J2a-Page55 sub-lineages (with a remarkable frequency of the latter in Apulia) were the most prevalent. Its lowest frequencies were registered in the isolated populations of Borbera Valley (3.5%, only represented by J2a-M530) and Bergamo Valleys, where this clade was not observed at all.

J2a-M67, which is widely distributed in Europe, the Middle East and North Africa (Supplementary Figure S2), with a notable peak in Portugal, is mainly present in South Italy, especially in Ionian Calabria. Its variance map shows instead a different pattern, with high values in some Middle Eastern
regions, such as Iran, Turkey and Palestine, but also in the two main islands of the eastern Mediterranean Sea as well as along the other Mediterranean coastal regions. These data indicate that this haplogroup might have spread by sea, probably starting from the Middle East. The network analysis reveals a complex internal heterogeneity as well as an expansion that affected not only Middle Eastern populations but also the Balkans and Southern Italy. The most frequent haplotype in the network comprises subjects mainly from Middle Eastern populations, including Crete and Cyprus, but also from Southern Italy. Notably, Northern Italians are not present in the central haplotype, suggesting a possible later arrival to this area. The great majority of the Portuguese Y chromosomes belong to only one haplotype, thus revealing a very recent expansion of this lineage in western Iberia. The oldest ages based on microsatellite variation are in Cyprus (16.6 ± 4.7 kya), Crete (16.3 ± 5.5 kya) and Apulia (16.6 ± 7.0 kya), followed by those in the Middle Eastern populations. Notably, North Sardinia, Tuscany and Sicily also have high coalescent times (Supplementary Table S7). These data suggest an overall diffusion of J2a-M67 both by sea and by land. The finding of such high variance and coalescence times in the two main islands of the Aegean Sea, characterised by an early spread of agriculture, is in agreement with the scenario that the first steps of the Neolithic spread were indeed towards Cyprus and Crete (King et al., 2008). On the other hand, diffusion by land from the Middle East towards the Southern Balkans is less likely. The high variance and coalescent time values observed in Southern Italian regions, overall comparable to those of the Middle East, can be the outcome of seaborne migrations from different geographic sources at different times. For example, it is known that, in post-Neolithic times, 5 kya, North West Anatolia developed a complex society engaged in a widespread Aegean trade referred to as “Maritime Trojan culture”, involving both the Western Anatolian mainland and several large islands in the Eastern Aegean Sea (Korfmann, 1997). Interestingly, J2a-M67 also harbours a high microsatellite variation age in Volterra, which is located in the core area of ancient Etruria. Multiple hypotheses have been proposed concerning the origin of Etruscans, but our observations tend to support the view that Asia Minor was the ancestral source of the Etruscan gene pool, as already proposed by Achilli et al. (2007) on the basis of mtDNA data.

J2a-M92 is widely distributed in the Middle East, the Balkans, and along the Mediterranean coast. In Italy, it shows a high frequency in the South, especially in the Southern part of Apulia. The variance map shows peaks in Turkey and Sicily, followed by the Southern Balkans. The highest ages based on microsatellite variation are observed in Sicily (11.8 ± 3.4 kya) and Turkey (11.7 ± 4.9 kya). Since the frequency and variance maps suggest a possible origin of J2a-M92 in and around Turkey, the observation of an age in Sicily that is close to that observed in Turkey may indicate an ancient migration from Turkey to Sicily. Comparable high coalescent times based on microsatellite variation are observed in Greece (7.2 ± 2.0 kya), Apulia (7.3 ± 3.1 kya) and Tuscany (8.7 ± 2.7 kya). These data most likely testify seaborne contacts of these Italian regions with Eastern Mediterranean Neolithic civilisations, although the stratification of different migratory events could also have contributed to the internal heterogeneity, especially in Tuscany.

J2b is most frequent in the Tortona-Voghera sample, which is located in the open Po Valley, and in Apulia, which faces the Adriatic Sea, while it is present at low frequencies in the Tyrrhenian sample of Calabria and not observed in Sicily. Interestingly, its incidence in the Volterra sample is comparable to that observed along the Salentina Coast and, as in the northern samples, it is mainly represented by the “Balkan” J2b-M241.

On the whole, the heterogeneous distribution of J2 sub-haplogroups in modern Italians highlights different diffusion routes. Sub-haplogroups J2a-M530 and J2a-Page55, as J2a-M67, probably mark gene flow events from the Middle East (they display the highest frequency and variances in Iran (Grugni et al., 2012)) across the Caucasus, Turkey (data not shown), Cyprus (Voskarides et al., 2016) and Crete (King et al., 2011) towards Southern Europe, that affected mainly the southern regions of Italy. Differently, J2b-M241, which displays a strong expansion in the Southern Balkans (Battaglia et al., 2009; Cruciani et al., 2007; Karachanak et al., 2013) and has been associated with Neolithic and post-Neolithic migrations from Greece and the Balkans (Battaglia et al., 2009; King et al., 2008), marks a seaborne route whose contribution is still detectable along the Adriatic coast (Boattini et al., 2013) as well as in populations along the Po Valley. It is worth underlining the presence of sub-haplogroup J2a-M92 at 7.3% in Grecìa Salentina. This value, comparable with that observed in the sample from Lecce (Boattini et al., 2013), is suggestive of a direct, or Balkan-mediated, seaborne contribution from Asia Minor (Grugni et al., 2012).

Haplogroup E, mostly represented by E1b-M78, increases in frequency from North (8.3%) to South, where it reaches an incidence of 21.3%. Its main sub-clade, E1b-V13, displays a decreasing frequency cline from the Southern Balkans to Western Europe (Supplementary Figure S2) and is also present, at lower frequencies, in Anatolia and all along the Italian Peninsula. Similar to J2b-M241, the E1b-V13 sub-clade, which spread from the Balkans (Battaglia et al., 2009; Cruciani et al., 2007; Karachanak et al., 2013), is mainly observed in the South of Italy, with frequencies higher than 10% in Apulia; however, unlike the Balkan J2 branch, it is also found in Sicily. The distribution of its variance (Supplementary Figure S2) parallels the frequency clinal pattern, although high variance values are also observed in Central-Eastern Europe and a major peak is present in Anatolia. In Italy, the variance is highest in the South. The highest microsatellite age estimates (Supplementary Table S7) are in Turkey (10.0 ± 3.4 kya), where this clade likely originated (Battaglia et al., 2009; Cruciani et al., 2007; Karachanak et al., 2013). Indeed, the variance is also highest in the same areas. The archaeological congruence between the Greek and Southern Anatolia Mesolithic may explain the similar E1b-V13 expansion times (Perlès, 2001). In Italy, E-V13 shows coalescent age and variance values similar to the Northern Balkan ones. These data are in agreement with a first migration of E1b-V13 from Anatolia towards the Southern Balkans, where it underwent a demographic expansion, followed by a later
spread towards Southern Italy (Battaglia et al., 2009). The relatively recent expansion times in the Balkans are consistent with the Balkan Bronze Age, a period that saw strong demographic changes as demonstrated by archaeological records (Childe, 2013; Kristiansen, 2000), and could, therefore, represent a possible time frame for the population movement into the South of Italy.

Table 1. Gene diversity in Italian populations.
BGV    BGP    TO-VO  VB     VOL    AP     GS     CAL-I  CAL-T  SIC
0.774  0.701  0.861  0.865  0.916  0.948  0.933  0.960  0.960  0.959
BGV: Bergamo Valleys; BGP: Bergamo Plain; TO-VO: Tortona-Voghera; VB: Borbera Valley; VOL: Volterra; AP: Apulia; GS: Grecìa Salentina; CAL-I: Ionian Calabria; CAL-T: Tyrrhenian Calabria; SIC: Sicily.
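
The microsatellite-based expansion times discussed in this section rest on the dating approach named under Statistical analyses: STR variation converted to time with an effective mutation rate of 6.9 × 10⁻⁴ per locus per 25-year generation. The sketch below is only an illustration of that idea, not the authors' pipeline. It assumes that the age is the average squared difference (ASD) from a founder haplotype divided by the mutation rate, and that the founder is approximated by the per-locus median repeat count; the function name `str_age_years` and the toy haplotypes are hypothetical. It returns a point estimate and does not reproduce the standard errors quoted in the text.

```python
from statistics import median

MU = 6.9e-4            # effective mutation rate per locus per generation (value from the text)
GENERATION_YEARS = 25  # generation length used in the text


def str_age_years(haplotypes):
    """Point estimate of a haplogroup age from STR repeat counts.

    haplotypes: list of equal-length lists of repeat counts, one per Y chromosome.
    Assumption (not confirmed by the source): the founder haplotype is the
    per-locus median of the observed repeat counts.
    """
    n = len(haplotypes)
    n_loci = len(haplotypes[0])
    founder = [median(h[i] for h in haplotypes) for i in range(n_loci)]
    # Average squared difference from the founder, averaged over loci.
    asd = sum(
        sum((h[i] - founder[i]) ** 2 for h in haplotypes) / n
        for i in range(n_loci)
    ) / n_loci
    return asd / MU * GENERATION_YEARS


# Hypothetical toy data: four chromosomes typed at two STR loci.
toy = [[14, 13], [14, 13], [15, 13], [14, 12]]
print(f"{str_age_years(toy) / 1000:.1f} kya")
```

Because the estimate scales linearly with the assumed mutation rate and generation time, any uncertainty in those two constants carries over directly to the age, which is one reason such estimates are to be read with caution.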
E1b-V13 is also observed in Volterra and the Northern Italian groups, mainly in the most accessible areas (Boattini et al., 2013). This observation supports a Balkan influence in Northern Italian populations as well, most likely through an Adriatic route and along the Po Valley and, to a lesser extent, in lateral, more isolated, mountainous valleys. Among the other E sub-clades, the Middle Eastern E1b-M34 lineage is restricted to Apulia, Calabria and Sicily, whereas the North African E1b-M81, E1b-V22 and E1b-M35 are observed in Calabria and Sicily. In particular, E1b-M81, which is very frequent in North Africa, reaches an incidence of 6.3% in Sicily. This marker has also been observed at significant frequencies in Southern Iberia and its presence in Southern Europe has been attributed, due to its low microsatellite variation, to relatively recent migration(s) from North Africa (Adams et al., 2008; Di Gaetano et al., 2009; Flores et al., 2003, 2004; Semino et al., 2004). On the other hand, the finding of E1b-M35 Y chromosomes (3.5%) in the Borbera Valley is not completely unexpected. Indeed, the so-called “Vie del Sale - Salt Paths” in the high Valley could have been the entry route of North-East African Y chromosomes during the passage of Attila’s army in the 5th century and/or during the numerous Saracen invasions around the year 1000 A.D.

Haplogroup G is not characterised by a clinal distribution pattern. Frequencies higher than 10% were registered in the Borbera Valley (15.3%), Volterra (13.3%), Tyrrhenian Calabria (12.3%) and Apulia (11.8%). Interestingly, most of the samples belong to the sub-haplogroup G2a-L497, which, like R1b-U152, displays a pattern of expansion from Central-North Europe (Supplementary Figure S1). Its influence in Italy is mainly appreciable in the Borbera Valley, where it accounts for more than 80% of haplogroup G, and in Volterra. On the other hand, almost all the G clades seem to be present in Southern Italy. The G2a-L91 branch, common among Anatolian farmers of 8 kya (Lazaridis et al., 2016; Mathieson et al., 2017) and characterising Ötzi’s Y chromosome with its sub-clade G2a-L166, was observed in Tyrol (Berger et al., 2013), Tuscany (Francalacci et al., 2013) and in a great portion of the Southern Corsican (25%) and North Sardinian (9%) Y-chromosome gene pools (Keller et al., 2012). In this study, it was recorded in one subject from the Borbera Valley and one from Apulia, indicating that, although rare, the G2a-L91 haplogroup is present in the entire Peninsula.

The frequency of haplogroup I is similar in North and South Italy (4.8% and 4.1%, respectively) and higher in the Centre (7.1%), with most of the samples belonging to the I1-M253 and I2-M223 lineages. The detection of I2-M26 Y chromosomes in Volterra is noteworthy. This sub-haplogroup is known for its high frequency (>30%) in Sardinia (Francalacci et al., 2013; Rootsi et al., 2004; Zei et al., 2003), so its presence in the Volterra sample suggests a connection between Tuscany and Sardinia. This link could be related to the first peopling of the island or represent the signature of the extensive trade exchanges that Etruscans had with Sardinia, or alternatively could be due to the rather recent migration of numerous shepherds from the island to Tuscany. However, taking into account that (i) all the Volterra I2-M26 Y chromosomes belong to the deepest and less represented branches (-star- and alfa) of I2-M26 (data not shown), not involved in the expansion of this clade in Sardinia (Francalacci et al., 2013), (ii) the carriers of these chromosomes belong to families that have resided in Volterra for at least four generations and (iii) all are characterised by local monophyletic surnames, we can exclude that their presence in Tuscany is due to recent gene flow.

Finally, haplogroup T, which arose and began to differentiate in the Near East about 25 kya (Mendez et al., 2011) and is observed at low frequencies in Europe and in parts of the Middle East, North and East Africa (Heraclides et al., 2017), could be a potentially informative marker to discriminate movements in the Mediterranean area. In Italy, it displays frequency spots in central and southern regions (Boattini et al., 2013) and appears sporadically in the North-West; however, the present level of resolution does not provide any useful information to better understand its diffusion.

Gene Diversity values based on haplogroup frequencies for each of the analysed Italian populations are reported in Table 1.

Lower indexes characterise northern populations compared to southern ones, indicating lower Y-chromosome haplogroup variation in the North. The lowest values were observed in the Bergamo populations followed by those in Tortona-Voghera and Borbera Valley, whereas the highest ones were in Calabria (with no differences between the two sub-sets) and Sicily, followed by Apulia and Grecìa Salentina. This high Y-chromosome diversity characterising Southern Italian populations is in agreement with the complex and extensive patterns of pre- and proto-historical admixture recently reported for the populations of this area on the basis of genome-wide data (Sarno et al., 2017).

The Italian Y-chromosome gene pool in the European and Mediterranean contexts

In order to visualise the relationships between the analysed groups with other Italian populations and place them in a wider European and Mediterranean context, PCAs on haplogroup frequencies were carried out exploiting available data from the literature normalised to the highest possible level of phylogenetic resolution.

A first PCA was performed by using the largest dataset taken from the literature (Supplementary Table S4), but at a low level of haplogroup resolution, in order to compare the

Figure 4. Principal Components (PC) plot based on the frequencies of low-resolution Y-chromosome haplogroups (Supplementary Table S4). Numbers in parenthe-
ses indicate the proportion of the total genetic information retained by a given PC. The inset plot illustrates the contribution of each haplogroup. The three non-
Italian groupings (Middle East-Asia Minor-Caucasus, West Europe, East Europe) are circled. AG: Agrigento; AMA: Apennine Marche; AP: Apulia; AQ: L’Aquila; Bas:
Baschi; BGP: Bergamo Plain; BGV: Bergamo Valleys; BN: Benevento; BO: Bologna; BS: Brescia; Bulg: Bulgaria; CAL-I: Ionian Calabria; CAL-T: Tyrrhenian Calabria; Cau:
Caucasus; CCK: Cosenza/Catanzaro/Crotone; CE: Piceni; CE SARD: Central-East Sardinia; CMA: Central Marche; CN: Cuneo; CO: Como; CP: Campobasso; Cre: Crete; Cro:
Croatia; CS SARD: Central-South Sardinia; CT: Catania; CTU: Central Tuscany; ELB: Elba Island; ES: East Sicily; GR-SN: Grosseto/Siena; Gre: Greece; GS: Grecìa Salentina;
Iran: Iran; LE: Lecce; LIG: Liguria; Mac: Macedonia; MC: Macerata; MT: Matera; N SARD: North Sardinia; NEI: North-East Italy; NEL: North-East Latium; NWA: North-West
Apulia; PG: Foligno; PT: Pistoia; PUGR: Grecanici; RG-SR: Ragusa/Siracusa; RN: Rimini; SAN: Sanniti; SAP: South Apulia; Serb: Serbia; SIC: Sicily; SIC SW: South-West
Sicily; SLA: South Latium; SP-MS: La Spezia/Massa; SV-GE: Savona/Genova; SW: Belvedere; TLB: Tuscany Latium Border; TO-VO: Tortona-Voghera; Turk: Turkey; TV:
Treviso; UD: Udine; VB: Borbera Valley; VI: Vicenza; VLB: Badia Valley; VM: Valmarecchia; VOL: Volterra; WCL: West Calabria; WCP: West Campania; WS: West Sicily.
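
The kind of analysis summarised in Figure 4, a PCA in which each population is a row of haplogroup frequencies, can be sketched as follows. This is only an illustration under stated assumptions: the four populations and their frequencies are invented, the software actually used for the published plots is not named in this section, and the first component is obtained here by simple power iteration on the covariance matrix rather than by a full eigendecomposition.

```python
import math

# Hypothetical haplogroup frequencies (columns: R, J, E), NOT data from the paper.
populations = ["North-1", "North-2", "South-1", "South-2"]
frequencies = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.30, 0.50, 0.20],
    [0.20, 0.60, 0.20],
]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def centre(rows):
    """Subtract each column mean so that PCA works on deviations."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centred):
    """Sample covariance matrix between haplogroup columns."""
    n = len(centred)
    cols = list(zip(*centred))
    return [[dot(a, b) / (n - 1) for b in cols] for a in cols]


def leading_component(cov, iterations=200):
    """First principal axis and its variance, found by power iteration."""
    # Non-uniform start: a uniform vector is annihilated by the covariance
    # matrix because every row of frequencies sums to one.
    v = [float(i + 1) for i in range(len(cov))]
    for _ in range(iterations):
        w = [dot(row, v) for row in cov]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in cov])
    return v, eigenvalue


centred = centre(frequencies)
cov = covariance(centred)
axis, variance = leading_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = variance / total_variance          # proportion retained by PC1
scores = [dot(row, axis) for row in centred]   # population coordinates on PC1

for name, score in zip(populations, scores):
    print(f"{name}: PC1 score {score:+.3f}")
print(f"PC1 explains {explained:.1%} of the variance")
```

The ratio of an eigenvalue to the total variance corresponds to the "proportion of the total genetic information retained by a given PC" quoted in the figure captions, and the entries of `axis` play the role of the haplogroup contributions shown in the inset plots. The sign of a component is arbitrary, so only the relative positions of the populations are meaningful.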

majority of the analysed samples. The plot of the first two PCs, which together explain more than 29% of the variance, is shown in Figure 4, together with an inset plot illustrating haplogroup contributions.

The first PC (F1), which explains 16.22% of the total variance, clearly separates all the Middle East, Asia Minor and Caucasus samples, characterised by high frequencies of haplogroups E, G and J, from the European ones, which show a high frequency of haplogroup R instead. The second PC (F2) explains 13.70% of the total variance and distinguishes the Middle East and Asia Minor populations from the Caucasus groups based on different frequencies of haplogroups J and G, and Eastern from Western European groups based on uneven R1a and R1b frequencies. Italian populations are distributed among these four groupings, but with only a broad latitudinal separation generated by the first PC, since some populations from the South reside close to those of North Italy and vice versa. This distribution, even if based on low-resolution data, underlines the variegated pattern of genetic variation in Italy and is rather informative in terms of past migration inputs. For instance, the closeness of Southern Italians to the Middle East is due to a high frequency of haplogroup J, which is typical of the Middle East and has been associated with migrations from this area, including the spread of agriculture by Middle Eastern farmers during the Neolithic period. Therefore, the low genetic distance between South Italian and Middle Eastern populations and, conversely, the higher distance between Middle Eastern and North Italian populations can be explained by a greater influence of Middle Eastern Neolithic farmers and post-Neolithic migrants from Eastern Mediterranean populations into the South rather than in the North of Italy. On the other hand, Northern Italians show a general closeness to the Basques, sharing a high incidence of the R1b clade. Furthermore, the proximity of some North-Eastern Italian groups to the Balkan cluster, mainly due to haplogroups R1a-M17, E1b-M78 and I-M170, suggests a Balkan contribution. This is the case, for example, of the Udine (UD) and Vicenza (VI) samples. The separation of Treviso (TV) from the geographically close Vicenza, due to the second PC, reflects instead the high incidence of the western R1b-M269 lineage in the former. A major Balkan influence on Vicenza rather than on Treviso could explain this observation, but it is worth noting that both groups are rather small (VI = 40 and TV = 33) and that the second PC encompasses only a part of the genetic variance.

Although low-resolution haplogroups capture only some of the total variability, regional differences are also identified

Figure 5. Principal Components (PC) plot based on the frequencies of high-resolution Y-chromosome haplogroups (Supplementary Table S5). Numbers in parenthe-
ses indicate the proportion of the total genetic information retained by a given PC. The inset plot illustrates the contribution of each haplogroup. The three non-
Italian groupings (Middle East-Asia Minor-Caucasus, West Europe, Balkans) are circled. Ala: Araba; AP: Apulia; Bba: Bizkaia; Bea: Bearn; BGP: Bergamo Plain; BGV:
Bergamo Valleys; Big: Bigorre; Boc: West Bizcaia; Bulg: Bulgaria; BUR: Burgos; CAL-I: Ionian Calabria; CAL-T: Tyrrhenian Calabria; Can: Cantabria; Cau: Caucasus; CE
SARD: Central-East Sardinia; Cha: Chalosse; Cre: Crete; Cro: Croatia; CS SARD: Central-South Sardinia; Gre: Greece; GS: Grecìa Salentina; Gso: South-West Gipuzkoa;
Gui: Gipuzkoa; Herz: Herzegovina; Mac: Macedonia; Nar: North Aragon; Nco: Central-West Nafarroa; Nla: Lapurdi Nafarroa; Nno: North-West Nafarroa; N SARD: North
Sardinia; Rio: La Rioja; Ron: Roncal: Salazar Valley; Serb: Serbia; SIC: Sicily; Sou: Zuberoa; TO-VO: Tortona-Voghera; Turk: Turkey; VB: Borbera Valley; VOL: Volterra;
Zmx: Lapurdi Baztan.

in other areas of the Peninsula. For instance, South Apulian groups (GS, AP and LE) turned out to be far away from North-Western Apulia (NWA) and located between the Near Eastern and Balkan population clusters, a finding that is strongly suggestive of a Greek influence, especially for the Grecìa Salentina (GS). Conversely, the proximity of the “non-Grecanic” Apulians (AP) to Crete, one of the first areas reached by Neolithic Near Eastern farmers, supports the scenario that the same genetic stock also reached Apulia by sea.

The relationships between the different Italian groups in the general context of the European and Mediterranean populations became much clearer when the PCAs were performed at a higher level of haplogroup resolution. The plot of the two PCs obtained from this analysis and a plot displaying the contribution of each haplogroup to the first and second PC are shown in Figure 5.

On the whole, the first two PCs explained 37.33% of the total variance. The obtained distribution shows an overall general agreement with geography: while the first component accounts for a northwest–southeast separation, the second component discriminates mainly according to latitude. Thus, as in the previous analysis, the Basque groups are at the opposite extreme of the first PC relative to the Middle East, Turkey and the Caucasus cluster. Balkan populations are sub-divided into two groups, one including Greece and Bulgaria and the other encompassing all the populations from the Northern Balkans and Macedonia. It is worth noting that the Italian populations do not group together in this analysis, but are scattered in the space comprising the above-mentioned clusters. The first component separates Bergamo Valleys (BGV) and Central-East Sardinia (CE SARD) from all other Italian populations for their high frequency of haplogroup R1b, especially R1b-S116, whereas Tortona-Voghera (TO-VO) and the Borbera Valley (VB) are closer to the Balkans due to the high incidence of I2-M423, I2-M438 and E1b-M78, markers that are largely present in Eastern Europe. Sardinian groups have been pulled down in the plot by the high prevalence of the I2-M26 marker, also observed at low frequency in some Basque and Italian groups. Notably, while Bergamo Plain (BGP) appears more related to the Eastern European populations, Bergamo Valleys (BGV) is closer to Basques. For a long time, Basque populations have been considered the “living fossils” of the earliest modern European inhabitants, both from genetic and linguistic points of view (Cavalli-Sforza et al., 1994; Richards et al., 1996), but more recent studies based on ancient genome-wide sequencing data suggest that they are the results of long-lasting isolation of a population group originated by the admixture of

local hunter-gatherers and early farmers (Günther et al., 2015). This might be interpreted as indicating that the valley populations of the Bergamo province were more isolated than those inhabiting the plain and, thus, they might have better retained traces of the Y-chromosome gene pool of Western European hunter-gatherers. Thus, the first component indicates contiguity of the Y-chromosome haplogroup composition between Basques, North Italy (in particular with Bergamo Valleys) and Sardinia, especially the Central-East area of the island, which is known as the “archaic zone”. The Volterra sample (VOL) occupies an intermediate position, between North and South. As already suggested by the low-resolution PCA (Figure 4), different gene flows affected Southern Italian populations: Ionian Calabria (CAL-I) and Grecìa Salentina (GS) are very close to Greece and Bulgaria, whereas the other samples from Apulia (AP) are closer to Crete. The differences among the Apulian populations are explained by a dissimilar incidence of R1b-M412 and R1a-M17, which are considerably higher in Grecìa Salentina than in its neighbouring areas, and of G2a-P15, which is higher in Apulia. A great difference is also observed among Calabrian populations: the sample from the Ionian coast (CAL-I), due to the higher incidences of E1b-M78 and R1a-M17, is closer to the Southern Balkans than the one from the Tyrrhenian coast (CAL-T). In the absence of data from African groups, individuals from Sicily (SIC) appear in the proximity of Central-South Sardinia (CS-SARD), the only other Italian population showing a considerable incidence of haplogroup E1b-M81. When a PCA was performed at lower resolution (data not shown) in order to include samples from North Africa, where haplogroup E1b-M81 reaches high frequencies, Sicily further separated from other Italian populations. On the whole, the high-resolution PC analysis confirms a strong Middle Eastern influence in Southern Italian populations, which show high frequencies of haplogroups J2a and G2a (J2a-M410 and G2a-P15).

To further investigate the significance of the Italian genetic structure displayed by PCAs, we also carried out analyses of molecular variance (AMOVA) on the population samples from this study, as well as those from Boattini et al. (2013). The results are summarised in Table 2.

Boattini et al. (2013) reported that North-West and South-East Italy are not separated according to latitude, but by a longitudinal line, and explained this difference by at least two independent diffusion processes involving the western and the eastern coasts of the Italian Peninsula during the Neolithic revolution. To evaluate this scenario, we first performed an AMOVA with both datasets and grouped the samples according to the previously employed criteria (Boattini et al., 2013). Specifically, three groups were considered: North-West, South-East and Sardinia. In subsequent tests, alternative groupings were investigated, but none produced better results. For example, when Volterra, a potential derivative of the ancient Etruscan people, was excluded, the variation among groups decreased (from 10.62% to 9.32%), thus confirming the pattern of variation uncovered by Boattini et al. (2013). However, this analysis was at a very low level of haplogroup resolution, which is inadequate for micro-geographic analyses; thus, further AMOVA tests were carried out on our data at the maximum level of haplogroup resolution and the population samples were assembled according to the high-resolution PCA grouping (Figure 5). The results are reported in Table 3.

The first analysis was performed by considering the following three population groups: North Italy, made up of all North Italian samples; South-East Italy, which included Apulia, Grecìa Salentina and Ionian Calabria; and South-West Italy, which comprised Sicily and Tyrrhenian Calabria. The analysis produced a value of variation among groups of 10.69%, similar to that (10.62%) obtained by using low-resolution haplogroups and the population sub-division proposed by Boattini et al. (2013). A higher percentage of variation among groups (11.68%) was obtained when the Borbera Valley and Volterra samples were set apart; this could indicate that these two populations differ from the others, likely representing genetically isolated groups. However, when only the

Table 2. AMOVA analysis with low-resolution haplogroups.

Sub-division criterion / Source of variation                          Variance components   Percentage of variation
North-West Italy vs South-East Italy vs Sardinia (3 groups)
  Among groups                                                        0.39217               10.62
  Among populations within groups                                     0.05303                1.44
  Within populations                                                  3.24769               87.94
North Italy vs South Italy vs Central Italy vs Sardinia (4 groups)
  Among groups                                                        0.33912                9.32
  Among populations within groups                                     0.05154                1.42
  Within populations                                                  3.24769               89.26
p < .01.

Table 3. AMOVA analysis with high-resolution haplogroups.

Sub-division criterion / Source of variation                          Variance components   Percentage of variation
North Italy vs South-East Italy vs South-West Italy (3 groups)
  Among groups                                                        0.56305               10.69
  Among populations within groups                                     0.10577                2.01
  Within populations                                                  4.60023               87.31
North Italy vs South Italy vs Borbera Valley vs Volterra (4 groups)
  Among groups                                                        0.61477               11.68
  Among populations within groups                                     0.04669                0.89
  Within populations                                                  4.60023               87.43
North Italy vs South Italy vs Volterra (3 groups)
  Among groups                                                        0.65349               12.31
  Among populations within groups                                     0.05669                1.07
  Within populations                                                  4.60023               86.63
p < .05; p < .01.
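
The "Percentage of variation" column in Tables 2 and 3 follows from the variance components: each component is divided by their sum. The sketch below checks this for the first grouping of Table 3, assuming the standard three-level AMOVA partition of Excoffier et al. (1992). The function and key names are illustrative, and the fixation indices it returns are the usual textbook definitions, not values reported in the paper.

```python
def amova_summary(among_groups, among_pops_within_groups, within_pops):
    """Percentages of variation and fixation indices from AMOVA variance components."""
    total = among_groups + among_pops_within_groups + within_pops
    percentages = {
        "Among groups": 100.0 * among_groups / total,
        "Among populations within groups": 100.0 * among_pops_within_groups / total,
        "Within populations": 100.0 * within_pops / total,
    }
    # Standard definitions (assumed here, not quoted from the paper).
    phi = {
        "phi_CT": among_groups / total,
        "phi_SC": among_pops_within_groups / (among_pops_within_groups + within_pops),
        "phi_ST": (among_groups + among_pops_within_groups) / total,
    }
    return percentages, phi


# Variance components from Table 3, first grouping (3 groups).
percentages, phi = amova_summary(0.56305, 0.10577, 4.60023)

for source, pct in percentages.items():
    print(f"{source}: {pct:.2f}%")
```

For these inputs the printed percentages agree with the 10.69, 2.01 and 87.31 listed in the table. The within-population component stays at 4.60023 across the three groupings of Table 3, so the comparison between groupings depends only on how the remaining variance is split between the two upper levels.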

Volterra sample was kept separate and the Borbera Valley was pooled with the northern samples, the among-group variation increased, reaching its highest value (12.31%), while the variance among populations within groups remained almost the same. This suggests a notable difference among the three groups and, in particular, underlines a significantly different Y-chromosome haplogroup composition between modern Volterra and the rest of Italy. Since the population from Volterra could be, at least in part, of Etruscan ancestry, it would be interesting to verify if other populations from Tuscany (analysed at high-resolution level) show the same peculiarity. If so, our observation would be in line with the scenario of a foreign source for the ancient Etruscan people, instead of an autochthonous origin.

Conclusions

In this paper, Y chromosomes of 817 subjects from informative areas of the Italian Peninsula were analysed at a high level of haplogroup resolution. The results, compared with those available from the literature, provided a more detailed overview of the Y-chromosome variation in Italy relative to previous studies. A genetic structure characterised by high complexity emerged, probably reflecting the multifaceted pattern of peopling of the Italian Peninsula. Specifically, the southern groups are characterised by a higher haplogroup variation in comparison with those from the North. This is well illustrated by the AMOVA analysis with ‘high-resolution’ haplogroups, which provided the best results when three groupings were considered: Northern Italy, Southern Italy and the Volterra sample. Thus, the population groups are better separated according to their latitude rather than their longitude, the criterion proposed by Boattini et al. (2013) and also confirmed by our analysis at a comparable low-resolution level. However, isolated populations such as those from the Borbera Valley and Grecìa Salentina still preserve unique haplogroup distributions, due to genetic drift and/or the legacy of distinctive ancestral sources.

When compared to other populations, Italian samples do not cluster all together, but are distributed among European and Mediterranean people. Southern samples show a higher similarity with Middle Eastern and Southern Balkan populations than northern ones; conversely, northern samples are genetically closer to North-West Europe and Northern Balkan groups. The intermediate position of Volterra, between South and North Italy, is a mark of its unique Y-chromosomal genetic structure. However, the long-lasting debate concerning the origin of Etruscans remains open. As a matter of fact, while the presence of J2a-M67 suggests contacts by sea with Anatolian people, in agreement with the Herodotus hypothesis of an external Anatolian source of Etruscans, the finding of the Central European lineage G2a-L497 at considerable frequency would rather support a Northern European origin of Etruscans. On the other hand, the high incidence of European R1b lineages does not allow us to rule out the scenario of an autochthonous process of formation of the Etruscan civilisation from the preceding Villanovan society, as first suggested by Dionysius of Halicarnassus; a detailed analysis of haplogroup R1b-U152 could prove very informative in this regard.

Acknowledgements

The authors are grateful to all the donors for providing biological specimens and acknowledge two anonymous reviewers for valuable suggestions and comments on the manuscript. This study is part of the University of Pavia strategic theme ‘Towards a governance model for international migration: an interdisciplinary and diachronic perspective’ (MIGRAT-IN-G) (to A.O., A.A., O.S. and A.T.).

Disclosure statement

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

Funding

This work was funded by Ministero dell'Istruzione, dell'Università e della Ricerca [RBFR126B8I (to A.O. and A.A.)].

References

Achilli A, Olivieri A, Pala M, Metspalu E, Fornarino S, Battaglia V, Accetturo M, et al. 2007. Mitochondrial DNA variation of modern Tuscans supports the Near Eastern origin of Etruscans. Am J Hum Genet 80:759–768.
Adams SM, Bosch E, Balaresque PL, Ballereau SJ, Lee AC, Arroyo E, Lopez-Parra AM, et al. 2008. The genetic legacy of religious diversity and intolerance: paternal lineages of Christians, Jews, and Muslims in the Iberian Peninsula. Am J Hum Genet 83:725–736.
Al-Zahery N, Pala M, Battaglia V, Grugni V, Hamod MA, Hooshiar Kashani B, Olivieri A, et al. 2011. In search of the genetic footprints of Sumerians: a survey of Y-chromosome and mtDNA variation in the Marsh Arabs of Iraq. BMC Evol Biol 11:288.
Bandelt HJ, Forster P, Röhl A. 1999. Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol 16:37–48.
Bandelt HJ, Forster P, Sykes BC, Richards MB. 1995. Mitochondrial portraits of human populations using median networks. Genetics 141:743–753.
Barbujani G, Bertorelle G, Capitani G, Scozzari R. 1995. Geographical structuring in the mtDNA of Italians. Proc Natl Acad Sci USA 92:9171–9175.
Battaglia V, Fornarino S, Al-Zahery N, Olivieri A, Pala M, Myres NM, King RJ, et al. 2009. Y-chromosomal evidence of the cultural diffusion of agriculture in southeast Europe. Eur J Hum Genet 17:820–830.
Benazzi S, Douka K, Fornai C, Bauer CC, Kullmer O, Svoboda J, Pap I, et al. 2011. Early dispersal of modern humans in Europe and implications for Neanderthal behaviour. Nature 479:525–528.
Berger B, Niederstätter H, Erhart D, Gassner C, Schennach H, Parson W. 2013. High resolution mapping of Y haplogroup G in Tyrol (Austria). Forensic Sci Int Genet 7:529–536.
Boattini A, Martinez-Cruz B, Sarno S, Harmant C, Useli A, Sanz P, Yang-Yao D, et al. 2013. Uniparental markers in Italy reveal a sex-biased genetic structure and different historical strata. PLoS One 8:e65441.
Brisighelli F, Blanco-Verea A, Boschi I, Garagnani P, Pascali VL, Carracedo A, Capelli C, Salas A. 2012a. Patterns of Y-STR variation in Italy. Forensic Sci Int Genet 6:834–839.
Brisighelli F, Álvarez-Iglesias V, Fondevila M, Blanco-Verea A, Carracedo A, Pascali VL, Capelli C, Salas A. 2012b. Uniparental markers of contemporary Italian population reveals details on its pre-Roman heritage. PLoS One 7:e50794.
Capelli C, Brisighelli F, Scarnicci F, Arredi B, Caglià A, Vetrugno G, Tofanelli S, et al. 2007. Y chromosome genetic variation in the Italian peninsula is clinal and supports an admixture model for the Mesolithic-Neolithic encounter. Mol Phylogenet Evol 44:228–239.

Cavalli-Sforza LL, Menozzi P, Piazza A. 1994. The history and geography of human genes. Princeton, NJ: Princeton University Press.
Chiaroni J, Underhill PA, Cavalli-Sforza LL. 2009. Y chromosome diversity, human expansion, drift, and cultural evolution. Proc Natl Acad Sci USA 106:20174–20179.
Childe VG. 2013. The dawn of European civilization. Abingdon: Routledge.
Cruciani F, La Fratta R, Trombetta B, Santolamazza P, Sellitto D, Colomb EB, Dugoujon JM, et al. 2007. Tracing past human male movements in northern/eastern Africa and western Eurasia: new clues from Y-chromosomal haplogroups E-M78 and J-M12. Mol Biol Evol 24:1300–1311.
Destro Bisol G, Anagnostou P, Batini C, Battaggia C, Bertoncini S, Boattini A, Caciagli L, et al. 2008. Italian isolates today: geographic and linguistic factors shaping human biodiversity. J Anthropol Sci 86:179–188.
Di Gaetano C, Cerutti N, Crobu F, Robino C, Inturri S, Gino S, Guarrera S, et al. 2009. Differential Greek and northern African migrations to Sicily are supported by genetic evidence from the Y chromosome. Eur J Hum Genet 17:91–99.
Di Giacomo F, Luca F, Anagnou N, Ciavarella G, Corbo RM, Cresta M, Cucci F, et al. 2003. Clinal patterns of human Y chromosomal diversity in continental Italy and Greece are dominated by drift and founder effects. Mol Phylogenet Evol 28:387–395.
Excoffier L, Smouse PE, Quattro JM. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–491.
Flores C, Maca-Meyer N, Gonzalez AM, Oefner PJ, Shen P, Perez JA, Rojas A, et al. 2004. Reduced genetic structure of the Iberian peninsula revealed by Y-chromosome analysis: implications for population demography. Eur J Hum Genet 12:855–863.
Flores C, Maca-Meyer N, Perez JA, Gonzalez AM, Larruga JM, Cabrera VM. 2003. A predominant European ancestry of paternal lineages from Canary islanders. Ann Hum Genet 67:138–152.
Francalacci P, Morelli L, Angius A, Berutti R, Reinier F, Atzeni R, Pilu R, et al. 2013. Low-pass DNA sequencing of 1200 Sardinians reconstructs European Y-chromosome phylogeny. Science 341:565–569.
Grugni V, Battaglia V, Hooshiar Kashani B, Parolo S, Al-Zahery N, Achilli A, Olivieri A, et al. 2012. Ancient migratory events in the Middle East: new clues from the Y-chromosome variation of modern Iranians. PLoS One 7:e41252.
Günther T, Valdiosera C, Malmström H, Ureña I, Rodriguez-Varela R, Sverrisdottir O, Daskalaki EA, et al. 2015. Ancient genomes link early farmers from Atapuerca in Spain to modern-day Basques. Proc Natl Acad Sci USA 112:11917–11922.
Heraclides A, Bashiardes E, Fernández-Domínguez E, Bertoncini S, Chimonas M, Christofi V, King J, et al. 2017. Y-chromosomal analysis of Greek Cypriots reveals a primarily common pre-Ottoman paternal ancestry with Turkish Cypriots. PLoS One 12:e0179474.
Karachanak S, Grugni V, Fornarino S, Nesheva D, Al-Zahery N, Battaglia V, Carossa V, et al. 2013. Y-chromosome diversity in modern Bulgarians: new clues about their ancestry. PLoS One 8:e56779.
Karafet TM, Mendez FL, Meilerman MB, Underhill PA, Zegura SL, Hammer MF. 2008. New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res 18:830–838.
Kayser M, de Knijff P, Dieltjes P, Krawczak M, Nagy M, Zerjal T, Pandya A, et al. 1997. Applications of microsatellite-based Y chromosome haplotyping. Electrophoresis 18:1602–1607.
Keller A, Graefen A, Ball M, Matzas M, Boisguerin V, Maixner F, Leidinger P, et al. 2012. New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. Nat Commun 3:698.
King RJ, Di Cristofaro J, Kouvatsi A, Triantaphyllidis C, Scheidel W, Myres NM, Lin AA, et al. 2011. The coming of the Greeks to Provence and Corsica: Y-chromosome models of archaic Greek colonization of the western Mediterranean. BMC Evol Biol 11:69.
King RJ, Ozcan SS, Carter T, Kalfoğlu E, Atasoy S, Triantaphyllidis C, Kouvatsi A, et al. 2008. Differential Y-chromosome Anatolian influences on the Greek and Cretan Neolithic. Ann Hum Genet 72:205–214.
Korfmann M. 1997. Troia: Ausgrabungen 1996. Zabern.
Kristiansen K. 2000. Europe before history. Cambridge: Cambridge University Press.
Lazaridis I, Nadel D, Rollefson G, Merrett DC, Rohland N, Mallick S, Fernandes D, et al. 2016. Genomic insights into the origin of farming in the ancient Near East. Nature 536:419–424.
Malaspina P, Cruciani F, Santolamazza P, Torroni A, Pangrazio A, Akar N, Bakalli V, et al. 2000. Patterns of male-specific inter-population divergence in Europe, West Asia and North Africa. Ann Hum Genet 64:395–412.
Mathieson I, Roodenberg SA, Posth C, Szecsenyi-Nagy A, Rohland N, Mallick S, Olalde I, et al. 2017. The genomic history of southeastern Europe. bioRxiv.135616.
Mendez FL, Karafet TM, Krahn T, Ostrer H, Soodyall H, Hammer MF. 2011. Increased resolution of Y-chromosome haplogroup T defines relationships among populations of the Near East, Europe, and Africa. Hum Biol 83:39–53.
Myres NM, Rootsi S, Lin AA, Järve M, King RJ, Kutuev I, Cabrera VM, et al. 2011. A major Y-chromosome haplogroup R1b Holocene era founder effect in Central and Western Europe. Eur J Hum Genet 19:95–101.
Nebel A, Landau-Tasseron E, Filon D, Oppenheim A, Faerman M. 2002. Genetic evidence for the expansion of Arabian tribes into the Southern Levant and North Africa. Am J Hum Genet 70:1594–1596.
Nei M. 1987. Molecular evolutionary genetics. New York: Columbia University Press.
Perles C. 2001. The early Neolithic in Greece: the first farming communities in Europe. Cambridge: Cambridge University Press.
Piazza A, Cappello N, Olivetti E, Rendine S. 1988. A genetic history of Italy. Ann Hum Genet 52:203–213.
Richards M, Côrte-Real H, Forster P, Macaulay V, Wilkinson-Herbots H, Demaine A, Papiha S, et al. 1996. Paleolithic and Neolithic lineages in the European mitochondrial gene pool. Am J Hum Genet 59:185–203.
Rootsi S, Magri C, Kivisild T, Benuzzi G, Help H, Bermisheva M, Kutuev I, et al. 2004. Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in Europe. Am J Hum Genet 75:128–137.
Rosser ZH, Zerjal T, Hurles ME, Adojaan M, Alavantic D, Amorim A, Amos W, et al. 2000. Y-chromosomal diversity in Europe is clinal and influenced primarily by geography, rather than by language. Am J Hum Genet 67:1526–1543.
Sarno S, Boattini A, Pagani L, Sazzini M, De Fanti S, Quagliariello A, Gnecchi Ruscone GA, et al. 2017. Ancient and recent admixture layers in Sicily and Southern Italy trace multiple migration routes along the Mediterranean. Sci Rep 7:1984.
Semino O, Magri C, Benuzzi G, Lin AA, Al-Zahery N, Battaglia V, Maccioni L, et al. 2004. Origin, diffusion, and differentiation of Y-chromosome haplogroups E and J: inferences on the neolithization of Europe and later migratory events in the Mediterranean area. Am J Hum Genet 74:1023–1034.
Semino O, Passarino G, Brega A, Fellous M, Santachiara-Benerecetti AS. 1996. A view of the Neolithic demic diffusion in Europe through two Y chromosome-specific markers. Am J Hum Genet 59:964–968.
Semino O, Passarino G, Oefner PJ, Lin AA, Arbuzova S, Beckman LE, De Benedictis G, et al. 2000. The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y-chromosome perspective. Science 290:1155–1159.
Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow CE, Lin AA, et al. 2006. Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. Am J Hum Genet 78:202–221.
The Y Chromosome Consortium. 2002. A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res 12:339–348.
Tofanelli S, Ferri G, Bulayeva K, Caciagli L, Onofri V, Taglioli L, Bulayev O, et al. 2009. J1-M267 Y lineage marks climate-driven pre-historical human displacements. Eur J Hum Genet 17:1520–1524.
Underhill PA, Myres NM, Rootsi S, Metspalu M, Zhivotovsky LA, King RJ, Lin AA, et al. 2010. Separating the post-glacial coancestry of European and Asian Y chromosomes within haplogroup R1a. Eur J Hum Genet 18:479–484.
Valverde L, Illescas MJ, Villaescusa P, Gotor AM, García A, Cardoso S, Algorta J, et al. 2016. New clues to the evolutionary history of the main European paternal lineage M269: dissection of the Y-SNP S116 in Atlantic Europe and Iberia. Eur J Hum Genet 24:437–441.
van Oven M, Toscani K, van den Tempel N, Ralf A, Kayser M. 2013. Multiplex genotyping assays for fine-resolution subtyping of the major human Y-chromosome haplogroups E, G, I, J and R in anthropological, genealogical, and forensic investigations. Electrophoresis 34:3029–3038.
Voskarides K, Mazieres S, Hadjipanagi D, Di Cristofaro J, Ignatiou A, Stefanou C, King RJ, et al. 2016. Y-chromosome phylogeographic analysis of the Greek-Cypriot population reveals elements consistent with Neolithic and Bronze Age settlements. Investig Genet 7:1.
Zei G, Lisa A, Fiorani O, Magri C, Quintana-Murci L, Semino O, Santachiara-Benerecetti AS. 2003. From surnames to the history of Y chromosomes: the Sardinian population as a paradigm. Eur J Hum Genet 11:802–807.
Zhivotovsky LA, Underhill PA, Cinnioglu C, Kayser M, Morar B, Kivisild T, Scozzari R, et al. 2004. The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am J Hum Genet 74:50–61.
Zhivotovsky LA, Underhill PA, Feldman MW. 2006. Difference between evolutionarily effective and germ line mutation rate due to stochastically varying haplogroup size. Mol Biol Evol 23:2268–2270.
