Downloaded from https://academic.oup.com/mbe/article/28/11/3069/1045931
Immunologie, Institut Universitaire d’Hématologie, Hôpital Saint-Louis, Paris, France
*Corresponding author: E-mail: erick.castelli@gmail.com.
Associate editor: Yoko Satta
Abstract
The HLA-G molecule plays an important role in immune response regulation and has been implicated in the inhibition of T and natural killer cell cytolytic function and of allogeneic T-cell proliferation. Due to its immunomodulatory properties, HLA-G gene expression has been associated with the outcome of allografts and of autoimmune, infectious, and malignant disorders. Several lines of evidence indicate that HLA-G polymorphisms at the 5′-upstream regulatory region (5′ URR) and 3′-untranslated region (3′ UTR) may influence HLA-G expression levels. Because Brazilians represent one of the most heterogeneous populations in the world, with the widest HLA-G coding region variability detected so far among the studied populations, a high level of variability and haplotype diversity would be expected in Brazilians. On this basis, the 5′ URR, coding, and 3′ UTR variability were evaluated in a Brazilian series consisting of 100 healthy bone marrow donors, as well as the linkage disequilibrium pattern along the gene and the extended haplotypes encompassing several gene segment variations. The HLA-G locus seems to present six different HLA-G lineages showing functional variations mainly in nucleotides of the regulatory regions. Differences were observed at the 5′ URR in positions that either coincide with or are close to transcription factor–binding sites and at the 3′ UTR mainly in positions that have already been reported to influence HLA-G mRNA availability. We report several lines of evidence for balancing selection acting on the regulatory regions, which may indicate that these HLA-G lineages are related to differential HLA-G expression profiles.
Key words: HLA-G, 5′ upstream regulatory region (5′ URR), promoter, 3′ untranslated region (3′ UTR), polymorphism, haplotypes, Brazilians.
© The Author 2011. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: journals.permissions@oup.com
Mol. Biol. Evol. 28(11):3069–3086. 2011 doi:10.1093/molbev/msr138 Advance Access publication May 28, 2011 3069
Castelli et al. · doi:10.1093/molbev/msr138 MBE
ILT-2 (LILRB1 and CD85j), ILT-4 (LILRB2 and CD85d), and KIR2DL4 (CD158d) leukocyte receptors (Colonna et al. 1997; Cosman et al. 1997; Colonna et al. 1998; Allan et al. 1999; Ponte et al. 1999; Rajagopalan and Long 1999). Dendritic cell function is also inhibited by interaction with ILT-2 and ILT-4 receptors (Carosella and Horuzsko 2007), which also interact with other class I HLA molecules, but the highest affinity is for HLA-G (Brown et al. 2004; Carosella, Favier et al. 2008; Donadi et al. 2011).
In contrast to the classical HLA class I loci, limited HLA-G coding region variability has been observed in worldwide populations (Castelli, Mendes-Junior et al. 2007). Based on the gametic phase (haplotypes) of 73 single-nucleotide

Materials and Methods
Subjects
The Local Research Ethics Committee approved the protocol of the study (Protocol HCRP #12398/2004), and all participants gave written informed consent before blood withdrawal. A total of 100 healthy unrelated bone marrow donors from different regions of the State of São Paulo, Southeastern Brazil, were randomly selected and their DNA was obtained using a salting-out procedure (Miller et al. 1988).

HLA-G Promoter Region Polymorphism Assignment
HLA-G Variation in Regulatory Regions · doi:10.1093/molbev/msr138 MBE
sampled chromosomes (Tan et al. 2005)]; thus, these variation sites were not included in our study.

HLA-G Coding and 3′ UTR Characterization
For the HLA-G coding region polymorphism assignment, all individuals had their HLA-G alleles defined by nucleotide sequence variations encompassing exons 1–4 (approximately 1975 bp), as described elsewhere (Castelli et al. 2010). The complete sequence between the first base of exon 1 and the last base of exon 2 (hence including intron 1) and the complete sequence of exons 3 and 4 were evaluated, encompassing all the variation sites already described for these regions (IMGT database version 3.1.0, July 2010). Additionally,

(Barrett et al. 2005). Given the positive association but unknown gametic phase, the PHASE (http://stephenslab.uchicago.edu/software.html) method (Stephens et al. 2001), implemented by the PHASE v2 package (Mac OS version) (Stephens et al. 2001), was used to assign the most probable haplotype constitution of each sample.
The database containing the variation sites was loaded into the PHASE software and the analyses were performed together with the following parameters using six independent runs with different seed values: 1) number of iterations: 1000, 2) thinning interval: 1, and 3) burn-in value: 100. Independent runs yielded the same results for all samples, except for four samples, in which a rare SNP allele with just one oc-
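
The comparison of independent haplotype-inference runs described above can be sketched as follows. This is a minimal illustration and not the authors' procedure: it does not call PHASE, and the function name `consensus_haplotypes`, the input layout (one dictionary per run, mapping a sample ID to its inferred haplotype pair), and the sample IDs are assumptions made for the example.

```python
from collections import Counter


def consensus_haplotypes(runs):
    """Summarize haplotype calls across independent runs.

    runs: list of dicts, one per run, mapping sample ID -> (hap1, hap2).
    Returns (consensus, discordant): the most frequent call per sample,
    and the per-call counts for samples whose runs disagree.
    """
    consensus, discordant = {}, {}
    for sample in runs[0]:
        # Sort each pair so (A, B) and (B, A) count as the same call.
        calls = Counter(tuple(sorted(run[sample])) for run in runs)
        best, count = calls.most_common(1)[0]
        consensus[sample] = best
        if count < len(runs):
            discordant[sample] = dict(calls)
    return consensus, discordant


# Hypothetical calls from three runs for two samples.
runs = [
    {"S1": ("H01", "H10"), "S2": ("H01", "H27")},
    {"S1": ("H10", "H01"), "S2": ("H01", "H05")},
    {"S1": ("H01", "H10"), "S2": ("H01", "H27")},
]
consensus, discordant = consensus_haplotypes(runs)
```

Samples listed in `discordant` are the ones that would need additional runs or experimental confirmation (for example, cloning) before their phase is accepted.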
genotypic proportions to expectations under Hardy–Weinberg equilibrium were tested by the exact test of Guo and Thompson (Guo and Thompson 1992) employing the GENEPOP 3.4 software (http://genepop.curtin.edu.au/) (Raymond and Rousset 1995). The expected heterozygosity values (hSk) and haplotype diversity as well as their standard deviations were estimated by the ARLEQUIN 3.11 program (Excoffier et al. 2005). Haplotype networks were constructed by using the median joining algorithm implemented in the Network 4.6.0.0 program (http://www.fluxus-engineering.com/sharenet.htm) (Bandelt et al. 1995). Tajima’s D and Fu and Li’s F neutrality tests were performed using DNAsp 5.1 (http://www.ub.

MAF of 10% (fig. 1B). It can be noticed that, with the exception of the SNP at the position +3035 in the 3′ UTR region, most of the remaining variation sites (11 for coding, 12 for promoter, and 7 for the 3′ UTR region) did present strong LD with each other. The lack of LD concerning the +3035 SNP could be due to the occurrence of two independent C > T mutations at this position in very different HLA-G lineages (HG0103 and HG010103, at opposite sides of fig. 2). Therefore, it can be observed that a strong LD is maintained throughout the whole gene. The promoter SNP at the −725 position (C/G/T) was removed from both plots because Haploview only accepts biallelic SNPs. Nevertheless, the LD test performed by Arlequin did corroborate the Hap-
FIG. 1. LD between pairs of SNPs at the HLA-G locus. The image was generated in the Haploview program using SNPs with frequency >2% (A) and >10% (B). Areas in dark red indicate strong LD (LOD ≥ 2, D′ = 1), shades of pink indicate moderate LD (LOD ≥ 2, D′ < 1), blue indicates weak LD (LOD < 2, D′ = 1), and white indicates no LD (LOD < 2, D′ < 1). The haplotype blocks were defined by the confidence intervals method implemented by Haploview. D′ values different from 1.00 are represented inside the squares as percentages. LOD, log of the odds; D′, pairwise correlation between SNPs.
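
For readers unfamiliar with the D′ statistic used in the figure, the following sketch computes it for a pair of biallelic sites from phased haplotype frequencies. It is an illustration of the standard Lewontin normalization, not Haploview's code; the function name `d_prime` and the example frequencies are assumptions, and the LOD score shown in the figure is not computed here.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute D' for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A (site 1) and B (site 2).
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # D' is undefined when a site is monomorphic; report 0.0 in that case.
    return 0.0 if d_max == 0 else abs(d) / d_max


# Allele A is found only on B-carrying haplotypes: complete LD.
complete = d_prime(p_ab=0.3, p_a=0.3, p_b=0.5)
# Haplotype frequency equals the product of allele frequencies: no LD.
independent = d_prime(p_ab=0.25, p_a=0.5, p_b=0.5)
```

A value of 1 means that at most three of the four possible two-site haplotypes are present, which is why D′ = 1 can coexist with a low LOD score when one allele is rare.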
alleles that share the A/T mutation at the position +748 at exon 3 (G*01:01:03:01, G*01:01:03:02, and G*01:01:05). The last lineage, HG010108, was found in two individuals and it presents the promoter of the HG010102 lineage, the G*01:01:08 coding allele and the 3′ UTR haplotype of the HG0103 lineage. The sequence of each haplotype, the proposed consensus sequence for each HLA-G main lineage and its frequencies are illustrated in figure 3. One haplotype found (H27) did not fit the pattern of haplotype clusters observed in the network (fig. 2), and it was considered in
figure 3 and table 1 as the results of possible crossing-over events between haplotypes from different lineages. In addition, although haplotypes H07 and H09 were included in the sublineages HG010101a and HG010101b, respectively, they seem to be the results of crossing-over events between haplotypes from different HG010101 sublineages.
In table 2, it is evident that 13 haplotypes were found just once in the present series. However, some of these haplotypes were confirmed by cloning (H07 and H18) or were present in allele combinations that facilitate the phase inference, such as H25, which was found in a sample homozygous at all the variation sites except for the position −443. In addition, at least five haplotypes were compatible with rare alleles that were found in some populations and now in Brazil, such as G*01:01:09 and G*01:04:05, all five with sequences that are compatible with the ones described in IMGT. The remaining five haplotypes that occurred just once were found in samples carrying very common haplotypes such as H01 and H10, which facilitate the PHASE inference, which resulted in probabilities ranging from 98.0% to 99.8%. Given this evidence, we believe that these haplotype inferences are quite robust, even for these 13 haplotypes that occurred just once.

HLA-G Promoter Sequence Variation and Haplotype Structure
By taking together the known transcripts for the full-length HLA-G protein available in the Ensembl Project database (ENST00000360323, ENST00000428952, ENST0000437467, ENST00000383621, and ENST00000411831), the HLA-G 5′ UTR sequence (transcribed but not translated) varies between transcripts, most of them presenting a 5′ UTR of 24 nucleotides before the ATG in exon 1 (positions −25 to −1). Given that, all the variation sites evaluated in the present manuscript in the promoter sequence must be considered as variations of a 5′ URR sequence (upstream regulatory, not transcribed).
In order to understand the variability and the haplotype structure of the 5′ URR of the HLA-G locus, the haplotypes of the promoter region were extracted from the 28 extended haplotypes illustrated in figure 3. Considering only the promoter region, we identified 25 variation sites [23 SNPs and 2 insertion/deletion (indel)] in the present sample (table 1), encompassing the 1100 promoter nucleotides evaluated as described above. All these 25 variation sites have been already described in previous reports (Tan et al. 2005; Rizzo et al. 2008; Berger et al. 2010). It is noteworthy that, except for the SNP at the −443 position, all variation sites presented allele frequencies higher than 2% in the present series. The two indels consist of an insertion of an additional guanine in a series of five guanines between the −542 and −538 positions and a deletion of an adenine in a series of eight adenines between the −537 and −530 positions (Hviid et al. 2006; Rizzo et al. 2008). To avoid conflict with previous data, we
used the nomenclature proposed by Rizzo and colleagues, in which the guanine insertion was considered at the −540 position and the adenine deletion was considered at the −533 position. The nucleotide diversity of the promoter region was 0.00692, almost the same as for the entire HLA-G locus (0.00643).
The haplotyping analysis using the PHASE method revealed the presence of 11 different HLA-G promoter haplotypes (considering all the variation sites and the first SNP of exon 1 at the +15 position), with frequencies ranging from 0.5% to 32.0% (table 1) and a haplotype diversity of 0.816. Because these promoter haplotypes were similar to those described by Tan et al. (2005), we used the same nomenclature as proposed by these authors. As stated by Tan and colleagues, the 12 haplotypes seem to be arranged in four promoter lineages with very few variations inside each group of lineages. It is interesting to note that two major promoter lineages (PROMO-G010101 and PROMO-G010102) account for 77.5% of the promoter haplotypes, with a distribution of 32.0% for PROMO-G010102 and of 45.5% for PROMO-G010101. The haplotypes from these two promoter lineages differ from each other in at least 12 variation sites (table 1), most of them located near to or within transcription factor–binding sites (Tan et al. 2005; Moreau et al. 2009).
Two novel HLA-G promoter haplotypes were found, named PROMO-G010101f and PROMO-G0103e. PROMO-G010101f presents an adenine deletion at the −533 position and PROMO-G0103e presents a guanine insertion at the −540 position (table 1). Although we have not found known haplotypes such as PROMO-G0103b and G0103c, it is possible that our new promoter haplotype PROMO-G0103e is
the same PROMO-G0103b haplotype as described by the Ober’s group (Tan et al. 2005) because it was described lacking information about the guanine insertion at the −540 position. Despite this insertion, PROMO-G0103e is exactly the same haplotype as PROMO-G0103b but with no guanine insertion compared with the other members of the G0103 promoter lineage.

Haplotype Structure
Among the currently described 17 allele groups (15 encoding different proteins and 2 null alleles), 6 were found in our study: HLA-G*01:01 with eight alleles (G*01:01:01:01, G*01:01:01:04,

Structure
As mentioned above, the 3′ UTR haplotypes were named

[Table 1. HLA-G promoter haplotypes. The per-haplotype nucleotide cells could not be recovered from the extraction; the recoverable header information, haplotype names, and frequencies are listed below.]
HLA-G Promoter SNP Position: −1306, −1179, −1155, −1140, −1138, −1121, −762, −725, −716, −689, −666, −646, −633, −540, −533, −509, −486, −483, −477, −443, −400, −391, −369, −201, −56, +15a
Haplotype names recovered: PROMO-G010101a, PROMO-G010101c, PROMO-G010101f, PROMO-G0104b, PROMO-G0103d, PROMO-G0104a, PROMO-G0103a, PROMO-G0103e
Frequency (2n = 200), in extraction order: 0.135, 0.225, 0.070, 0.045, 0.045, 0.020, 0.040, 0.025, 0.005, 0.320, 0.070
NOTE.—Four different lineages of haplotypes were found, grouped by their similarity. Differences among the haplotypes are given in shades of gray. ‘‘I’’ denotes an insertion of a guanine at position −540. ‘‘-’’ denotes a deletion of adenine at
a The variation at the position +15 is in the coding sequence in exon 1 and must not be considered as 5′ URR. This SNP was used in this table to better characterize the haplotypes according to Tan and colleagues (2005).
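
As a worked check of the haplotype diversity reported for the promoter region, the sketch below applies the usual unbiased estimator, h = n/(n − 1) × (1 − Σp²), to the eleven promoter haplotype frequencies listed for table 1 (2n = 200). That this is the estimator used by the authors' software is an assumption; the function name `haplotype_diversity` is hypothetical.

```python
def haplotype_diversity(freqs, n_chromosomes):
    """Unbiased haplotype (gene) diversity: n/(n-1) * (1 - sum of p_i^2)."""
    sum_sq = sum(p * p for p in freqs)
    return n_chromosomes / (n_chromosomes - 1) * (1.0 - sum_sq)


# Promoter haplotype frequencies from table 1 (2n = 200 chromosomes).
promoter_freqs = [0.135, 0.225, 0.070, 0.045, 0.045, 0.020,
                  0.040, 0.025, 0.005, 0.320, 0.070]

h = haplotype_diversity(promoter_freqs, 200)
print(round(h, 3))  # 0.816
```

The result agrees with the haplotype diversity of 0.816 stated in the text for the 11 promoter haplotypes.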
Table 2. The 28 HLA-G Extended Haplotypes Separated into Promoter, 3′ UTR and Coding Regions.
Extended Haplotype   Frequency (2n = 200)   Promoter Haplotype   Coding Haplotype   3′ UTR Haplotype   HLA-G Lineage
H01                  0.210                  PROMO-G010101a       01:01:01:01        UTR-1              HG010101a
H05                  0.045                  PROMO-G010101d       01:01:01:01        UTR-1              HG010101a
T, and +3196 C/G. All the variation sites did present allele frequencies higher than 4.5%. The present haplotype analysis revealed the same haplotypes as described in the previous manuscript, named UTR-1 to UTR-8, with frequencies ranging from 0.5% for UTR-8 to 25.5% for UTR-1 (table 2). It is interesting to note that 52% of the 3′ UTR haplotypes were represented by UTR-1 and UTR-2, both with a frequency of 26.0%. Both haplotypes differ in five of eight variation sites found at the 3′ UTR, a scenario that is reminiscent of the one observed for the promoter region. The nucleotide diversity of the 3′ UTR is the highest one among the HLA-G regions evaluated (0.01073), about 14.2 times the diversity of the entire human genome (Sachidanandam et al. 2001), with a haplotype diversity of 0.818, higher than that observed for the promoter region.

Relationship Between Promoter, Coding, and 3′ UTR Haplotypes
In order to investigate the relationship between the 5′ URR, coding, and 3′ UTR haplotypes found in the present series, the extended haplotypes described in figure 3 were stratified into promoter, coding, and 3′ UTR haplotypes. The promoter haplotypes were named according to the data proposed in table 1. The coding HLA-G haplotypes were named according to the official HLA-G alleles described in the IMGT database. The 3′ UTR haplotypes were named according to a previous manuscript evaluating this same region (Castelli et al. 2010). It should be emphasized that the HLA-G 3′ UTR and coding region sequence variation and haplotype structure were thoroughly evaluated in previously published manuscripts. Table 2 illustrates the relationship between the promoter, coding, and 3′ UTR haplotypes.
After the haplotypes from figure 3 were segmented, it could be noticed that a very concise pattern of LD and haplotype association was observed, with each of the coding region haplotypes being associated with a specific promoter haplotype or haplotypes that did not differ in more than one position. The HG010101a sublineage is an extended haplotype with only two variation sites in the positions −483 and +3187. This sublineage is associated with a specific coding allele (G*01:01:01:01) and a specific 3′ UTR haplotype (UTR-1). UTR-1 is the only 3′ UTR haplotype carrying a guanine at the +3187 position. Likewise, the HG010101b sublineage is associated with a specific promoter and 3′ UTR haplotype. The HG010101c sublineage encompasses haplotypes associated with the same 3′ UTR (UTR-4), coding region (G*01:01:01:05 and G*01:01:09, which is supposed to be derived from a G*01:01:01:05 allele), and promoter haplotypes that only differ by single positions. The HG010102 lineage is composed of haplotypes that present the same promoter haplotype (except for one copy with a single exchange), two very similar 3′ UTR (mainly UTR-2 and secondly UTR-8, which differs only in the +3010 position), and a group of
alleles mainly represented by G*01:01:02:01, the null G*01:05N, G*01:06, and the rare G*01:09. G*01:09 is supposed to be originated by a single mutation in the G*01:06 allele because both shared a thymine at the +1790 position and the G*01:05N and G*01:06 alleles are supposed to be derived from G*01:01:02:01 by single mutations. The same pattern is observed for the other HLA-G lineages, in which a very close promoter and coding haplotypes are observed, usually coding alleles that produce the

over among frequent HLA-G extended haplotypes (table 2).

ent seed values, obtaining the same results for 96 samples in all runs, and at least 10 runs indicated the same haplotype for the remaining four samples carrying rare alleles such as G*01:09. Directional sequencing analysis was also employed to verify some of the haplotype inferences in heterozygous individuals, confirming the PROMO-G010101f and PROMO-G0103a haplotypes. Moreover, several haplotypes were found in homozygosis in some individuals, including the PROMO-G010101a, PROMO-

with rare HLA-G alleles). These six runs did not include

[Table 3, reconstructed from the flattened extraction. Rows were assigned by matching values to those quoted in the text; coding-region values were assigned by elimination and column order. The Tajima’s D (p) values were not recoverable.]
Region                               Variation sites   Number of Haplotypes (k)   Nucleotide diversity   Haplotype Diversity   Ewens-Watterson Normalized F
HLA-G 5′ URR                         25                11                         0.00692                0.816                 −0.9559, P = 0.1276
HLA-G coding                         22                18                         0.00495                0.869                 −0.6521, P = 0.2694
HLA-G 3′ UTR                         8                 8                          0.01073                0.818                 −1.4216, P = 0.0095
HLA-G locus (extended haplotypes)    55                28                         0.00643                0.906                 −0.1934, P = 0.5218
In the software input files, all indels were converted into SNPs to prevent loss of data because DNAsp does not support indels to calculate Tajima’s D and Fu and Li’s D and F.
coding region, and 3′ UTR of the HLA-G locus (table 3). Only the 3′ UTR presented a statistically significant (P = 0.0095) negative normalized F value (observed homozygosity lower than expected homozygosity); however, the 5′ URR did present a negative normalized F value, exhibiting a low P-value (P = 0.1276). Considering the HLA-G extended haplotypes, which encompass the 5′ URR, coding region, and 3′ UTR, the Ewens–Watterson neutrality test did reveal a nonsignificant negative normalized F value higher than the values obtained for each region separately (P = 0.5218). The second neutrality test employed, Tajima’s D test, which takes into account sequence diversity, resulted in the observation that all estimated D values were positive, three of them reaching significance (ta-

icans, Chinese, Danes (Ober et al. 2003; Hviid et al. 2004; Tan et al. 2005), and Brazilians, some inconsistencies regarding the HLA-G promoter diversity and haplotypes were observed in recent studies. Berger et al. have addressed the variability of the HLA-G promoter region and compared the frequency of all the variation sites and haplotypes detected in European American (EA) women with recurrent spontaneous abortion and healthy EA fertile women (Berger et al. 2010). Although three of seven frequent promoter haplotypes found in Berger’s study are shared by the Brazilian population, most of the haplotypes with a frequency higher than 0.5% found in EA were not detected among Brazilians. In Berger’s study, the haplotypes were obtained using 12 variation sites and the three most
HLA-G lineage (table 2). The HG010108 lineage does not follow the same pattern observed in the remaining lineages because it has a promoter of the HG010102 or HG010103 lineage and a 3′ UTR of the HG0103 lineage, but a coding sequence that apparently does not have an origin in the HG010102 or HG010103 lineages. Curiously, the HG010102, HG010103, and HG010108 promoter is the most compatible with primate sequences (Tan et al. 2005). Moreover, a small sample of 3′ UTR sequence data from nonhuman primates (Castro et al. 2000) revealed that Old World Monkeys and Great Apes present exclusively UTR-5 and UTR-3, respectively, which are not so frequent in humans (Castelli et al. 2010). Given that the promoter and the 3′ UTR haplotypes of this HG010108

pathological conditions. In addition, variation sites observed in introns may be involved in HLA-G regulation processes, such as alternative splicing.
The HLA-G 5′ URR is unique among the HLA genes (Solier et al. 2001). Due to the presence of a modified enhancer A (enhA) and a deleted interferon-stimulated response element (ISRE), the proximal HLA-G promoter is unresponsive to NF-κB (Gobin et al. 1998) and IFN-γ (Gobin et al. 1999). Among the regulatory elements known to stimulate HLA-G, a 244-bp region located 1.2 kb from exon 1 has been shown to be important for its spatiotemporal expression in transgenic mice and was proposed to have a locus control region (LCR) function (Schmidt et al.
by modulating HLA-G mRNA stability (Hiby et al. 1999; O’Brien et al. 2001; Hviid et al. 2003; Rousseau et al. 2003). Although the mechanisms implicated have not been elucidated, HLA-G alleles presenting the 14-bp (5′-ATTTGTTCATGCCT-3′) sequence (Harrison et al. 1993) have been associated with lower mRNA production for most membrane bound and soluble isoforms in trophoblast samples (Hviid et al. 2003; Hviid 2006). On the other hand, a fraction of HLA-G mRNA transcripts presenting the 14-base insertion can be further processed (alternatively spliced) by the removal of 92 bases from the mature HLA-G mRNA (Hiby et al. 1999; Hviid et al. 2003), yielding smaller HLA-G transcripts, reported to be more stable than the complete mRNA forms (Rousseau et al. 2003). Besides the 14-bp polymorphism, two variation sites at the 3′ UTR have been reported to influence HLA-G expression. The presence of an adenine at the +3187 position, which is 4-bp upstream of an AU-rich motif, mediates mRNA degradation, leading to a decreased HLA-G expression (Yie et al. 2008). The presence of a guanine at the +3142 position may increase the affinity of the miR-148a, miR-148b, and miR-152 microRNAs for the HLA-G mRNA, therefore decreasing mRNA availability by mRNA degradation and translation suppression (Tan et al. 2007). In addition to the +3142 SNP, a recent in silico analysis revealed that several human miRNAs have the potential to bind to the HLA-G mRNA 3′ UTR and may influence HLA-G expression; however, the binding affinity might be influenced by polymorphisms present in their target regions, emphasizing the role of the 14-bp polymorphism and the +3003, +3010, +3027, and +3035 SNPs (fig. 3), which encompass a region of only 32 nucleotides (Castelli et al. 2009; Donadi et al. 2011).
Alleles from these three major 3′ UTR polymorphic sites associated with HLA-G production are in strong LD with each other, illustrating a scenario in which their influence may not be mutually exclusive (Castelli et al. 2010). It is noteworthy that the 14-bp insertion is always accompanied by the +3142G and +3187A alleles (fig. 3), both previously associated with low mRNA availability, indicating that the low mRNA production associated with the 14-bp insertion (Hviid et al. 2003) may also be a consequence of the presence of these polymorphisms associated with the 14-bp polymorphism (Castelli et al. 2010). Besides, these three polymorphisms present a high LD with promoter variation sites (fig. 3 and table 2), which indicates that, in an in vivo scenario, given a proper microenvironment stimulus for HLA-G expression, the rate of transcription and translation may be influenced both by promoter and by 3′ UTR polymorphisms, that is, each variation site exerts its influence in a coordinated and dependent manner.

Genetic Diversity and Evolutionary Aspects of HLA-G Extended Haplotypes
Given the immune tolerance property of HLA-G and its benign or harmful presence depending on the situation, the expression of this gene must be under very tight control (Donadi et al. 2011). It was previously shown that the pattern of variation at the HLA-G promoter region is characterized by two divergent lineages of promoter haplotypes that are maintained by balancing selection in worldwide human populations. These two divergent lineages may have different promoter activity and might be involved in a fine balance between high-expressing and low-expressing HLA-G haplotypes (Tan et al. 2005). Our data do corroborate the presence of these two main HLA-G lineages, in which the extended HLA-G lineages HG010102, HG010103, HG010108, and HG0104 (left part of fig. 2) correspond to the first one of Ober’s lineage, whereas extended lineages HG010101 and HG0103 correspond to the second Ober’s lineage (table 1).
Regarding the HLA-G locus as a whole, the two most frequent lineages, that is, sublineage HG010101a (H01 and H05) and lineage HG010102 (H10 to H15), which are equally frequent (about 26%) and account for more than 52% of the HLA-G haplotypes, are very different from each other. In fact, they differ in more than 50% of the 55 segregating sites analyzed, especially in the 5′ URR and 3′ UTR, with 11 and 4 fixed differences, respectively (fig. 4). Most of these nucleotide differences coincide with or are close to known or putative transcription factor–binding sites at the 5′ URR or are even present at the 3′ UTR sites that have been reported to influence HLA-G mRNA availability (14-bp polymorphism, +3142 and +3187 SNPs). Additionally, both lineages do present several differences in the coding region, but the same G*01:01 HLA-G protein is encoded in approximately 79.8% of cases.
The coding region suffers a strong selective pressure for invariance (purifying selection), that is, preservation of nucleotide and amino acid sequences, reduced variability, and lower than expected nonsynonymous mutation rate (Castro et al. 2000). In fact, strong evidence of purifying selection at the coding region was disclosed by the performance of a synonymous and nonsynonymous nucleotide substitution test considering all HLA-G alleles available at the IMGT database, which revealed an excess of synonymous changes in all exons (1–5), which is consistent with purifying selection (Mendes-Junior CT et al., unpublished data). Because HLA-G presents several biological effects related to the control of immune response (Lee et al. 1995;

sentation appear to have experienced purifying selection (Hughes and Nei 1989; Archie et al. 2010). Outside the MHC, there is strong evidence that balancing selection has shaped the pattern of variation of the 5′ URR region of the CCR5 gene (Bamshad et al. 2002; Ramalho et al. 2010), whereas its coding region has been subject to positive selection or neutral evolution (Sabeti et al. 2005).
In order to better evaluate which polymorphic sites are in fact driven by these balancing selection signatures, windows of 150-bp in the promoter, coding, and 3′ UTR regions were established and the Tajima’s D were calculated in each one of them. The promoter region did present three windows with significantly positive Tajima’s D values: one window with the
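
The windowed Tajima’s D analysis described above can be illustrated with the following sketch, which implements the standard Tajima (1989) statistic on aligned, phased sequences and evaluates it in fixed-size windows. It is not the software used in the study: the function names, the non-overlapping window layout, and the toy sequences are assumptions, significance testing is omitted, and gaps or missing data are not handled (consistent with indels having been recoded as SNPs).

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences; None if undefined."""
    n = len(seqs)
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if n < 4 or s == 0:  # the variance term is zero for n < 4
        return None
    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


def windowed_tajimas_d(seqs, window=150, step=150):
    """Tajima's D per window as a list of (start_index, D or None)."""
    length = len(seqs[0])
    return [(start, tajimas_d([q[start:start + window] for q in seqs]))
            for start in range(0, length, step)]


# Toy data: two divergent haplotype groups at intermediate frequency.
balanced = tajimas_d(["AA", "AA", "TT", "TT"])
# Toy data: the variants are carried by a single sequence.
rare = tajimas_d(["AA", "AA", "AA", "TT"])
windows = windowed_tajimas_d(["AAAA", "AAAA", "TTAA", "TTAA"], window=2, step=2)
```

Positive values arise when divergent haplotypes are kept at intermediate frequencies, the pattern expected under balancing selection, whereas an excess of rare variants gives negative values.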
Watterson, Tajima’s D and Fu and Li’s F and D), which revealed evidence of balancing selection acting only on the regulatory regions (5′ URR and 3′ UTR) and on the HLA-G locus as a whole, probably by a hitchhiking effect due to the regulatory regions surrounding the coding region. These results may be, however, obscured by the population history (admixture) that characterizes this Brazilian urban population. Population stratification, for example, may add to selection, overestimating the balancing selection signatures obtained here. However, the same evidence has been found in other populations, such as Han Chinese, African American, and European American (Tan et al. 2005). Nevertheless, analyses of other worldwide

sites and differences in the 3′ UTR mainly at positions that have already been reported to influence HLA-G mRNA stability and degradation rate. The evidence of balancing selection acting on the regulatory regions indicates that these HLA-G lineages are probably related to different expression profiles, depending on microenvironmental factors and on physiological or pathological conditions of the individual.

Acknowledgments
This study was supported by the Brazilian National Research Council (CNPq/Brazil, Grants 475670/2007-8 and 558476/2008-0) and the binational collaborative research program CAPES-COFECUB (project # 653/09). E.C.C was
Carosella ED, Horuzsko A. 2007. HLA-G and cancer. Semin Cancer Biol. 17:411–412.
Carosella ED, Moreau P, Lemaoult J, Rouas-Freiss N. 2008. HLA-G: from biology to clinical benefits. Trends Immunol. 29:125–132.
Castelli EC, Mendes-Junior CT, Deghaide NH, de Albuquerque RS, Muniz YC, Simoes RT, Carosella ED, Moreau P, Donadi EA. 2010. The genetic structure of 3′ untranslated region of the HLA-G gene: polymorphisms and haplotypes. Genes Immun. 11:134–141.
Castelli EC, Mendes-Junior CT, Donadi EA. 2007. HLA-G alleles and HLA-G 14 bp polymorphisms in a Brazilian population. Tissue Antigens. 70:62–68.
Castelli EC, Mendes-Junior CT, Viana de Camargo JL, Donadi EA. 2008. HLA-G polymorphism and transitional cell carcinoma of the bladder in a Brazilian population. Tissue Antigens. 72:149–157.
Castelli EC, Mendes-Junior CT, Wiezel CE, Peres NT, Simoes AL,

binding protein (CREB). An alternative transactivation pathway to the conserved major histocompatibility complex (MHC) class I regulatory routes. J Biol Chem. 277:39525–39531.
Gobin SJ, Keijsers V, van Zutphen M, van den Elsen PJ. 1998. The role of enhancer A in the locus-specific transactivation of classical and nonclassical HLA class I genes by nuclear factor kappa B. J Immunol. 161:2276–2283.
Gobin SJ, van Zutphen M, Woltman AM, van den Elsen PJ. 1999. Transactivation of classical and nonclassical HLA class I genes through the IFN-stimulated response element. J Immunol. 163:1428–1434.
Guo SW, Thompson EA. 1992. Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics. 48:361–372.
Harrison GA, Humphrey KE, Jakobsen IB, Cooper DW. 1993. A 14 bp deletion polymorphism in the HLA-G gene. Hum Mol Genet.
Lancaster AK, Single RM, Solberg OD, Nelson MP, Thomson G. 2007. PyPop update-a software pipeline for large-scale multilocus population genomics. Tissue Antigens. 69(Suppl 1):192–197.
Le Discorde M, Moreau P, Sabatier P, Legeais JM, Carosella ED. 2003. Expression of HLA-G in human cornea, an immune-privileged tissue. Hum Immunol. 64:1039–1044.
Le Gal FA, Riteau B, Sedlik C, Khalil-Daher I, Menier C, Dausset J, Guillet JG, Carosella ED, Rouas-Freiss N. 1999. HLA-G-mediated inhibition of antigen-specific cytotoxic T lymphocytes. Int Immunol. 11:1351–1356.
Lee N, Malacko AR, Ishitani A, Chen MC, Bajorath J, Marquardt H, Geraghty DE. 1995. The membrane-bound and soluble forms of HLA-G bind identical sets of endogenous peptides but differ with respect to TAP association. Immunity. 3:591–600.
Lefebvre S, Berrih-Aknin S, Adrian F, Moreau P, Poea S, Gourand L,

Pirri A, Contieri FC, Benvenutti R, Bicalho Mda G. 2009. A study of HLA-G polymorphism and linkage disequilibrium in renal transplant patients and their donors. Transpl Immunol. 20:143–149.
Ponte M, Cantoni C, Biassoni R, Tradori-Cappai A, Bentivoglio G, Vitale C, Bertone S, Moretta A, Moretta L, Mingari MC. 1999. Inhibitory receptors sensing HLA-G1 molecules in pregnancy: decidua-associated natural killer cells express LIR-1 and CD94/NKG2A and acquire p49, an HLA-G1-specific receptor. Proc Natl Acad Sci U S A. 96:5674–5679.
Pyo CW, Williams LM, Moore Y, Hyodo H, Li SS, Zhao LP, Sageshima N, Ishitani A, Geraghty DE. 2006. HLA-E, HLA-F, and HLA-G polymorphism: genomic sequence defines haplotype structure and variation spanning the nonclassical class I genes. Immunogenetics. 58:241–251.
Rajagopalan S, Long EO. 1999. A human histocompatibility leukocyte antigen (HLA)-G-specific receptor expressed on all
Veit TD, Chies JA. 2009. Tolerance versus immune response - microRNAs as important elements in the regulation of the HLA-G gene expression. Transpl Immunol. 20:229–231.
Yelavarthi KK, Schmidt CM, Ehlenfeldt RG, Orr HT, Hunt JS. 1993. Cellular distribution of HLA-G mRNA in transgenic mouse placentas. J Immunol. 151:3638–3645.
Yie SM, Li LH, Xiao R, Librach CL. 2008. A single base-pair mutation in the 3′-untranslated region of HLA-G mRNA is associated with pre-eclampsia. Mol Hum Reprod. 14:649–653.
Yie SM, Xiao R, Librach CL. 2006. Progesterone regulates HLA-G gene expression through a novel progesterone response element. Hum Reprod. 21:2538–2544.