© 2002 American Thoracic Society
DNA Sequence Variants in Epithelium-Specific ETS-2 and ETS-3 Are Not Associated with Asthma
Division of Pulmonary and Critical Care Medicine; Channing Laboratory, Department of Medicine, Brigham and Women's Hospital; New England Baptist Bone and Joint Institute, Beth Israel Deaconess Medical Center, Harvard Medical School; Department of Environmental Health, Harvard School of Public Health, Boston; Whitehead Genome Center, Massachusetts Institute of Technology, Cambridge; Thermal and Mountain Medicine Division, U.S. Army Research Institute of Environmental Medicine, Natick, Massachusetts; and Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio
Correspondence and requests for reprints should be addressed to Eric S. Silverman, M.D., Department of Environmental Health, Harvard School of Public Health, 667 Huntington Avenue, Boston, MA 02115. E-mail: esilverm{at}hsph.harvard.edu
Epithelium-specific ETS-2 and ETS-3 are transcription factors that have been proposed as asthma candidate genes. To investigate the association of sequence variants in these genes with asthma, we conducted a case–control association analysis in a sample of 311 white subjects with asthma and 177 white subjects without asthma. Common polymorphisms in these genes were detected by sequencing DNA from 32 cell lines obtained from Coriell (Camden, NJ). Seven noncoding or synonymous single-nucleotide polymorphisms were detected: three in epithelium-specific ETS-2 and four in epithelium-specific ETS-3. Subjects were genotyped at all loci by mass spectroscopy. To ensure the suitability of our control subjects, we also genotyped subjects at 49 unlinked polymorphisms evenly distributed throughout the autosomes and found no evidence of population stratification. Logistic regression adjusted for age and sex suggested a weak association of one epithelium-specific ETS-2 polymorphism with asthma diagnosis (odds ratio = 1.89, 95% confidence interval = 1.13–3.18, p = 0.02). Total serum immunoglobulin E and FEV1% predicted levels were not associated with any of the polymorphisms. Extended haplotyping indicated linkage disequilibrium in these genes; however, no association or epistatic interaction was found. This study suggests that epithelium-specific ETS-2 and ETS-3 genes are unlikely to contain polymorphic loci that have a major impact on asthma susceptibility in our population.
Key Words: genetics; population stratification; polymorphism; Tristan da Cunha; transcription factor
Epithelium-specific ETS-2 (ESE-2/ELF-5/ASTH-I; GenBank accession numbers AF115402, AF115403) and epithelium-specific ETS-3 (ESE-3/EHF/ASTH-J; GenBank accession numbers AF124438, AF124439) are members of an epithelium-specific ETS transcription factor subfamily (1–5) that have been proposed as candidate genes for asthma (6, 7). Members of this subfamily are characterized by their epithelial cell–restricted expression profile and structural homology in the ETS DNA-binding and pointed domains (3, 4). ESE-3 is constitutively expressed in the lung, with relatively high levels in bronchial columnar and mucous gland epithelial cells (4). ESE-2 is expressed at much lower levels in the lung, presumably in epithelial cells (1). The two genes are approximately 107 kb apart, are transcribed in opposite directions, are alternatively spliced, and are located on chromosome 11p, a genomic region linked to airway hyper-responsiveness and immunoglobulin (Ig) E levels in a number of genome scans (8). The roles of these transcription factors in lung biology are unknown but may involve (1) tubulogenesis and branching morphogenesis in glandular organs such as the lung or trachea, (2) induction or repression of epithelium-specific genes in the context of an inflammatory microenvironment, and (3) oncogenesis of epithelium-derived tumors (1, 4).
ESE-2 and ESE-3 were identified as candidate genes for asthma in a genome-wide linkage analysis performed in the population from Tristan da Cunha (6, 9, 10). Tristan da Cunha is a small volcanic island in the South Atlantic Ocean, whose population was chosen as a sampling frame for a genome screen study of asthma due to their inbred nature, their relatively homogeneous environment, and the high (
Study Populations
Case DNA was obtained from 311 white patients with asthma. These patients were unrelated individuals originally recruited for an asthma medication trial conducted in the United States. To qualify for inclusion, patients had to be nonsmokers aged 18 to 45 years, have no significant comorbid medical conditions, and have diagnostic findings consistent with moderate to severe asthma according to the American Thoracic Society criteria (11). The only medications used by the patients were inhaled β-agonists, as needed. Patients were required to have an FEV1 of 40 to 85% of the predicted normal values after at least 8 hours without inhaling β-agonists; oral or inhaled corticosteroids were excluded for 6 weeks before the study. Reversibility of airflow obstruction by β-agonists (15% change required) or methacholine-sensitivity testing was employed to confirm asthma diagnosis. Control DNA from 177 white subjects without asthma was obtained from the Environmental Medicine Genome Bank (12). The Environmental Medicine Genome Bank consists of U.S. Army recruits from across the country undergoing basic training. Applicants are excluded from enlistment in the army if they have symptoms of asthma, use medications for asthma regardless of symptoms, or have been given a reliable diagnosis of obstructive airway disease at any age. Subjects were assessed for age, sex, history of asthma or exercise-induced bronchospasm, spirometry, and total serum IgE. Both populations are described in Table 1. All DNA was purified from peripheral blood by standard techniques after subjects provided written informed consent.
Sequencing and Genotyping
Polymorphisms were discovered by direct sequencing of polymerase chain reaction amplicons from DNA samples as described previously (13). This DNA was obtained from 32 immortalized cell lines (Coriell, Camden, NJ) that represent individuals with a broad racial profile. Polymerase chain reaction primers were designed to amplify ESE-2 and ESE-3 exons, exon–intron boundaries, 5'- and 3'-untranslated regions, and promoter sequences. All polymerase chain reaction primers, reaction conditions, and polymorphism flanking sequences are available in the online data supplement (Tables E1 and E2). For genotyping, we used single-base extension followed by MALDI-TOF mass spectroscopy (Sequenom, San Diego, CA) as described previously (14).
Statistical Analysis
IgE levels were log-transformed to approximate a normal distribution. Sex and case–control status were analyzed as dichotomous variables; all other variables were analyzed as continuous. For the analysis of the association of haplotypes with continuous outcomes in the case group, the FEV1% predicted and age- and sex-adjusted serum total IgE levels were dichotomized within the case group into "high" (upper quartile of frequency distribution) and "low" (lower quartile of frequency distribution) groups. Imputed haplotype frequencies were compared between these groups. Hardy–Weinberg equilibrium was tested at each single-nucleotide polymorphism (SNP) locus in a contingency table of observed versus predicted genotype frequencies, using a modified Markov-chain random walk algorithm (15). Logistic regression (16) was used to model the effects of multiple covariates and genotype on case–control status, including an investigation of the need for interaction of polynomial terms. The biallelic polymorphisms were analyzed under (1) a dominant genetic model, with the most common homozygote as the baseline, versus the combination of the heterozygote and the less common homozygote (this model estimated risk of asthma in the carriers of the sequence variants investigated); and (2) a log-additive model coded into three classes, with the most common homozygote as the baseline, that estimated risk of asthma across the three genotypes.
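The two genotype codings described above can be illustrated with a minimal sketch. The two-character genotype strings, the function name `code_genotype`, and the example alleles are hypothetical and are not taken from the study; the sketch also assumes that the most common homozygote is the one carrying two copies of the major allele. The resulting 0/1 or 0/1/2 covariate is what would enter a logistic regression alongside age and sex.

```python
def code_genotype(genotype, major_allele, model):
    """Return the numeric covariate for one biallelic genotype.

    genotype: two-character string such as "AG" (hypothetical encoding).
    model: "dominant"     -> 0 for the common homozygote, 1 for any carrier
           "log_additive" -> number of minor alleles carried (0, 1, or 2)
    """
    minor_count = sum(1 for allele in genotype if allele != major_allele)
    if model == "dominant":
        return 1 if minor_count > 0 else 0
    if model == "log_additive":
        return minor_count
    raise ValueError(f"unknown model: {model}")


# Hypothetical genotypes at a SNP whose major allele is "A".
genotypes = ["AA", "AG", "GG", "GA"]
dominant = [code_genotype(g, "A", "dominant") for g in genotypes]
additive = [code_genotype(g, "A", "log_additive") for g in genotypes]
print(dominant)  # [0, 1, 1, 1]
print(additive)  # [0, 1, 2, 1]
```

Under the dominant coding the odds ratio compares carriers with noncarriers; under the log-additive coding it is the multiplicative change in odds per additional copy of the minor allele.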
Maximum likelihood haplotype frequencies were imputed in cases and controls with an expectation–maximization approach (17), as implemented in the program Arlequin version 2.0. The likelihood of the observed data, D, given the haplotype frequencies, p, under this approach is

$$L(D \mid p) = \prod_{k=1}^{n} \sum_{(i,j) \in S_k} G_{ij},$$

where the product runs over the n individuals in the sample, the sum runs over the set $S_k$ of haplotype pairs (i, j) compatible with the genotype of individual k, and $G_{ij} = 2 p_i p_j$ if $i \neq j$ or $G_{ij} = p_i^2$ if $i = j$. The expectation–maximization algorithm was repeated from 20 different starting points. Standard deviations were estimated by a parametric bootstrap procedure. Haplotype frequencies were compared by a likelihood ratio test of the 2 x k contingency table (k = number of haplotypes with frequency 5% in either comparison group), whose empirical distribution was obtained by a permutation procedure using a modified Markov-chain random walk algorithm (18). Pairwise linkage disequilibrium between each pair of SNP loci was analyzed by a likelihood ratio test, whose empirical distribution was obtained by a permutation procedure (19). Lewontin's disequilibrium coefficient D' was estimated from imputed haplotype frequencies (20). S-Plus 2000 (Mathsoft Inc., Cambridge, MA), Sib-Pair v0.99.9 (http://www2.qimr.edu.au/davidD/), Arlequin v2.0 (http://anthro.unige.ch/arlequin), and ldmax (http://www.well.ox.ac.uk/asthma/GOLD/) were used to manage and analyze the data. p Values were derived by empirical simulation where possible. Statistical significance was defined at the standard 5% level.
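As an illustration of the expectation–maximization idea and of Lewontin's D', the following minimal sketch handles the simplest case of two biallelic loci, where only double heterozygotes have ambiguous phase. It is not the Arlequin implementation, which handles multilocus haplotypes and multiple starting points; the function names and the genotype counts are hypothetical.

```python
def em_two_locus(genotype_counts, iterations=100):
    """Estimate haplotype frequencies for two biallelic loci by EM.

    genotype_counts maps (a, b) to a number of individuals, where a and b
    are minor-allele counts (0, 1, or 2) at locus 1 and locus 2.
    Returns frequencies indexed as [00, 01, 10, 11].
    """
    fixed = [0.0, 0.0, 0.0, 0.0]
    n_double_het = genotype_counts.get((1, 1), 0)
    n_people = sum(genotype_counts.values())
    for (a, b), n in genotype_counts.items():
        if (a, b) == (1, 1):
            continue  # phase is ambiguous; resolved in the E step below
        first = 2 * (a >= 1) + (b >= 1)    # haplotype on one chromosome
        second = 2 * (a == 2) + (b == 2)   # haplotype on the other
        fixed[first] += n
        fixed[second] += n
    p = [0.25, 0.25, 0.25, 0.25]
    for _ in range(iterations):
        # E step: share of double heterozygotes carrying 00/11 vs 01/10.
        cis = p[0] * p[3]
        trans = p[1] * p[2]
        share = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M step: expected haplotype counts divided by chromosomes.
        counts = [fixed[0] + n_double_het * share,
                  fixed[1] + n_double_het * (1 - share),
                  fixed[2] + n_double_het * (1 - share),
                  fixed[3] + n_double_het * share]
        p = [c / (2 * n_people) for c in counts]
    return p


def lewontin_d_prime(p):
    """Signed D' from haplotype frequencies indexed as [00, 01, 10, 11]."""
    p_a = p[2] + p[3]  # minor-allele frequency at locus 1
    p_b = p[1] + p[3]  # minor-allele frequency at locus 2
    d = p[3] - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d / d_max if d_max > 0 else 0.0


# Hypothetical sample in which the two minor alleles always travel together.
freqs = em_two_locus({(0, 0): 5, (1, 1): 4, (2, 2): 1})
d_prime = lewontin_d_prime(freqs)
print(freqs, d_prime)
```

In this toy sample the algorithm assigns essentially all double heterozygotes to the 00/11 phase, so the estimated D' approaches 1, which corresponds to complete disequilibrium.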
Stratification Analysis
SNP Discovery
Three SNPs were identified in ESE-2, and four SNPs were identified in ESE-3. These SNPs are shown with their relative locations in Figure 1. SNPs are numbered relative to the ATG start site (A = 1). All SNPs in ESE-2 are located in introns. Two ESE-3 SNPs are located in introns and two in protein coding regions, but they are synonymous.
Stratification Analysis
The analysis of the 49 unlinked SNPs suggested no evidence of significant population substructure within the case–control study population (χ² = 47.9, 49 df, p = 0.48). A close concordance was found between expected and observed chi-square values across the 49 SNPs (Figure 2).
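The summed chi-square reported above can be understood through a minimal sketch: a 1-df chi-square statistic is computed for case versus control allele counts at each unlinked biallelic marker, and the statistics are added, giving one degree of freedom per marker. The allele counts and function names below are hypothetical, and the sketch stops at the summed statistic rather than attempting to reproduce the study's p value.

```python
def allele_chi_square(case_counts, control_counts):
    """Pearson chi-square (1 df) for a 2 x 2 table of allele counts."""
    table = [case_counts, control_counts]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    grand = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            if expected > 0:  # skip empty cells of a monomorphic marker
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def stratification_statistic(loci):
    """Sum per-marker statistics; degrees of freedom equal the marker count."""
    stats = [allele_chi_square(cases, controls) for cases, controls in loci]
    return sum(stats), len(stats)


# Hypothetical (allele 1, allele 2) counts in cases and controls at 3 markers.
loci = [((120, 80), (60, 40)),
        ((100, 100), (55, 45)),
        ((150, 50), (70, 30))]
total, df = stratification_statistic(loci)
print(total, df)
```

Under the null hypothesis of no stratification the summed statistic is expected to be close to its degrees of freedom, which is the pattern seen in the study (47.9 on 49 df).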
Association Study
General characteristics. The study population characteristics are described in Table 1. The sex ratio was significantly different between cases and controls (χ² = 12.5, 1 df, p < 0.001): cases = 45.5% male (n = 141), and controls = 62.2% male (n = 110). Genotype and allele frequencies of the seven SNPs investigated are given in Tables 2 and 3. The distribution of genotypes for all the SNPs was consistent (p > 0.05) with Hardy–Weinberg equilibrium, with the exception of the E2-3 SNP (empirical p = 0.01), where the proportion of heterozygotes was slightly less than expected.
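To make the notion of a heterozygote deficit concrete, the following sketch compares observed genotype counts with Hardy–Weinberg expectations using a simple chi-square statistic. This is a simpler substitute for the exact Markov-chain test used in the study, and the genotype counts are hypothetical rather than taken from Tables 2 and 3.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for departure from Hardy-Weinberg proportions.

    Returns the statistic and the expected counts (AA, AB, BB).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return stat, expected


# Hypothetical counts with fewer heterozygotes than expected (40 vs 50).
stat, expected = hwe_chi_square(30, 40, 30)
print(stat, expected)  # 4.0 (25.0, 50.0, 25.0)
```

A statistic of this kind is referred to a chi-square distribution with 1 df for a biallelic locus; an exact or permutation-based test, as used in the study, is generally preferable when some genotype classes are rare.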
The SNPs within each gene were in close linkage disequilibrium; however, disequilibrium did not extend across the two genes. Table 4 shows the disequilibrium coefficients, p values, and the number of genotypes available for calculations.
Single SNP association analysis. Bivariate analysis indicated that, within the ESE-2 gene, the E2-1 (χ² = 2.58, 2 df, p = 0.28), E2-2 (χ² = 0.48, 2 df, p = 0.79), and E2-3 (χ² = 1.08, 2 df, p = 0.58) SNPs were not associated with case–control status. Similarly, within the ESE-3 gene, the E3-1 (χ² = 5.50, 2 df, p = 0.06), E3-2 (χ² = 1.32, 2 df, p = 0.52), E3-3 (χ² = 1.12, 2 df, p = 0.58), and E3-4 (χ² = 0.17, 2 df, p = 0.92) SNPs were not associated with case–control status. Multivariate modeling including age and sex generally confirmed the lack of association of the ESE-2 and ESE-3 SNPs with asthma. However, the multivariate modeling did suggest a significant association between the E2-1 SNP and asthma (dominant model: odds ratio = 1.84, 95% confidence interval = 1.10–3.07, p = 0.02). Bivariate analysis within the case group did not suggest any significant associations between total serum IgE levels and E3-1 (F2,140 = 0.18, p = 0.83), E3-2 (F2,133 = 0.45, p = 0.64), E3-3 (F2,131 = 0.24, p = 0.79), E3-4 (F2,138 = 1.75, p = 0.18), E2-1 (F2,135 = 1.89, p = 0.16), E2-2 (F2,137 = 1.95, p = 0.15), or E2-3 (F2,135 = 0.51, p = 0.60). Similarly, bivariate analysis did not suggest any significant association between FEV1% predicted and E3-1 (F2,308 = 2.00, p = 0.12), E3-2 (F2,296 = 0.03, p = 0.97), E3-3 (F2,292 = 0.00, p = 1.00), E3-4 (F2,305 = 0.04, p = 0.96), E2-1 (F2,301 = 0.27, p = 0.76), E2-2 (F2,301 = 1.11, p = 0.33), or E2-3 (F2,294 = 1.34, p = 0.26). Multivariate analysis confirmed the lack of significant associations (data not shown).
Haplotype association analysis. The imputed haplotype frequencies in the cases and controls are given in Table 5. An exact test of population differentiation did not suggest any significant difference in imputed haplotype frequency between cases and controls (exact p = 0.47 ± 0.11). Imputed haplotype frequencies were also estimated within each of the two genes; these frequencies also did not significantly differ between cases and controls (data not shown).
Within the cases, comparison of haplotype frequencies between the high and low total serum IgE groups or the high and low FEV1% predicted groups did not indicate any significant difference in haplotype frequencies (data not shown).
Our study was designed to discover common genetic variants in ESE-2 and ESE-3 and then to use these variants to conduct a large case–control association study to identify asthma susceptibility loci in this genomic region. We found a total of seven novel SNPs, three in ESE-2 and four in ESE-3. Two SNPs are located in coding regions but are synonymous, and five SNPs are located in noncoding regions. These SNPs are common in our population, with minor allele frequencies ranging from 9.4 to 47.4%. All but one SNP was in Hardy–Weinberg equilibrium, a result not surprising, given the multiple comparisons necessitated by studying seven SNPs. We found extensive linkage disequilibrium among the SNPs in each gene ( 40 kb in length); however, linkage disequilibrium did not extend across both genes, which are separated by approximately 107 kb of DNA. There are six haplotypes with imputed frequencies ranging from 5.6 to 19.5% in the control subjects. An analysis of individual SNPs revealed a weak association between E2-1 and the diagnosis of asthma; there was no significant association with IgE level or FEV1. No significant associations were found between imputed haplotype frequencies and asthma diagnosis, total IgE level, or FEV1% predicted. A post hoc power calculation suggested that, assuming an α value of 0.05 under a dominant model, our study had 81.8% power to detect a true odds ratio of 1.89 between cases and controls for the E2-1 polymorphism. However, appropriate correction for multiple comparisons would make the observed association between E2-1 and asthma not statistically significant, i.e., likely to have occurred by chance alone. Thus, our study suggests overall that ESE-2 and ESE-3 are unlikely to contain a major locus modulating asthma risk in our outbred, European–American study population.
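The effect of correcting for multiple comparisons can be seen in a short worked example. The text does not state which correction is meant; the sketch below assumes a Bonferroni adjustment and assumes that the number of tests equals the seven SNPs, both of which are illustrative choices rather than the authors' stated method.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Nominal p value for E2-1 (0.02) and an assumed seven tests, one per SNP.
adjusted = bonferroni_adjust(0.02, 7)
print(adjusted)         # about 0.14
print(adjusted < 0.05)  # False
```

With an adjusted value of about 0.14, the E2-1 association no longer meets the 5% significance level; counting the two genetic models or the additional outcomes as separate tests would only raise the adjusted value further.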
This study fulfills the major criteria for an association study (22, 23) and has the following five strengths: (1) ESE-2 and ESE-3 are good candidate genes for asthma on the basis of their biology. (2) These genes are located in a genomic region linked to the diagnosis of asthma in genome-wide scans involving the inbred population of Tristan da Cunha (6, 7, 10). (3) This is a large association study involving almost 500 individuals genotyped at seven loci (+ 49 for stratification analysis). (4) Cases with asthma and control subjects without asthma have been tested for the potential confounding factor of population stratification and have been determined to be well matched in this regard. (5) Linkage disequilibrium estimations indicate strong linkage disequilibrium within each gene, and a multilocus methodology was used to investigate potential associations with haplotypes. ESE-2 and ESE-3 may play a role in lung development and the pathogenesis of airway disease and are excellent candidate genes for an asthma association study. First, both genes are limited in their tissue expression profile, yet are highly expressed in epithelial cells, such as those in the airway mucous glands and the bronchial columnar lining. Second, both genes are members of a recently described subfamily of ETS transcription factors and may function as transcriptional activators or repressors of genes expressed in epithelial cells. It has been hypothesized that they play a role in tubulogenesis and branching morphogenesis in glandular organs and in oncogenesis of epithelium-derived tumors (4). Finally, ESE-3 can be induced by inflammatory cytokines such as those found in the airways of patients with asthma and may be expressed in airway smooth muscle cells as well as in epithelial cells under these conditions (Silverman and coworkers, unpublished data). 
Little is known about the mechanisms regulating ESE-2 and ESE-3 expression, and it is conceivable that they are affected by the SNPs described herein. For example, SNP E3-3 is located in intron-7 and alters a consensus branch site that could play a role in RNA splicing. Nevertheless, we acknowledge that no SNPs altered amino acid sequence, and any functional implications are speculative. Genetic studies have also implicated ESE-2 and ESE-3 as candidate genes for asthma. Genome-wide linkage analysis was performed in the Tristan da Cunha population, an inbred population with a 30% prevalence of asthma due to a strong founder effect and geographic isolation, and strong linkage with the microsatellite marker D11S07 on chromosome 11p was found (6, 10). With additional markers, more families with affected members, and the use of transmission disequilibrium testing, it was possible to narrow this region to a physical map containing 300 to 500 kb (10). ESE-2 and ESE-3 are located in this genomic region; however, to our knowledge, no polymorphisms have been found in this region and successfully associated with asthma in this or any other population. Because single SNPs are potentially less informative than are multilocus haplotypes, we imputed haplotypes and used them in our analysis. Nevertheless, no association was detected between haplotype and diagnosis of asthma, total serum IgE level, or FEV1% predicted. We used the technique of Pritchard and Rosenberg (21) to detect the potentially confounding factor of population stratification in our outbred cases and controls. To our knowledge, this is the first asthma genetic association study to use this technique (24, 25). The original simulation study of Pritchard and Rosenberg used microsatellite markers and recommended typing about 15 to 20 unlinked microsatellite markers to detect population stratification, based on their simulations. 
However, microsatellite markers provide more power to detect stratification than do biallelic markers or SNPs. The optimal number of SNPs for detection of stratification is unknown; however, it has been suggested that approximately 20 provide sufficient power (p = 0.05) to detect important stratification (21). Thus, the 49 unlinked loci for which we have data should have provided sufficient power to detect stratification between our cases and controls.
The main limitation to this study involved the large region (
In conclusion, this case–control association study suggests that ESE-2 and ESE-3 do not contain a major susceptibility locus for asthma in our population. Replication in different populations will be important, especially in light of previous reports involving founder populations with a high prevalence of asthma. Further functional characterization of these genes is necessary to clarify the significance of ESE-2 and ESE-3 in lung biology and their potential roles in the pathogenesis of asthma.
The authors thank Drs. Matthew Freedman, David Altshuler, Antonio Tugores, and Deborah Markowitz for sharing data and their insightful comments.
Supported by grants NIH UO1 HL65899 and RO1 HL70573.
The views, opinions, and/or findings contained in this publication are those of the authors and should not be construed as an official Department of the Army position, policy, or decision unless so designated by other documentation. Citations of commercial organizations and trade names in this report do not constitute an official Department of the Army endorsement or approval of the products or services of these organizations.
This article has an online data supplement, which is accessible from this issue's table of contents online at www.atsjournals.org.
Received in original form January 28, 2002; accepted in final form June 27, 2002