GWAS Identifies Risk Locus for Erectile Dysfunction and Implicates Hypothalamic Neurobiology and Diabetes in Etiology

Erectile dysfunction (ED) is a common condition affecting more than 20% of men over 60 years, yet little is known about its genetic architecture. We performed a genome-wide association study of ED in 6,175 case subjects among 223,805 European men and identified one locus at 6q16.3 (lead variant rs57989773, OR 1.20 per C-allele; p = 5.71 × 10^−14), located between MCHR2 and SIM1. In silico analysis suggests that SIM1 confers ED risk through hypothalamic dysregulation. Mendelian randomization provides evidence that genetic risk of type 2 diabetes mellitus is a cause of ED (OR 1.11 per 1-log unit higher risk of type 2 diabetes). These findings provide insights into the biological underpinnings and the causes of ED and may help prioritize the development of future therapies for this common disorder.

Main Text

Erectile dysfunction (ED) is the inability to develop or maintain a penile erection adequate for sexual intercourse.

ED has an age-dependent prevalence, with 20%–40% of men aged 60–69 years affected.

The genetic architecture of ED remains poorly understood, owing in part to a paucity of well-powered genetic association studies. Discovery of such genetic associations can be valuable for elucidating the etiology of ED and can provide genetic support for potential new therapies.

We conducted a genome-wide association study (GWAS) in the population-based UK Biobank (UKBB) and the Estonian Genome Center of the University of Tartu (EGCUT) cohorts and hospital-recruited Partners HealthCare Biobank (PHB) cohort. Subjects in UKBB were of self-reported white ethnicity, with subjects in EGCUT and PHB of European ancestry, as per principal components analyses (Supplemental Material and Methods).
ED was defined as self-reported or physician-reported ED using ICD-10 codes N48.4 and F52.2, or use of oral ED medication (sildenafil/Viagra, tadalafil/Cialis, or vardenafil/Levitra), or a history of surgical intervention for ED (using OPCS-4 codes L97.1 and N32.6) (Supplemental Material and Methods). The prevalence of ED in the cohorts was 1.53% (3,050/199,352) in UKBB, 7.04% (1,182/16,787) in EGCUT, and 25.35% (1,943/7,666) in PHB (Table S1). Demographic characteristics of the subjects in each cohort are shown in Table S2. The reasons for the different prevalence rates in the three cohorts may include a higher median cohort age for men in PHB (65 years, compared to 59 years in UKBB and 42 years in EGCUT; Table S2), “healthy volunteer” selection bias in UKBB, a lack of primary care data availability in UKBB, and intercultural differences, including “social desirability” bias.

Importantly, we note that the assessment of exposure-outcome relationships remains valid even though the prevalence in these cohorts is unlikely to be representative of the general population.
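The cohort prevalences and pooled sample size quoted above can be checked with a few lines of arithmetic. The sketch below uses only the case and total counts reported in the text; the dictionary layout and the function name are illustrative choices, not part of the study's pipeline.

```python
# Case and total counts per cohort, as reported in the text (Table S1).
cohorts = {
    "UKBB": (3050, 199352),
    "EGCUT": (1182, 16787),
    "PHB": (1943, 7666),
}


def prevalence_percent(cases, total):
    """Return prevalence as a percentage of the cohort."""
    return 100.0 * cases / total


for name, (cases, total) in cohorts.items():
    print(f"{name}: {prevalence_percent(cases, total):.2f}%")

# Pooled counts across the three cohorts.
total_cases = sum(cases for cases, _ in cohorts.values())
total_men = sum(total for _, total in cohorts.values())
print(total_cases, total_men)
```

The pooled counts agree with the 6,175 case subjects among 223,805 men given in the abstract.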

GWAS in UKBB revealed a single genome-wide significant (p < 5 × 10^−8) locus at 6q16.3 (lead variant rs57989773, effect allele frequency [EAF] in UKBB for the C-allele = 0.24; OR 1.23; p = 3.0 × 10^−11). Meta-analysis with estimates from PHB (OR 1.20; p = 9.84 × 10^−5) and EGCUT (OR 1.08; p = 0.16) yielded a pooled OR of 1.20 (p = 5.71 × 10^−14; heterogeneity p value = 0.17; Figures 1A–1C). Meta-analysis of all variants yielded no further genome-wide significant loci. Meta-analysis of our results with previously suggested ED-associated variants also did not result in any further significant loci (Supplemental Material and Methods; Table S3), nor did X chromosome analysis in UKBB.
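To illustrate how the three cohort estimates combine, the following is a minimal sketch of an inverse-variance weighted fixed-effect meta-analysis. It assumes the fixed-effect model (the text does not name the exact method or software) and backs out each standard error from the rounded OR and p value quoted above, so the result only approximates the reported pooled estimate. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_or_and_p(odds_ratio, p_two_sided):
    """Back out the standard error of log(OR) from a two-sided p value."""
    z = abs(NormalDist().inv_cdf(p_two_sided / 2.0))
    return abs(math.log(odds_ratio)) / z


def fixed_effect_meta(estimates):
    """Inverse-variance weighted fixed-effect meta-analysis.

    estimates: list of (odds_ratio, two_sided_p) tuples, one per study.
    Returns (pooled_or, pooled_p, cochran_q).
    """
    betas = [math.log(o) for o, _ in estimates]
    ses = [se_from_or_and_p(o, p) for o, p in estimates]
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    z = pooled_beta / pooled_se
    # Two-sided normal p value; erfc stays accurate for very small tails.
    pooled_p = math.erfc(abs(z) / math.sqrt(2.0))
    # Cochran's Q statistic for between-study heterogeneity.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    return math.exp(pooled_beta), pooled_p, q


# (OR, p) for rs57989773 in UKBB, PHB, and EGCUT, as quoted in the text.
studies = [(1.23, 3.0e-11), (1.20, 9.84e-5), (1.08, 0.16)]
pooled_or, pooled_p, q = fixed_effect_meta(studies)

# With three studies Q has 2 degrees of freedom, for which the
# chi-square upper tail has the closed form exp(-Q / 2).
heterogeneity_p = math.exp(-q / 2.0)

print(f"pooled OR = {pooled_or:.2f}, p = {pooled_p:.2e}")
print(f"Q = {q:.2f}, heterogeneity p = {heterogeneity_p:.2f}")
```

Because the inputs are rounded, small differences from the published pooled p value and heterogeneity p value are expected.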

Figure 1. 6q16.3 (Lead Variant rs57989773) Is an Erectile Dysfunction-Associated Locus and Exhibits Pleiotropic Phenotypic Effects

The association of rs57989773 was consistent across clinically defined and therapy-defined ED, as well as across different ED drug classes (Figures 1C and S1). No further genome-wide significant loci were identified for ED when limited to clinically defined or therapy-defined case subjects (2,032 and 4,142 case subjects, respectively).
A PheWAS of 105 predefined traits (Table S4) using the lead ED SNP rs57989773 found associations with 12 phenotypes at a p value < 5 × 10^−4 (surpassing the Bonferroni-corrected threshold of 0.05/105), including adiposity (nine traits), adult height, and sleep-related traits. Sex-stratified analyses revealed sexual dimorphism for waist-hip ratio (WHR; unadjusted and adjusted for body mass index) and systolic and diastolic blood pressure (Figure 1D; Table S5).
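The Bonferroni thresholds used in this report follow from dividing the family-wise alpha by the number of tests. A small sketch, with a hypothetical helper name, reproduces the three thresholds quoted in the text (105 PheWAS traits, 9 MR exposures, and 4,670 variants near PDE5A):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


phewas_threshold = bonferroni_threshold(0.05, 105)
mr_threshold = bonferroni_threshold(0.05, 9)
pde5a_threshold = bonferroni_threshold(0.05, 4670)

print(f"PheWAS: {phewas_threshold:.2e}")
print(f"MR:     {mr_threshold:.4f}")
print(f"PDE5A:  {pde5a_threshold:.2e}")
```

The exact PheWAS threshold is about 4.76 × 10^−4, which is close to the 5 × 10^−4 cutoff quoted for reporting associations.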
The lead variant at the 6q16.3 locus, rs57989773, lies in the intergenic region between MCHR2 and SIM1, with MCHR2 being the closest gene (distances to transcription start sites of 187 kb for MCHR2 and 284 kb for SIM1). Conditional and joint analysis (Supplemental Material and Methods) revealed no secondary, independent signals in the locus. Previous work has implicated the MCHR2-SIM1 locus in sex-specific associations with age at voice-breaking and menarche. The puberty timing-associated SNP in the MCHR2-SIM1 region (rs9321659; ∼500 kb from rs57989773) was not in LD with our lead variant (r² = 0.003, D′ = 0.095) and was not associated with ED (p = 0.32) in our meta-analysis, suggesting that the ED locus represents an independent signal.
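For readers less familiar with the two LD measures quoted above, the sketch below computes r² and D′ from haplotype and allele frequencies using the standard definitions. The frequencies in the example are hypothetical and chosen only to illustrate a pair of variants in very weak LD; they are not the study's data.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2, D_prime) for two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    # D is scaled by its maximum possible magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, r2, d_prime


# Hypothetical frequencies: two variants that are nearly independent.
d, r2, d_prime = ld_stats(p_ab=0.03, p_a=0.24, p_b=0.10)
print(f"D = {d:.4f}, r2 = {r2:.4f}, D' = {d_prime:.3f}")
```

Values of r² near zero, as reported for rs9321659 and rs57989773, indicate that genotypes at one variant carry almost no information about the other.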

To identify the tissue and cell types in which the causal variant(s) for ED may function, we examined chromatin states across 127 cell types for the lead variant rs57989773 and its proxies (r² > 0.8, determined using HaploReg v.4.1) (Supplemental Material and Methods). Enhancer marks in several tissues, including embryonic stem cells, mesenchymal stem cells, and endothelial cells, indicated that the ED-associated interval lies within a regulatory locus (Figure 2A; Table S6).

Figure 2. Functional Analysis of 6q16.3 Implicates SIM1 in ED Pathogenesis

To predict putative targets and causal transcripts, we assessed domains of long-range three-dimensional chromatin interactions surrounding the ED-associated interval (Figure 2B). Chromosome conformation capture (Hi-C) in human embryonic stem cells showed that MCHR2 and SIM1 were in the same topologically associated domain (TAD) as the ED-associated variants, with high contact probabilities (referring to the relative number of times that reads in two 40-kb bins were sequenced together) between the ED-associated interval and SIM1 (Figures 2B and S2). This observation was further confirmed in endothelial precursor cells, where Capture Hi-C revealed strong connections between the MCHR2-SIM1 intergenic region and the SIM1 promoter (Figure 2C), pointing toward SIM1 as a likely causal gene at this locus.

We next used the VISTA enhancer browser to examine in vivo expression data for non-coding elements within the MCHR2-SIM1 locus. A regulatory human element (hs576), located 30 kb downstream of the ED-associated interval, seems to drive in vivo enhancer activity specifically in the midbrain (mesencephalon) and cranial nerve in mouse embryos (Figure 2D). This long-range enhancer close to ED-associated variants recapitulated aspects of SIM1 expression (Figure 2D), further suggesting that the ED-associated interval belongs to the regulatory landscape of SIM1. Taken together, these data suggest that the MCHR2-SIM1 intergenic region harbors a neuronal enhancer and that SIM1 is functionally connected to the ED-associated region.

Single-minded homolog 1 (SIM1) encodes a transcription factor that is highly expressed in hypothalamic neurons. Rare variants in SIM1 have been linked to a phenotype of severe obesity and autonomic dysfunction, including lower blood pressure. A summary of the variant-phenotype associations at the 6q16 locus in humans and rodent models is shown in Table S7. Post hoc analysis of association of rs57989773 with autonomic traits showed nominal association with syncope, orthostatic hypotension, and urinary incontinence (Figure S3). The effects on blood pressure and adiposity seen in individuals with rare coding variants in SIM1 are recapitulated in individuals harboring the common ED-risk variants at the 6q16.3 locus (Figure 1D), suggesting that SIM1 is the causal gene at the ED-risk locus. SIM1-expressing neurons also play an important role in the central regulation of male sexual behavior, as mice that lack the melanocortin receptor 4 (encoded by MC4R) specifically in SIM1-expressing neurons show impaired sexual performance on mounting, intromission, and ejaculation. Thus, hypothalamic dysregulation of SIM1 could present a potential mechanism for the effect of the MCHR2-SIM1 locus on ED.

An alternative functional mechanism may be explained by proximity of the lead variant (rs57989773) to an arginase 2 processed pseudogene (LOC100129854), a long non-coding RNA (Figure 2A). RPISeq predicts that the pseudogene transcript would interact with the ARG2 protein, with probabilities of 0.70–0.77. Arginase 2 is involved in nitric oxide production and has a previously established role in erectile dysfunction. GTEx expression data demonstrated highest mean expression in adipose tissue, with detectable levels in testis, fibroblasts, and brain. Expression was relatively low in all tissues, however, and there was no evidence that any SNPs associated with the top ED signal were eQTLs for the ARG2 pseudogene or ARG2 itself.

As a complementary approach, we also used the Data-driven Expression Prioritized Integration for Complex Traits and GWAS Analysis of Regulatory or Functional Information Enrichment with LD correction (DEPICT and GARFIELD, respectively; Supplemental Material and Methods) tools to identify gene-set, tissue-type, and functional enrichments. In DEPICT, the top two prioritized gene-sets were “regulation of cellular component size” and “regulation of protein polymerization,” whereas the top two associated tissue/cell types were “cartilage” and “mesenchymal stem cells.” None of the DEPICT enrichments reached an FDR threshold of 5% (Tables S8–S10). GARFIELD analysis, which assesses enrichment of GWAS signals in regulatory or functional regions in different cell types, also did not yield any statistically significant enrichments, limiting the utility of these approaches in this case.

ED is recognized to be observationally associated with various cardiometabolic traits and lifestyle factors, including type 2 diabetes mellitus (T2D), hypertension, and smoking. To further evaluate these associations, we first conducted LD score regression to evaluate the genetic correlation of ED with a range of traits. LD score regression identified ED to share the greatest genetic correlation with T2D, limb fat mass, and whole-body fat mass (FDR-adjusted p values < 0.05; Table S11).

Next, we performed Mendelian randomization (MR) analyses to evaluate the potential causal role of nine pre-defined cardiometabolic traits on ED risk (selected based on previous observational evidence linking such traits to ED risk), i.e., T2D, insulin resistance, systolic blood pressure, LDL cholesterol, smoking heaviness, alcohol consumption, body mass index, coronary heart disease, and educational attainment (Tables S12–S15). MR identified genetic risk of T2D to be causally implicated in ED: each 1-log unit higher genetic risk of T2D increased the risk of ED with an OR of 1.11 (95% CI 1.05–1.17, p = 3.5 × 10^−4, which met our a priori Bonferroni-corrected significance threshold of 0.0056 [0.05/9]), with insulin resistance likely representing a mediating pathway (OR 1.36 per 1 standard deviation genetically elevated insulin resistance, 95% CI 1.01–1.84, p = 0.042). Sensitivity analyses were conducted to evaluate the robustness of the T2D-ED estimate (Figure S5, Table S13), including weighted median analyses (OR 1.12, 95% CI 1.02–1.23, p = 0.0230), leave-one-out analysis for all variants (which indicated that no single SNP in the instrument unduly influenced the overall value derived from the summary IVW estimate), and a funnel plot (showing a symmetrical distribution of single-SNP IV estimates around the summary IVW causal estimate). The MR-Egger regression (intercept p = 0.35) provided no evidence to support the presence of directional pleiotropy as a potential source of confounding.
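The summary-data MR estimators named above can be sketched compactly. The code below is a minimal illustration of the inverse-variance weighted (IVW) estimate and the MR-Egger intercept and slope, written as weighted regressions of SNP-outcome effects on SNP-exposure effects. It assumes first-order weights (1 / SE² of the outcome effect) and uses hypothetical summary statistics; it is not the study's instrument or analysis code, and all names are illustrative.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """IVW estimate: weighted regression through the origin.

    Returns (causal_beta, standard_error) under a fixed-effect model.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    num = sum(w * bx * by
              for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return num / den, math.sqrt(1.0 / den)


def egger_estimate(beta_exposure, beta_outcome, se_outcome):
    """MR-Egger: the same weighted regression with an intercept.

    Returns (intercept, slope). A non-zero intercept suggests
    directional pleiotropy.
    """
    # Orient every variant so that its exposure effect is positive.
    xs, ys = [], []
    for bx, by in zip(beta_exposure, beta_outcome):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)
    weights = [1.0 / se ** 2 for se in se_outcome]
    w_sum = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / w_sum
    mean_y = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical per-SNP effects: log-odds of exposure and of outcome.
bx = [0.10, 0.15, 0.20, 0.08, 0.12]
by = [0.1 * b for b in bx]  # outcome effect proportional to exposure
se = [0.01] * len(bx)

ivw_beta, ivw_se = ivw_estimate(bx, by, se)
intercept, slope = egger_estimate(bx, by, se)
print(f"IVW OR per unit exposure = {math.exp(ivw_beta):.3f}")
print(f"Egger intercept = {intercept:.4f}, slope = {slope:.3f}")
```

In this toy input every SNP's outcome effect is exactly proportional to its exposure effect, so the Egger intercept is zero and both estimators recover the same slope; real data scatter around the line, which is what the funnel plot and leave-one-out analyses examine.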

We also identified a potential causal effect of systolic blood pressure (SBP), with higher SBP being linked to higher risk of ED (MR-Egger OR 2.34 per 1 standard deviation higher SBP, 95% CI 1.26–4.36, p = 0.007, with MR-Egger intercept [p = 0.007] suggesting presence of directional pleiotropy). LDL cholesterol (LDL-C) showed minimal evidence of a causal effect (OR 1.07 per 1 standard deviation higher LDL-C, 95% CI 0.98–1.17, p = 0.113), and there was limited evidence to support a role for smoking heaviness or alcohol consumption (Table S15). Genetic risk of coronary heart disease (CHD) showed weak effects on risk of ED, suggesting that pathways leading to CHD may be implicated in ED (OR 1.08, 95% CI 1.00–1.17, p = 0.061). Further, we identified no causal effects of BMI (using a polygenic score or a single SNP in FTO) or education on risk of ED.
Genetic variants may inform drug target validation by serving as a proxy for drug target modulation. ED is most commonly treated using phosphodiesterase 5 (PDE5) inhibitors such as sildenafil. To identify potential phenotypic effects of PDE5 inhibition (e.g., to predict side effects or opportunities for repurposing), we looked for variants in or around PDE5A, encoding PDE5, which showed association with the ED phenotype. Of all 4,670 variants within a 1 Mb window of PDE5A (chromosome 4:119,915,550–121,050,146 as per GRCh37/hg19), the variant with the strongest association was rs115571325, 26 kb upstream of PDE5A (meta-analysis OR 1.25, nominal p value = 8.46 × 10^−4; Bonferroni-corrected threshold [0.05/4,670] = 1.07 × 10^−5; Figure S6). Given the weak association with ED, we did not evaluate this variant in further detail.

We have gained insight into ED, a common condition with substantial morbidity, by conducting a large-scale GWAS and performing several follow-up analyses. By aggregating data from three cohorts, including 6,175 ED-affected case subjects of European ancestry, we identified a locus associated with ED, with several lines of evidence suggesting that SIM1, which is highly expressed in the hypothalamus, is the causal gene at this locus. Our findings provide human genetic evidence in support of the key role of the hypothalamus in regulating male sexual function.

Mendelian randomization implicated risk of T2D as a causal risk factor for ED with suggestive evidence for insulin resistance and systolic blood pressure, corroborating well-recognized observational associations with these cardiometabolic traits.

Further research is needed to explore the extent to which drugs used in the treatment of T2D might be repurposed for the treatment of ED. Lack of evidence for a causal effect of BMI on ED risk in MR analysis (using multiple SNPs across the genome) suggests that the association of the lead SNP (rs57989773) with BMI arises from pleiotropy and that the association of this variant with ED risk is independent of its association with adiposity.

In conclusion, in a large-scale GWAS of more than 6,000 ED-affected case subjects, we provide insights into the biological underpinnings of ED and elucidate causal effects of various risk factors, including pathways involved in the etiology of T2D. Further large-scale GWASs of ED are needed in order to provide additional clarity on its genetic architecture and etiology and to shed light on potential new therapies.