Modeling the minor allele frequency and linkage disequilibrium joint architectures of human diseases and complex traits
MODELING THE MINOR ALLELE FREQUENCY AND LINKAGE DISEQUILIBRIUM
JOINT ARCHITECTURES
OF HUMAN DISEASES AND COMPLEX TRAITS
by
Changqing Su
A Thesis Presented to the
FACULTY OF THE USC KECK SCHOOL OF MEDICINE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
BIOSTATISTICS
December 2021
Copyright 2021 Changqing Su
Table of Contents
List of Figures
Abstract
Chapter 1: Introduction
Chapter 2: Method
1. History of heritability models
2. Stratified LD score regression
3. The baseline-MAFxLD model
4. The baseline-MAFxSelection model
5. The baseline-LD-alpha model
Chapter 3: Results
1. The baseline-LD model is robust to different effect of LD scores in MAF bins.
2. The baseline-LD does not capture the different effect of selection coefficients in different MAF bins.
3. Modeling functional annotation MAF-dependent effects highlights that the baseline-LD model slightly underestimates functional enrichment.
Chapter 4: Conclusion
References
List of Figures
Figure 1: Heritability partitioning results of the baseline-MAFxLD model.
Figure 2: Functional enrichment and heritability estimates estimated under the baseline-MAFxLD models.
Figure 3: Heritability partitioning results of the baseline-MAFxSelection model.
Figure 4: Functional enrichment and heritability estimates estimated under the baseline-MAFxSelection models.
Figure 5: Functional enrichment estimated under the baseline-LD-alpha models.
Abstract
Recent works have highlighted strong minor allele frequency (MAF) and linkage disequilibrium
(LD)-dependent genetic architectures of human diseases and complex traits (where SNPs that are
more common and that have a lower level of LD explain more heritability), and how these
architectures are linked to the action of negative selection. Accounting for such architectures in
polygenic analyses of human diseases is crucial to provide unbiased estimates of heritability and
functional enrichment. The baseline-LD model has been highlighted as the state-of-the-art model
to account for such architectures. It accounts for these architectures by modeling MAF and LD
effects independently. However, it is still currently unknown if this model fully captures the effects
of MAF and LD, and if MAF and LD have an interactive effect, due to their joint contribution to
negative selection. Here we explored different MAF and LD architectures by adding extra
annotations to the baseline-LD model, and by estimating the parameters of these models by using
stratified LD score regression on 63 independent GWAS. We observed that more realistic
modeling of negative selection could improve model fit, but that the baseline-LD model estimates
of heritability and heritability enrichment are robust to this model misspecification.
Chapter 1: Introduction
Over the past decade, ever-larger genome-wide association studies (GWAS) have changed
our understanding of the architectures of human complex diseases and traits. They notably
highlighted that these architectures are highly polygenic and dominated by thousands of common
variants with weak effects (Visscher et al. 2017). Statistical methods relying on this polygenic
architecture have yielded rich insights into the genetic architectures of human complex traits. They
notably allowed us to estimate heritability of a trait (Yang et al. 2010) and to partition this
polygenic signal (i.e. heritability) across functional genomic annotations (Finucane et al. 2015;
Gusev et al. 2014).
Recent works have highlighted strong minor allele frequency (MAF) and linkage
disequilibrium (LD)-dependent genetic architectures of human diseases, where SNPs that are more
common and that have a lower level of LD explain more heritability (Gazal et al. 2017; Speed
et al. 2012, 2017; Speed and Balding 2019; Yang et al. 2015). These architectures have been linked
to the action of negative selection on human complex traits (Gazal et al. 2017).
Accounting for MAF and LD-dependent genetic architectures in polygenic analyses of
human diseases is crucial to provide unbiased estimates of heritability and functional enrichment,
and jointly accounting for these architectures remains a matter of considerable debate (Gazal et al.
2017, 2019; Speed et al. 2012, 2017; Speed and Balding 2019). The state-of-the-art model to
account for such architectures is called the baseline-LD model (Gazal et al. 2019; Speed, Holmes,
and Balding 2020). It accounts for functional, MAF and LD-dependent genetic architectures by
partitioning the genome into functional regions, MAF bins, and by considering several annotations
related to LD, such as the age of a SNP (Gazal et al. 2017). However, other models have considered
an interactive effect of MAF and LD, i.e. where the strength of the effect of LD on a SNP depends on its
MAF, or MAF-dependent effects within functional annotations (Speed et al. 2017; Speed and
Balding 2019; Speed, Holmes, and Balding 2020). Incorporating more precise MAF and LD-
dependent genetic architectures into the baseline-LD model could improve estimates of heritability
and heritability enrichment.
Here we will explore different MAF and LD architectures by adding extra annotations to the
baseline-LD model, and by estimating the parameters of these models by using stratified LD score
regression on 63 independent GWAS. We will compare their functional enrichment and heritability
results to the ones obtained by the baseline-LD model. Our results suggest that more realistic modeling
of negative selection could improve model fit, but the baseline-LD model estimates of heritability and
heritability enrichment are robust to this model misspecification.
Chapter 2: Method
1. History of heritability models
Suppose we have a sample of $N$ individuals and a vector $y = (y_1, \ldots, y_N)$ of quantitative phenotypes, standardized to mean 0 and variance 1, and the model
$$y = X\beta + \varepsilon \qquad (1)$$
where $X$ is an $N \times M$ matrix of standardized genotypes, $\beta = (\beta_1, \ldots, \beta_M)$ is the vector of per-normalized-genotype effect sizes, and $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N)$ is a mean-0 vector of residuals with variance $\sigma_e^2$ (Gazal et al. 2017).
We will define different heritability models according to $\mathrm{Var}(\beta_j)$, the variance of the effect size $\beta_j$ of the standardized genotype of each SNP $j$. We note that $\mathrm{Var}(\beta_j)$ is equal to the variance of per-allele causal effect sizes times $2p_j(1 - p_j)$, where $p_j$ is the minor allele frequency (MAF) of SNP $j$, and can also be interpreted as the expected causal per-SNP heritability ($h^2$) of SNP $j$ (Gazal et al. 2019).
The first heritability model considered an infinitesimal model (Visscher, Hill, and Wray 2008) in which standardized effect size variances are constant:
$$\mathrm{Var}(\beta_j) \propto 1 \qquad (2)$$
This model is called the “GCTA model” in Speed et al. (2017).
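Model (1) under the constant-variance assumption of equation (2) can be illustrated with a small simulation. This is only a sketch: the sample sizes, the random seed, the total heritability of 0.5, and the uniform allele frequency range are all assumptions chosen for illustration, not values used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 500, 200          # individuals and SNPs (toy sizes, assumed)
h2 = 0.5                 # assumed total SNP heritability

# Draw allele frequencies and genotypes (0, 1 or 2 copies of the minor allele).
p = rng.uniform(0.05, 0.5, size=M)
G = rng.binomial(2, p, size=(N, M)).astype(float)

# Standardize each SNP column to mean 0 and variance 1, giving X of model (1).
X = (G - G.mean(axis=0)) / G.std(axis=0)

# Equation (2): every standardized effect size has the same variance h2 / M.
beta = rng.normal(0.0, np.sqrt(h2 / M), size=M)
eps = rng.normal(0.0, np.sqrt(1.0 - h2), size=N)

# Model (1): phenotype = genetic component + residual.
y = X @ beta + eps
```

With these choices the phenotype has variance close to 1 in expectation, matching the standardization assumed for $y$.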
Next, models considered MAF-dependent architectures, where $\mathrm{Var}(\beta_j)$ depends on the frequency $p_j$ of SNP $j$ and on an $\alpha$ parameter (Lee et al. 2013; Speed et al. 2012):
$$\mathrm{Var}(\beta_j) \propto \left(p_j(1 - p_j)\right)^{1+\alpha} \qquad (3)$$
Note that $\alpha = -1$ is equivalent to the GCTA model, and that a recent paper estimated $\alpha \approx -0.38$ across 25 human traits (Schoech et al. 2019), consistent with other selection models (Mancuso et al. 2016; Zeng et al. 2018).
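The effect of the $\alpha$ parameter in equation (3) can be checked numerically. The sketch below uses a hypothetical helper name (`maf_dependent_variance`), made-up allele frequencies, and an arbitrary illustrative value $\alpha = -0.5$; the variances are normalized to sum to 1 because equation (3) is only defined up to a constant.

```python
import numpy as np

def maf_dependent_variance(p, alpha):
    """Relative Var(beta_j) under equation (3), normalized to sum to 1."""
    p = np.asarray(p, dtype=float)
    v = (p * (1.0 - p)) ** (1.0 + alpha)
    return v / v.sum()

maf = np.array([0.05, 0.10, 0.25, 0.50])   # made-up frequencies

# alpha = -1: the exponent is 0, so every SNP gets the same variance (GCTA model).
flat = maf_dependent_variance(maf, alpha=-1.0)

# alpha = -0.5 (arbitrary): more common SNPs get larger standardized variance.
tilted = maf_dependent_variance(maf, alpha=-0.5)
```

For any $\alpha > -1$ the standardized variance increases with MAF, which is the sense in which more common SNPs explain more heritability per SNP.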
Speed et al. suggested LD-dependent architectures (Speed et al. 2012, 2017), and introduced an “LDAK model” that models both LD-dependent and MAF-dependent architectures (Speed et al. 2017):
$$\mathrm{Var}(\beta_j) \propto \left(p_j(1 - p_j)\right)^{1+\alpha} w_j \qquad (4)$$
where $w_j$ denotes the LDAK weight of SNP $j$, reflecting smaller expected causal per-SNP heritability for high-LD SNPs.
Gazal et al. introduced a “baseline-LD” model that models LD-dependent, MAF-dependent and functional architectures (Gazal et al. 2017) by considering the additive effect of $D$ functional annotations:
$$\mathrm{Var}(\beta_j) = \sum_{d=1}^{D} \tau_d \, a_d(j) \qquad (5)$$
where the coefficients $\tau_d$ denote the conditional contributions of annotations $a_d$ to expected per-SNP heritability. The $D$ annotations $a_d$ include 6 continuous-valued LD-related annotations, 10 common-SNP MAF bins (starting at MAF ≥ 5%), and overlapping functional annotations (e.g. coding, conserved, regulatory). Here, we used the baseline-LD model v2.2, which contains a total of 97 annotations, including 44 main functional binary annotations.
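Equation (5) is a matrix-vector product between an annotation matrix and the coefficient vector. The toy example below makes this concrete; the three annotations (an all-ones base column, one binary functional annotation, one continuous LD-related annotation) and the $\tau$ values are made up for illustration and are not estimates from this work.

```python
import numpy as np

# Rows are SNPs j, columns are annotations a_d (all values hypothetical).
# Column 0: base annotation (1 for every SNP), column 1: binary functional
# annotation, column 2: continuous LD-related annotation.
A = np.array([
    [1.0, 1.0,  0.2],
    [1.0, 0.0, -0.5],
    [1.0, 1.0,  1.3],
    [1.0, 0.0,  0.0],
])
tau = np.array([1e-4, 3e-4, -2e-5])   # made-up coefficients tau_d

# Equation (5): Var(beta_j) = sum_d tau_d * a_d(j), for all SNPs at once.
var_beta = A @ tau
```

Because the effects are additive, a SNP that belongs to several overlapping annotations accumulates the contribution of each of them.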
2. Stratified LD score regression
Stratified LD score regression (S-LDSC) is a method that applies to any model in which $\beta$ is a mean-0 vector whose variance depends on $D$ continuous-valued annotations $a_1, \ldots, a_D$:
$$\mathrm{Var}(\beta_j) = \sum_{d=1}^{D} \tau_d \, a_d(j) \qquad (5)$$
Under this model,
$$\mathrm{E}[\chi_j^2] = N \sum_{d} \tau_d \, l(j, d) + 1 \qquad (6)$$
where $l(j, d) = \sum_k a_d(k) \, r_{jk}^2$ is the LD score of SNP $j$ with respect to annotation $a_d$ and $r_{jk}$ is the correlation between SNPs $j$ and $k$. Given a vector of $\chi^2$ statistics and LD scores computed from a reference sample (here sequenced European individuals from 1000 Genomes) (Auton et al. 2015), this equation allows us to obtain estimates $\hat{\tau}_d$ of $\tau_d$ (Finucane et al. 2015), and then to estimate the heritability explained by a set of SNPs $J$ as
$$h_g^2(J) = \sum_{j \in J} \sum_{d} a_d(j) \, \tau_d \qquad (7)$$
S-LDSC estimates the enrichment of a functional annotation as the proportion of heritability explained by the common SNPs (MAF ≥ 5%) in the annotation divided by the proportion of common SNPs that lie in the annotation. S-LDSC estimates the heritability ($h^2$) of a trait as the sum of $\mathrm{Var}(\beta_j)$ over SNPs with MAF ≥ 5%.
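The chain from LD scores to enrichment (equations (6) and (7)) can be sketched on toy data. This is a simplified illustration and not the S-LDSC software: the LD matrix is synthetic, the $\chi^2$ statistics are set to their noise-free expectation, and the regression is plain unweighted least squares with the intercept fixed at 1, all of which are assumptions made to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 300, 50_000        # toy number of SNPs and GWAS sample size

# Synthetic squared correlations r2[j, k], decaying with distance (assumption).
idx = np.arange(M)
r2 = 0.5 ** np.abs(idx[:, None] - idx[None, :])

# Two annotations: an all-ones base column and one binary functional annotation.
A = np.column_stack([np.ones(M), (rng.random(M) < 0.2).astype(float)])
tau_true = np.array([1e-4, 4e-4])     # made-up coefficients

# LD scores l(j, d) = sum_k a_d(k) * r2[j, k].
ld_scores = r2 @ A

# Equation (6), used here as noise-free "observed" chi-square statistics.
chi2 = N * (ld_scores @ tau_true) + 1.0

# Estimate tau by regressing (chi2 - 1) / N on the LD scores.
tau_hat, *_ = np.linalg.lstsq(ld_scores, (chi2 - 1.0) / N, rcond=None)

# Equation (7) and the enrichment definition for the functional annotation.
per_snp_h2 = A @ tau_hat
in_annot = A[:, 1] == 1
prop_h2 = per_snp_h2[in_annot].sum() / per_snp_h2.sum()
prop_snps = in_annot.mean()
enrichment = prop_h2 / prop_snps
```

In this toy setting every SNP is treated as common, so the MAF ≥ 5% restriction is not applied; an enrichment above 1 means the annotation explains more heritability than its share of SNPs.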
S-LDSC was applied following the guidelines of Finucane et al. (2015) and was restricted to data sets of European ancestry. Standard errors of $\hat{\tau}_d$ and $h_g^2(J)$ were computed using a block jackknife over SNPs with 200 equally sized blocks of adjacent SNPs (Finucane et al. 2015). S-LDSC was applied to 63 independent diseases and complex traits (Gazal et al. 2021).
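A block jackknife over adjacent SNPs can be sketched as follows. The function name `block_jackknife` is hypothetical, and the sketch uses ordinary least squares on generic data rather than the weighted regression of the S-LDSC software; it only illustrates how leave-one-block-out estimates are turned into standard errors.

```python
import numpy as np

def block_jackknife(x, y, n_blocks=200):
    """Leave-one-block-out jackknife for OLS coefficients of y on x."""
    n = x.shape[0]
    # Adjacent rows are grouped into (nearly) equally sized blocks.
    blocks = np.array_split(np.arange(n), n_blocks)
    full, *_ = np.linalg.lstsq(x, y, rcond=None)
    loo = np.empty((n_blocks, x.shape[1]))
    for b, block in enumerate(blocks):
        keep = np.ones(n, dtype=bool)
        keep[block] = False                      # drop one block
        loo[b], *_ = np.linalg.lstsq(x[keep], y[keep], rcond=None)
    # Jackknife pseudo-values, their mean, and the standard error of the mean.
    pseudo = n_blocks * full - (n_blocks - 1) * loo
    est = pseudo.mean(axis=0)
    se = pseudo.std(axis=0, ddof=1) / np.sqrt(n_blocks)
    return est, se

# Toy usage: intercept 1.0 and slope 0.5 with small noise (made-up data).
rng = np.random.default_rng(2)
x = np.column_stack([np.ones(2000), rng.normal(size=2000)])
y = x @ np.array([1.0, 0.5]) + rng.normal(scale=0.1, size=2000)
est, se = block_jackknife(x, y)
```

Dropping whole blocks of adjacent SNPs, rather than single SNPs, is what lets the jackknife account for correlation between neighboring SNPs.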
For further analyses, we hypothesized that any model including the annotations of the baseline-LD model plus extra annotations provides overall more accurate heritability estimates than the baseline-LD model.
3. The baseline-MAFxLD model
The baseline-LD model considers additive effects of MAF and LD annotations on $\mathrm{Var}(\beta_j)$. However, one can assume that the effect of LD differs between common and less common variants. To test this hypothesis, we created a baseline-MAFxLD model containing all baseline-LD annotations except the 10 MAF bins (97 − 10 = 87), plus 16 annotations stratifying LD scores into 4 LD bins for each of 4 MAF bins (87 + 16 = 103 annotations in total).
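The 16 MAF-by-LD annotations can be pictured as an indicator matrix with one column per (MAF bin, LD bin) pair. The sketch below is an assumption about how such bins could be built: it uses quartiles of MAF and, within each MAF bin, quartiles of the LD score, whereas the text does not specify the bin boundaries. The function name `maf_by_ld_annotations` and the toy data are hypothetical.

```python
import numpy as np

def maf_by_ld_annotations(maf, ld, n_bins=4):
    """Indicator matrix with one column per (MAF bin, LD bin) pair."""
    maf = np.asarray(maf, dtype=float)
    ld = np.asarray(ld, dtype=float)
    inner = np.linspace(0, 1, n_bins + 1)[1:-1]           # 0.25, 0.5, 0.75
    maf_bin = np.digitize(maf, np.quantile(maf, inner))    # values 0..n_bins-1
    ld_bin = np.empty(len(ld), dtype=int)
    for m in range(n_bins):
        sel = maf_bin == m
        # LD quartiles are computed within each MAF bin (assumption).
        ld_bin[sel] = np.digitize(ld[sel], np.quantile(ld[sel], inner))
    out = np.zeros((len(maf), n_bins * n_bins))
    out[np.arange(len(maf)), maf_bin * n_bins + ld_bin] = 1.0
    return out

# Toy data: 1000 SNPs with made-up MAF and LD score values.
rng = np.random.default_rng(3)
maf = rng.uniform(0.05, 0.5, size=1000)
ld = rng.lognormal(mean=3.0, sigma=1.0, size=1000)
ann = maf_by_ld_annotations(maf, ld)
n_total = 97 - 10 + ann.shape[1]    # baseline-LD without MAF bins, plus 16
```

Each SNP falls in exactly one of the 16 columns, so these annotations replace the 10 MAF bins rather than overlap with them.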
4. The baseline-MAFxSelection model
Gazal et al. demonstrated that LD architectures are related to negative selection, as multiple LD annotations tag the deleterious effect of a variant (Gazal et al. 2017). Rather than investigating a simple LD architecture, we focused here on estimates of the selection coefficient ($s$) of each common variant (Gazal et al. unpublished data). We thus introduced a baseline-MAFxSelection model, which contains the baseline-LD annotations without the 10 MAF bins, plus 16 annotations stratifying selection coefficients into 4 selection coefficient bins for each of 4 MAF bins (103 annotations in total).
5. The baseline-LD-alpha model
Recent work highlighted that MAF architectures differ across functional annotations (Gazal et al. 2018): in annotations under stronger negative selection, rare variants have larger effect sizes than common variants. As the strength of selection on each annotation is unknown, we considered a baseline-LD-alpha model, which multiplies the baseline-LD model annotations (equation (5)) by the MAF-dependent term of equation (3) for 5 different values of $\alpha$ (97 × 5 = 485 annotations in total):
$$\mathrm{Var}(\beta_j) = \sum_{\alpha \in \{0, 0.25, 0.5, 0.75, 1\}} \left[ \left(p_j(1 - p_j)\right)^{1+\alpha} \sum_{d=1}^{D} \tau_{d,\alpha} \, a_d(j) \right] \qquad (8)$$
Note that this equation can be rewritten to estimate $\tau_{d,\alpha}$ within the S-LDSC framework as
$$\mathrm{Var}(\beta_j) = \sum_{d=1}^{D} \sum_{\alpha \in \{0, 0.25, 0.5, 0.75, 1\}} \tau_{d,\alpha} \left[ \left(p_j(1 - p_j)\right)^{1+\alpha} a_d(j) \right] \qquad (9)$$
that is, by using 485 annotations of the form $\left(p_j(1 - p_j)\right)^{1+\alpha} a_d(j)$.
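The construction of the 485 annotations in equation (9) amounts to stacking five rescaled copies of the annotation matrix. The sketch below uses a hypothetical function name (`alpha_annotations`) and random toy annotations; only the five $\alpha$ values and the count of 97 annotations come from the text.

```python
import numpy as np

def alpha_annotations(A, maf, alphas=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Build annotations (p_j(1-p_j))^(1+alpha) * a_d(j) for every alpha and d."""
    A = np.asarray(A, dtype=float)
    maf = np.asarray(maf, dtype=float)
    het = maf * (1.0 - maf)
    # One rescaled copy of the annotation matrix per alpha, placed side by side.
    blocks = [A * (het ** (1.0 + a))[:, None] for a in alphas]
    return np.hstack(blocks)

# Toy data: 50 SNPs with 97 made-up annotation values each.
rng = np.random.default_rng(4)
A = rng.random((50, 97))
maf = rng.uniform(0.05, 0.5, size=50)
A_alpha = alpha_annotations(A, maf)
```

Since the resulting model is still linear in the coefficients $\tau_{d,\alpha}$, it can be fitted with the same regression as any other set of continuous-valued annotations.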
Chapter 3: Results
1. The baseline-LD model is robust to different effect of LD scores in MAF bins.
We ran S-LDSC with the baseline-MAFxLD model on 63 independent traits and meta-analyzed the results. We observed that SNPs with low LD explain more heritability within each MAF bin (Figure 1a), and confirmed that, overall, SNPs with high MAF and low LD explain more heritability (Figures 1b and 1c).
Figure 1: Heritability partitioning results of the baseline-MAFxLD model. We report heritability observed under the baseline-
MAFxLD model and expected under the baseline-LD for 16 MAFxLD bins (a), across the 4 LD bins of each MAF bin (b), and
across the 4 MAF bins of each LD bin (c).
We next computed the proportion of heritability expected under the baseline-LD model for each of the 16 MAFxLD bins of the baseline-MAFxLD model (this value was computed from per-SNP heritabilities obtained with the $\tau$ coefficients of the baseline-LD model; standard errors were computed using a block jackknife). Only 2 out of 16 bins had a significantly different proportion of heritability (P < 0.05/16) between the 2 models (MAF bin 1 – LD bin 2, $P = 4.84 \times 10^{-7}$, and MAF bin 2 – LD bin 3, $P = 2.39 \times 10^{-4}$). Overall, the baseline-LD model overestimates the effects of LD in the first MAF bin and underestimates the effects of LD in the last MAF bins.
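The multiple-testing threshold used above is a Bonferroni correction over the 16 bins. The short check below only restates that arithmetic with the two P values reported in the text; the dictionary labels are informal names chosen for this example.

```python
n_bins = 16
threshold = 0.05 / n_bins    # Bonferroni-corrected significance level

# P values reported in the text for the two significant bins.
reported = {
    "MAF bin 1, LD bin 2": 4.84e-7,
    "MAF bin 2, LD bin 3": 2.39e-4,
}
significant = {name: p < threshold for name, p in reported.items()}
```

Both reported P values fall below the corrected level of 0.003125, consistent with the 2 bins flagged as significant.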
Finally, we also investigated the impact of the 16 MAFxLD bins on functional enrichment and heritability estimates. We observed nearly identical functional enrichments when using the baseline-LD and baseline-MAFxLD models (Figure 2a). We also observed very similar heritability estimates, with only one out of 63 traits (Anorexia) showing a significantly different heritability between the 2 models ($h^2$ = 0.24, s.e. = 0.03 with the baseline-LD model; $h^2$ = 0.40, s.e. = 0.4 with the baseline-MAFxLD model; $P = 5.49 \times 10^{-4}$ for difference; Figure 2b).
Figure 2: Functional enrichment and heritability estimates estimated under the baseline-MAFxLD models. We report functional enrichments for 44 main functional binary annotations of the baseline-LD model (a), and heritability estimates ($h^2$) of 63 independent traits with their corresponding confidence interval (b) estimated under the baseline-LD model (x-axis) and the baseline-MAFxLD model (y-axis). Only one out of 63 traits (Anorexia) had a significantly different heritability between the 2 models.
Overall, we observed that the baseline-LD model is robust to an LD architecture with
different effects across MAF bins.
2. The baseline-LD does not capture the different effect of selection coefficients in different
MAF bins.
We next compared results expected from the baseline-LD model to the ones observed with the baseline-MAFxSelection model on 16 bins of MAF and selection coefficients. Unlike with the baseline-MAFxLD model, the heritability expected under the baseline-LD model differed markedly from the heritability observed under the baseline-MAFxSelection model (Figure 3). Indeed, under the baseline-MAFxSelection model, heritability patterns of deleterious bins were non-linear within each MAF bin (Figure 3a). We observed that the baseline-LD model tends to overestimate the heritability of deleterious SNPs (low selection coefficient) in the low-frequency MAF bin, and strongly underestimates the heritability of deleterious SNPs in high-frequency MAF bins (Figures 3a and 3c). For example, in the most common MAF bin, the most deleterious SNPs explain 16.82% (s.e. = 0.44%) of heritability with the baseline-MAFxSelection model vs 13.03% (s.e. = 0.14%) with the baseline-LD model (1.29 ratio, $P = 2.08 \times 10^{-16}$ for difference), while the second most deleterious SNP bin explains 7.01% (s.e. = 0.50%) of heritability with the baseline-MAFxSelection model vs 9.74% (s.e. = 0.09%) with the baseline-LD model (0.72 ratio, $P = 7.68 \times 10^{-8}$ for difference). The strongest difference was observed for the less common and less deleterious SNPs ($P = 9.09 \times 10^{-17}$). Overall, we observed very high heritability explained by deleterious SNPs: after stratifying by MAF, the top 25% of deleterious SNPs explain nearly half of heritability (48.28%, s.e. = 0.06%) (Figure 3c).
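The two ratios quoted for the most common MAF bin follow directly from the reported percentages: the heritability observed under the baseline-MAFxSelection model divided by the heritability expected under the baseline-LD model. The variable names below are chosen for this example only.

```python
# Percent of heritability as (baseline-MAFxSelection, baseline-LD), from the text.
most_deleterious = (16.82, 13.03)
second_most_deleterious = (7.01, 9.74)

# Ratio above 1: the baseline-LD model underestimates the bin's heritability.
ratio_most = most_deleterious[0] / most_deleterious[1]
# Ratio below 1: the baseline-LD model overestimates the bin's heritability.
ratio_second = second_most_deleterious[0] / second_most_deleterious[1]
```

Rounded to two decimals these give 1.29 and 0.72, matching the ratios reported above.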
Finally, we also investigated the impact of the 16 MAFxSelection bins on functional enrichment and heritability estimates. We observed nearly identical functional enrichments and heritability estimates when using the baseline-LD and baseline-MAFxSelection models (Figure 4).
Figure 3: Heritability partitioning results of the baseline-MAFxSelection model. We report heritability observed under the
baseline-MAFxSelection model and expected under the baseline-LD for 16 MAFxSelection bins (a), across the 4 selection
coefficient bins of each MAF bin (b), and across the 4 MAF bins of each selection coefficient bin (c).
Figure 4: Functional enrichment and heritability estimates estimated under the baseline-MAFxSelection models. We report functional enrichments for 44 main functional binary annotations of the baseline-LD model (a), and heritability estimates ($h^2$) of 63 independent traits with their corresponding confidence interval (b) estimated under the baseline-LD model (x-axis) and the baseline-MAFxSelection model (y-axis).
Overall, we observed that the baseline-LD model does not completely model the impact of
selection on the genetic architecture of human complex traits. However, this model
misspecification does not impact functional enrichments and heritability estimates.
3. Modeling functional annotation MAF-dependent effects highlights that the baseline-LD model slightly underestimates functional enrichment.
Finally, we investigated the impact of different selection pressures among functional annotations by comparing functional enrichments obtained under the baseline-LD model and under the baseline-LD-alpha model. We observed that the baseline-LD model tends to underestimate the heritability enrichment of highly enriched annotations (by a factor of 0.88). No annotation had a significantly different enrichment after correction for multiple testing (P < 0.05/44), but 4 conserved annotations had significant differences at the P < 0.05 level (P between 0.005 and 0.02) (Figure 5).
Figure 5: Functional enrichment estimated under the baseline-LD-alpha models. We report functional enrichments for 44 main functional binary annotations of the baseline-LD model with their corresponding confidence interval estimated under the baseline-LD model (x-axis) and the baseline-LD-alpha model (y-axis). Significant differences (P < 0.05) are highlighted in red; no annotation had a significantly different enrichment after correction for multiple testing (P < 0.05/44).
Overall, these results suggest that modeling selection on functional annotations might be necessary to improve functional enrichment estimates.
Chapter 4: Conclusion
In this work, we explored different MAF and LD architectures by adding extra annotations
to the baseline-LD model. Our work suggests that MAF and LD architectures are well captured by
the baseline-LD model, but that this model does not capture a more complex modeling related to
the action of negative selection itself. While modeling the action of negative selection genome-
wide did not impact the estimates of heritability and functional heritability enrichments, we
observed that modeling the action of negative selection on the functional annotation is necessary
to improve functional heritability enrichments of highly enriched annotations.
Overall, this work demonstrates the robustness of the baseline-LD model, and provides
results for new research directions.
References
Auton, Adam et al. 2015. “A Global Reference for Human Genetic Variation.” Nature 526(7571):
68–74.
Finucane, Hilary K. et al. 2015. “Partitioning Heritability by Functional Annotation Using
Genome-Wide Association Summary Statistics.” Nature Genetics 47(11): 1228–35.
Gazal, Steven et al. 2017. “Linkage Disequilibrium-Dependent Architecture of Human Complex
Traits Shows Action of Negative Selection.” Nature Genetics 49(10): 1421–27.
———. 2018. “Functional Architecture of Low-Frequency Variants Highlights Strength of
Negative Selection across Coding and Non-Coding Annotations.” Nature Genetics 50(11):
1600–1607.
———. 2021. Combining SNP-to-Gene Linking Strategies to Pinpoint Disease Genes and Assess
Disease Omnigenicity.
https://www.medrxiv.org/content/10.1101/2021.08.02.21261488v1 (September 20, 2021).
Gazal, Steven, Carla Marquez-Luna, Hilary K. Finucane, and Alkes L. Price. 2019. “Reconciling
S-LDSC and LDAK Functional Enrichment Estimates.” Nature Genetics 51(8): 1202–4.
Gusev, Alexander et al. 2014. “Partitioning Heritability of Regulatory and Cell-Type-Specific
Variants across 11 Common Diseases.” American Journal of Human Genetics 95(5): 535–
52.
Mancuso, Nicholas et al. 2016. “The Contribution of Rare Variation to Prostate Cancer
Heritability.” Nature Genetics 48(1): 30–35.
Schoech, Armin P. et al. 2019. “Quantification of Frequency-Dependent Genetic Architectures in
25 UK Biobank Traits Reveals Action of Negative Selection.” Nature Communications
10(1): 790.
Speed, Doug et al. 2017. “Reevaluation of SNP Heritability in Complex Human Traits.” Nature
Genetics 49(7): 986–92.
Speed, Doug, and David J. Balding. 2019. “SumHer Better Estimates the SNP Heritability of
Complex Traits from Summary Statistics.” Nature Genetics 51(2): 277–84.
Speed, Doug, Gibran Hemani, Michael R. Johnson, and David J. Balding. 2012. “Improved
Heritability Estimation from Genome-Wide SNPs.” The American Journal of Human
Genetics 91(6): 1011–21.
Speed, Doug, John Holmes, and David J. Balding. 2020. “Evaluating and Improving Heritability
Models Using Summary Statistics.” Nature Genetics 52(4): 458–62.
Visscher, Peter M. et al. 2017. “10 Years of GWAS Discovery: Biology, Function, and Translation.”
The American Journal of Human Genetics 101(1): 5–22.
Yang, Jian et al. 2010. “Common SNPs Explain a Large Proportion of the Heritability for Human
Height.” Nature Genetics 42(7): 565–69.
———. 2015. “Genetic Variance Estimation with Imputed Variants Finds Negligible Missing
Heritability for Human Height and Body Mass Index.” Nature Genetics 47(10): 1114–20.
Zeng, Jian et al. 2018. “Signatures of Negative Selection in the Genetic Architecture of Human
Complex Traits.” Nature Genetics 50(5): 746–53.
Asset Metadata
Creator: Su, Changqing (author)
Core Title: Modeling the minor allele frequency and linkage disequilibrium joint architectures of human diseases and complex traits
School: Keck School of Medicine
Degree: Master of Science
Degree Program: Biostatistics
Degree Conferral Date: 2021-12
Publication Date: 11/03/2021
Defense Date: 10/07/2021
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: allele frequency and linkage disequilibrium, genetic architecture, heritability, human disease, OAI-PMH Harvest
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Gazal, Steven (committee chair), Chiang, Charleston (committee member), Mancuso, Nicholas (committee member)
Creator Email: changqin@usc.edu, sscq1995@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC16351988
Unique identifier: UC16351988
Legacy Identifier: etd-SuChangqin-10202
Document Type: Thesis
Rights: Su, Changqing
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu