Dr. George Busby, PhD
Article authored and provided by Allelica as part of a paid partnership with NSGC. The content, views and opinions expressed in this article are those of Allelica, and do not necessarily reflect the opinions and views of the National Society of Genetic Counselors.
From seventeenth-century philosophers such as John Locke to grandparents cooing over a newborn, the debate over the relative effects of nature and nurture on how we look, and who we are, has raged for centuries. Although the arguments continue, it is clear that very few, if any, human traits are defined by one or the other alone. We are all the result of a complex interplay of our nature (our genes) and nurture (our environment).
When it comes to the relationship between our DNA and our risk of disease, we have learned an incredible amount about how variation in the genome can lead to variation in susceptibility to disease. For example, the American College of Medical Genetics and Genomics lists 73 genes where there is sufficient evidence of association with disease risk to indicate clinical action.
The challenge with looking for mutations in these genes is that such mutations are rare at a population level. As a rule of thumb, the larger a variant's effect on disease risk, the rarer it is in the population. So, while we can test for the presence of these mutations in anyone, most people will not carry one. Moreover, a significant amount of the genetic risk of disease comes from other types of variation in the human genome.
Polygenic risk scores, or PRSs, are gaining momentum as a way to estimate the genetic risk of common disease based on multiple, more common variants in the genome. Large datasets have allowed us to estimate the effects of these common variants. Individually these effects might be very small, but when summed across a genome they can be combined into powerful predictors of disease risk.
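To make the idea of summing small effects concrete, here is a minimal sketch in Python. It assumes the simplest additive form of a PRS, in which each variant contributes its effect-allele count (0, 1 or 2) multiplied by an effect size. The function name, the weights and the genotype are all hypothetical values chosen for illustration, not figures from any real score.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of effect-allele dosages.

    dosages: number of copies of the effect allele at each variant (0, 1 or 2)
    weights: per-variant effect sizes, e.g. log odds ratios (assumed here)
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical effect sizes for four common variants (illustrative only).
weights = [0.05, -0.02, 0.10, 0.03]

# Hypothetical genotype: copies of the effect allele carried at each variant.
person = [2, 1, 0, 1]

score = polygenic_score(person, weights)
print(f"raw PRS: {score:.2f}")
```

A real score would typically sum over thousands to millions of variants, but the arithmetic is the same.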
A PRS is a representation of an individual’s genetic risk of common disease and can be generated for many diseases from genetic data. The scores are compared to a distribution of scores in a population of individuals with known disease status and the increase or decrease in risk of disease can be calculated. This risk can be relative to the average, represented as a fold increase or decrease in risk, or can be input into more sophisticated models that include additional clinical covariates known to influence risk. The absolute risk of disease can also be estimated, providing insight into how different risk factors combine into an overall assessment of lifetime risk of disease.
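The comparison against a reference distribution can be sketched as follows. This is a simplified illustration, not a clinical model: it assumes scores are roughly normally distributed, that an odds ratio per standard deviation is already known from a validation study, and that the population baseline risk can stand in for the risk at an average score. All function names and numbers are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def standardize(raw_score, reference_scores):
    """Express a raw PRS in standard deviations from the reference mean."""
    return (raw_score - mean(reference_scores)) / stdev(reference_scores)


def percentile(z):
    """Fraction of the reference population expected to score below z,
    assuming an approximately normal score distribution."""
    return NormalDist().cdf(z)


def odds_relative_to_average(z, or_per_sd):
    """Fold change in odds versus an average score, given an odds ratio per SD."""
    return or_per_sd ** z


def absolute_risk(baseline_risk, odds_ratio):
    """Convert a baseline risk and an odds ratio into an absolute risk."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    odds = baseline_odds * odds_ratio
    return odds / (1 + odds)


# Hypothetical reference scores and an individual's raw score.
reference = [-1.0, 0.0, 1.0]
z = standardize(2.0, reference)

# Hypothetical odds ratio per SD (1.5) and baseline lifetime risk (10%).
fold = odds_relative_to_average(z, or_per_sd=1.5)
risk = absolute_risk(baseline_risk=0.10, odds_ratio=fold)
print(f"z = {z:.1f}, fold change in odds = {fold:.2f}, absolute risk = {risk:.0%}")
```

More sophisticated models would add clinical covariates such as age, sex and family history rather than relying on the genetic score alone.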
Not all PRSs are equal, however. A central challenge to implementing PRSs in clinical practice is to ensure that their use is equitable across different individuals and ethnicities. The vast majority of publicly available genetic data has been generated from populations with European ancestry. For a number of reasons, the performance of PRSs generated from these data can differ by population.
These reasons fall into three broad categories. The first is that PRSs are based on common variants with small effects. The frequency of alleles at these variants can vary across populations because of differences in their ancestral history, rather than differences in disease risk, which can lead to biases in individual-level PRS predictions. One way to overcome this is to use principal components (a measurement of genetic ancestry) to adjust a raw PRS and align it with an underlying risk distribution.
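One common form of this adjustment is to model how the raw score varies with genetic ancestry in a reference panel and then subtract the ancestry-predicted score from each individual's raw score. The sketch below assumes a single principal component and an ordinary least-squares fit, purely to keep the example short; in practice several components are usually used and the variance may be adjusted as well. The names and data are hypothetical.

```python
from statistics import mean


def fit_ancestry_trend(pc1, scores):
    """Least-squares fit of raw PRS against one principal component.

    Returns (intercept, slope) describing the score expected from ancestry alone.
    """
    x_bar, y_bar = mean(pc1), mean(scores)
    sxx = sum((x - x_bar) ** 2 for x in pc1)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pc1, scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def adjust_score(raw_score, pc1_value, intercept, slope):
    """Remove the part of the raw score that is explained by ancestry."""
    return raw_score - (intercept + slope * pc1_value)


# Hypothetical reference panel: PC1 coordinates and raw scores.
panel_pc1 = [-1.0, 0.0, 1.0, 2.0]
panel_scores = [0.1, 0.3, 0.5, 0.7]

intercept, slope = fit_ancestry_trend(panel_pc1, panel_scores)

# An individual whose raw score is high partly because of where they sit on PC1.
adjusted = adjust_score(raw_score=0.9, pc1_value=2.0, intercept=intercept, slope=slope)
print(f"ancestry-adjusted score: {adjusted:.2f}")
```

After adjustment, two people with the same residual score are placed at the same point of the risk distribution regardless of where they fall along the ancestry axis.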
The second major issue for translating PRSs across ancestries is that the variants used to build predictive genetic models are often not disease-causing themselves. Instead, they lie close to the causal variants in the genome and tag them. Such tagging variants can be population specific, and so have reduced power to predict risk in populations where their frequency differs from the original association dataset. Novel methodology, including fine-mapping techniques that use external datasets with predicted functional significance, can help to home in on causal variants, thereby reducing the effect of differing tag variants among populations.
Thirdly, we need more data from diverse populations with linked clinical and medical data. To do this equitably, it is essential to engage these populations in community-based participatory research to ensure that data collection is consistent with the latest ethical and societal norms and that the benefits of these collaborations are understood and shared widely. These efforts are underway and there is every hope that increasing the use of equitably shared data from under-represented populations can reduce the diversity gap that exists, not just in genomic medicine, but across healthcare as a whole.
Despite these challenges, PRSs can be used across ancestries, so long as they have been appropriately validated and calibrated in different populations. Validation provides an objective assessment of the effect of a PRS in a population of interest, measured as an odds ratio per standard deviation, and calibration tells you how accurately the PRS predicts risk in individuals with known outcomes.
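Both checks can be illustrated with a short sketch. It assumes that validation means fitting a one-predictor logistic regression of disease status on the standardized PRS (here with a few Newton-Raphson steps) and that calibration means comparing mean predicted risk with the observed event rate in score-ordered bins. Real validations adjust for covariates such as age, sex and ancestry; the function names and data are hypothetical.

```python
import math
from statistics import mean


def odds_ratio_per_sd(z_scores, outcomes, iterations=25):
    """Fit logit(P(disease)) = a + b * z by Newton-Raphson and return exp(b)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for z, y in zip(z_scores, outcomes):
            p = 1.0 / (1.0 + math.exp(-(a + b * z)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to a
            g1 += (y - p) * z      # gradient with respect to b
            h00 += w
            h01 += w * z
            h11 += w * z * z
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return math.exp(b)


def calibration_table(predicted, outcomes, n_bins=2):
    """Return (mean predicted risk, observed event rate) for each risk bin."""
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    rows = []
    for i in range(n_bins):
        chunk = pairs[i * n // n_bins:(i + 1) * n // n_bins]
        rows.append((mean(p for p, _ in chunk), mean(y for _, y in chunk)))
    return rows


# Hypothetical validation cohort: standardized scores and disease status (1 = case).
z_scores = [-1, -1, -1, -1, 1, 1, 1, 1]
status = [1, 0, 0, 0, 0, 1, 1, 1]
or_sd = odds_ratio_per_sd(z_scores, status)

# Hypothetical predicted risks checked against the observed outcomes.
table = calibration_table([0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5],
                          [0, 0, 0, 1, 1, 0, 1, 0])
print(f"odds ratio per SD: {or_sd:.2f}")
for predicted_risk, observed_rate in table:
    print(f"predicted {predicted_risk:.2f} vs observed {observed_rate:.2f}")
```

In a well-calibrated model the predicted and observed values in each bin sit close together; a systematic gap in one population suggests the score needs recalibrating there.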
PRSs present an incredible opportunity for us to further genomic prevention through the application of cutting-edge science to disease risk assessments. While there will always be a discussion about the balance of our innate differences and our environment on our risk of disease, with PRSs we can begin to identify the key role that our genes play in disease risk and use this to move towards a more personalized and preventive approach to healthcare.