HERSHEY, Pa. — Every disease is shaped by a genetic component as well as environmental factors like air pollution, climate and socioeconomic status. However, the extent to which genetics or environment plays a role in disease risk — and how much can be attributed to each — isn’t well understood. As such, the actions individuals can take to reduce their risk for disease aren’t often clear.
A team led by Penn State College of Medicine researchers found a way to tease apart genetic and environmental effects on disease risk using a large, nationally representative sample. They found that, in some cases, previous assessments overstated the contribution of one’s genes to disease risk and that lifestyle and environmental factors play a larger role than previously believed. Unlike genetics, environmental factors such as exposure to air pollution can be modified more easily, which means there are potentially more opportunities to mitigate disease risk. The researchers published their work in Nature Communications.
“We’re trying to disentangle how much genetics and how much the environment influences the development of disease. If we more accurately understand how each contributes, we can better predict disease risk and design more effective interventions, particularly in the era of precision medicine,” said Bibo Jiang, assistant professor of public health sciences at the Penn State College of Medicine and senior author of the study.
The researchers said that in the past, it’s been difficult to quantify and measure environmental risk factors since they can encompass everything from diet and exercise to climate. However, if environmental factors aren’t considered in models of disease risk, analyses may falsely attribute the shared disease risks among family members to genetics.
“People living in the same neighborhood share the same level of air pollution, socioeconomic status, access to health care providers and food environment,” said Dajiang Liu, distinguished professor, vice chair for research, and director of artificial intelligence and biomedical informatics at the Penn State College of Medicine and co-senior author of the study. “If we can tease apart these shared environments, what’s remaining could more accurately reflect genetic heritability of disease.”
In this study, the team developed a spatial mixed linear effect (SMILE) model that incorporates both genetics and geolocation data. Geolocation — a person’s approximate geographical location — served as a surrogate measure for community-level environmental risk factors.
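As a rough illustration of why a location term matters, the sketch below splits the variance in disease risk into genetic, shared-location and residual parts. It is not the SMILE implementation: the function name `variance_shares` and the example variances are hypothetical, and the "naive" case rests on the simplifying assumption that a model without a location term folds all shared-location variance into the genetic component.

```python
def variance_shares(genetic, spatial, residual):
    """Return each component's fraction of the total variance."""
    total = genetic + spatial + residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {
        "genetic": genetic / total,
        "spatial": spatial / total,
        "residual": residual / total,
    }


# Hypothetical variance components, chosen for illustration only.
full = variance_shares(genetic=0.3, spatial=0.1, residual=0.6)

# Assumption: with no location term, shared-location variance is
# misattributed to genetics because relatives tend to live together.
naive = variance_shares(genetic=0.3 + 0.1, spatial=0.0, residual=0.6)

print(f"genetic share with a location term:    {full['genetic']:.2f}")
print(f"genetic share without a location term: {naive['genetic']:.2f}")
```

Under these made-up inputs, the genetic share is larger when the location term is left out, which is the kind of overstatement the researchers describe.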
Using data from IBM MarketScan, a health insurance claims database with electronic health records from more than 50 million individuals covered by employer-based health insurance policies in the United States, the research team extracted information on more than 257,000 nuclear families and compiled outcomes for 1,083 diseases. They then augmented the data with publicly available environmental data, including climate and sociodemographic data, as well as levels of fine particulate matter (PM2.5) and nitrogen dioxide (NO2).
The team’s analysis led to more refined estimates of the contributors to disease risk. For example, previous studies concluded that genetics contributed 37.7% of the risk of developing Type 2 diabetes. When the research team reassessed the data with a model that accounts for environmental effects, the estimated genetic contribution to Type 2 diabetes risk decreased to 28.4%, meaning a bigger share of disease risk can be attributed to environmental factors. Similarly, the estimated genetic contribution to obesity risk decreased from 53.1% to 46.3% when adjusted for environmental factors.
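To put those revisions in perspective, the short sketch below computes the drop in each estimate in percentage points and relative to the earlier figure. It uses only the four percentages quoted above; the helper name `heritability_change` is hypothetical and is not part of the study's analysis.

```python
def heritability_change(previous_pct, adjusted_pct):
    """Return (drop in percentage points, drop relative to the previous estimate in %)."""
    absolute = previous_pct - adjusted_pct
    relative = absolute / previous_pct * 100
    return absolute, relative


# (previous estimate, environment-adjusted estimate), in percent, as reported above.
estimates = {
    "Type 2 diabetes": (37.7, 28.4),
    "Obesity": (53.1, 46.3),
}

results = {}
for disease, (previous, adjusted) in estimates.items():
    drop, relative = heritability_change(previous, adjusted)
    results[disease] = (drop, relative)
    print(f"{disease}: down {drop:.1f} points ({relative:.1f}% of the earlier estimate)")
```

By this arithmetic, the Type 2 diabetes estimate falls by 9.3 percentage points, roughly a quarter of the earlier figure, while the obesity estimate falls by 6.8 points, or about 13%.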
“Previous studies concluded that genetics played a much larger role in disease risk prediction, and our study recalibrated those numbers,” Liu said. “That means that people can stay hopeful even though they have family relatives with Type 2 diabetes, for example, because there's a lot they can do to reduce their own risk.”
The research team also used the data to quantitatively assess whether two specific air pollutants, PM2.5 and NO2, causally influence disease risk. Previous studies, the researchers said, lump PM2.5 and NO2 together as one collective measure of air pollution. In this study, however, the team found that the two pollutants have distinct causal relationships with health conditions. For instance, NO2, but not PM2.5, was shown to directly cause conditions like high cholesterol, irritable bowel syndrome and both Type 1 and Type 2 diabetes. PM2.5, on the other hand, may have a more direct causal effect on lung function and sleep disorders.
Ultimately, the researchers said this model will allow for a more in-depth look at why some diseases may be more prevalent in certain geographic locations.
Other Penn State authors on the paper include Havell Markus and Austin Montgomery, both joint medical degree and doctoral degree students at the Penn State College of Medicine and the Fox Graduate School at Penn State; Laura Carrel, professor of biochemistry and molecular biology; Arthur Berg, professor of public health sciences; and Qunhua Li, professor of statistics. Daniel McGuire, who was a doctoral student in the biostatistics program at the time of the research, co-led the study. Co-authors Lina Yang and Jingyu Xu, who were doctoral students in the biostatistics program at the time of the research, also contributed to the paper.
The National Institutes of Health and the Penn State College of Medicine’s artificial intelligence and biomedical informatics pilot funding program supported this work in part. Some of the materials employed in this work were provided by the Center for Applied Studies in Health Economics at the Penn State College of Medicine.