UNIVERSITY PARK, PA. — Newly developed risk scores synthesize genetic information into an easy-to-interpret metric that could help clinicians identify young children most at risk of developing obesity.
The study, led by researchers at Penn State, used novel statistical methods to establish scoring criteria using data collected from young children. The research also demonstrates that robust results are attainable from studies that are orders of magnitude smaller than typical genetic studies when comprehensive data is collected over time and used in conjunction with powerful statistical tools.
“About 18% of children in the United States are obese, and 6% are severely obese,” said Sarah Craig, assistant research professor of biology at Penn State. “If we can identify children most at risk, we might be able to prevent obesity from developing in the first place. In this study, we produced risk scores based on genetic information that clinicians could potentially use to identify young children who would most benefit from intervention strategies.”
This study is part of a larger project called INSIGHT (Intervention Nurses Start Infants Growing on Healthy Trajectories), coordinated through the Penn State Health Milton S. Hershey Medical Center, in which researchers and clinicians work together to identify biological and social risk factors for obesity and the impacts of responsive parenting interventions during a child’s early life. The research team collected longitudinal data on nearly 300 children at eight time points between birth and three years of age, including weight, height, and behavioral and environmental variables. They also collected a blood sample for genetic analyses from each of the children, which served as the basis for developing risk scores. The team published their results in a paper appearing in the journal Econometrics and Statistics.
The risk scores — called “polygenic risk scores” because they are based on many genetic locations across the genome — distill vast genetic information into an easy-to-grasp number. Typically, the scores incorporate information from a number of single nucleotide polymorphisms (SNPs), or locations in the genome where single letters of the DNA alphabet can vary among people, that are most related to the metrics of interest — in this case, growth rates and obesity.
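To make the idea concrete, polygenic risk scores are commonly computed as a weighted sum of how many copies of a risk-associated allele (0, 1, or 2) a person carries at each SNP. The article does not give the team's exact formula, so the Python sketch below is only an illustration of that general approach; the SNP names, weights, and function name are hypothetical.

```python
from typing import Mapping


def polygenic_risk_score(
    allele_counts: Mapping[str, int],
    weights: Mapping[str, float],
) -> float:
    """Return the weighted sum of risk-allele counts across the scored SNPs.

    allele_counts maps each SNP to the number of risk alleles carried (0, 1, or 2).
    weights maps each SNP to its assumed effect size.
    """
    score = 0.0
    for snp, weight in weights.items():
        count = allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {snp} must be 0, 1, or 2")
        score += weight * count
    return score


# Hypothetical SNP identifiers and weights, for illustration only.
example_weights = {"snp_a": 0.5, "snp_b": 0.25, "snp_c": 1.0}
example_child = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
example_score = polygenic_risk_score(example_child, example_weights)
```

A score built this way collapses many genotypes into one number, which is why a shorter SNP list can be simpler to reproduce in a clinical setting.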
“Previous attempts to produce polygenic risk scores for obesity were developed using genetic information from adults or older children and include anywhere from a hundred to two million SNPs,” said Kateryna Makova, professor of biology and Verne M. Willaman Chair of Life Sciences at Penn State. “Such high numbers are challenging and potentially expensive to consistently reproduce, especially in a clinical setting. We produced two score options with far fewer SNPs — one with 24 and one with 5 — that nonetheless can provide valuable information to researchers and clinicians.”
The research team used novel statistical techniques from a field called functional data analysis to identify the SNPs most related to obesity, which were then incorporated into the scores.
“Unlike many genetic studies, which collect data on a single measurement, like, for instance, body mass index (BMI), and at a single point in time, we took advantage of the longitudinal data collected over time,” said Francesca Chiaromonte, professor of statistics and Huck Chair in Statistics for the Life Sciences at Penn State. “Several measurements of weight and height over time yield a growth curve for each child, and we can analyze the shapes of the curves for the children in our cohort using functional data analysis. We took advantage of this richer data at every step of the analysis.”
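As a rough illustration of how repeated measurements become a curve, the Python sketch below places one child's measurements on a shared age grid using simple piecewise-linear interpolation, so that curves from different children can be compared point by point. This is a simplified stand-in, not the team's method: functional data analysis typically uses smoother curve representations, and the function name, ages, and weights here are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def growth_curve(
    ages: Sequence[float],
    values: Sequence[float],
    grid: Sequence[float],
) -> List[float]:
    """Interpolate measurements linearly onto a common age grid.

    ages must be strictly increasing, and every grid point must fall
    within the observed age range (no extrapolation is attempted).
    """
    curve = []
    for t in grid:
        if not ages[0] <= t <= ages[-1]:
            raise ValueError(f"grid point {t} is outside the observed ages")
        # Index of the right-hand neighbor, capped at the last measurement.
        i = min(bisect_right(ages, t), len(ages) - 1)
        t0, t1 = ages[i - 1], ages[i]
        v0, v1 = values[i - 1], values[i]
        curve.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return curve


# Hypothetical ages (months) and weights (kg) for one child, for illustration only.
example_curve = growth_curve([0, 12, 36], [4.0, 10.0, 14.0], [0, 6, 12, 24, 36])
```

Once every child's curve sits on the same grid, the shapes of the curves, not just a single measurement, can serve as the trait that genetic variants are tested against.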