Researchers compile Cacao Gene Atlas to help plant breeders boost chocolate tree

The Cacao Gene Atlas is a genomics resource freely available to the public. Researchers can use the atlas to simulate how knocking out or enhancing a gene may influence the chocolate tree, allowing them to test hypotheses without the expense or time of growing the plant. Credit: Mark Guiltinan/Penn State. All Rights Reserved.

UNIVERSITY PARK, Pa. — Cacao, the chocolate tree, is one of the world’s most important economic crops, generating hundreds of billions of dollars annually. However, cocoa is affected by a range of pests and diseases, with some estimates putting losses as high as 30% to 40% of global production. Now, a team led by researchers at Penn State has created a genetic information resource to help plant breeders develop resistant strains of cacao that can be grown sustainably in its native Amazon and elsewhere, such as the tropical latitudes of Central and South America, the Caribbean, Africa and Asia.

In findings published today (June 26) in BMC Plant Biology, the team described the Cacao Gene Atlas, a large compiled dataset of replicated transcriptomes that the researchers began assembling in 2016. A transcriptome is the full set of RNA transcripts expressed from an organism's genome, which researchers can analyze to determine when and where each gene is turned on or off in cells and tissues. The researchers’ long-term goal is to accelerate breeding to develop high-yielding elite varieties of cacao.

The Cacao Gene Atlas, which contains 11.2 million gene expression data points, is a genomics resource freely available to the public, according to team leader Mark Guiltinan, J. Franklin Styer Professor of Horticultural Botany and professor of plant molecular biology in Penn State’s College of Agricultural Sciences. Researchers can use the atlas to simulate how knocking out or enhancing a gene may influence the plant, allowing them to test hypotheses without the expense or time of growing the plant.

“In seconds, the atlas can be used to display expression patterns of any cacao gene throughout the lifecycle of cacao — this will aid in the discovery of genes for important traits and to accelerate breeding for new varieties of cacao far into the future,” he said. “The atlas can be expanded in the future by adding new data as it becomes available. Ultimately, we hope this serves the cacao farmers of the world by helping them combat plant disease and to produce quality cocoa beans for industry.”

Cacao is sometimes called an orphan crop because there are not as many genetic resources readily available for the chocolate tree as for crops such as corn, rice or cotton. This research changes that dynamic, according to co-first author Evelyn Kulesza, a doctoral degree candidate in plant biology.

“For those other crops, genomic resources already exist for scientists to use for various gene analyses,” she said. “We created this resource that allows many other cacao scientists around the world to easily conduct research. We expanded the knowledge base, and it allows other scientists who maybe don't have the monetary support or lab space to conduct experiments to test how different genes behave.”

The team, spearheaded by Kulesza and co-first author Patrick Thomas, postdoctoral researcher in Penn State’s Department of Plant Science, obtained cocoa samples from across the tropics and extracted and sequenced RNA and DNA. They processed the information generated by that analysis into useful formats, developing a pictographic website with downloadable gene-expression matrix files backed up by comprehensive genetic data. The raw sequencing data is also available at the U.S. National Institutes of Health’s National Center for Biotechnology Information.

“We extracted RNAs and sequenced transcriptomes from 123 different tissues and stages of development representing major organs and developmental stages of the cacao lifecycle,” Thomas said. “In addition, we performed several experimental treatments and time courses to measure gene expression in tissues responding to biotic and abiotic stressors.”

To promote wider use of the information, the researchers also made all raw gene-sequencing data, gene-expression mapping matrices, scripts and other relevant specifics used to create the atlas freely available online. The researchers developed a gene-expression browser with a graphical user interface to display gene-expression patterns and to provide easy access to raw data and statistical analyses.

Contributing to the research at Penn State were Craig Praul, director of the Genomics Core Facility at the Huck Institutes of the Life Sciences; Claude dePamphilis, professor of biology and Huck Distinguished Chair in Plant Biology and Evolutionary Genomics; Siela Maximova, research professor of plant biotechnology and co-director, Endowed Program in the Molecular Biology of Cocoa; and Lena Landherr, research assistant in plant science. Also contributing were former members of the Guiltinan lab at Penn State: Sarah Prewitt, U.S. Department of Agriculture’s Animal and Plant Health Inspection Service, Maryland; Akiva Shalit-Kaneh, Volcani-Agricultural and Rural Organization, Gilat, Israel; Eric Wafula, Children’s Hospital of Philadelphia; Benjamin Knollenberg, Mars Inc., Davis, California; and Noah Winters, Battelle Memorial Institute, Columbus, Ohio; as well as Eddi Esteban, Asher Pasha and Nicholas Provart, Department of Cell and Systems Biology, Centre for the Analysis of Genome Evolution and Function, University of Toronto.

Mondelez International, Inc. and the U.S. Department of Agriculture’s National Institute of Food and Agriculture supported this research.

Last Updated June 27, 2024
