Scientists say they've generated the longest genome sequence to date, unraveling the genetic code of the loblolly pine tree.
Conifers have been around since the age of the dinosaurs, and they have some of the biggest genomes of all living things.
Native to the U.S. Southeast, the loblolly pine (Pinus taeda) can grow over 100 feet (30 meters) tall and has a lengthy genome to match, with 23 billion base pairs. That's more than seven times the size of the human genome, which has 3 billion base pairs. (Some stretches of these pairs form sequences called genes, which tell cells how to make proteins.)
"It's a huge genome. But the challenge isn't just collecting all the sequence data. The problem is assembling that sequence into order," study researcher David Neale, a professor of plant sciences at the University of California, Davis, said in a statement.
To simplify this huge genetic puzzle, Neale and colleagues assembled most of the sequence from a haploid part of a single pine nut, which has just one set of chromosomes to piece together.
The new research showed that the loblolly genome is bloated with repetitive DNA. In fact, 82 percent of the genome repeats itself, the researchers say.
Understanding the loblolly pine's genetic code could lead to improved breeding of the tree, which is used to make paper and lumber and is being investigated as a potential biofuel, the scientists say.
The loblolly pine joins other recently sequenced conifers, including the Norway spruce (Picea abies), which has 20 billion base pairs. For their next project, the researchers are eyeing the sugar pine, a tree with 35 billion base pairs.