Your ordinary pine tree is an extraordinary genetic jungle.
Scientists at the University of California, Davis have sequenced what they report is the largest genome sequenced to date: that of the loblolly pine tree, or Pinus taeda, a common sight throughout the southern United States. Its genome clocks in at a whopping 22 billion base pairs. That’s roughly seven times the human genome’s 3 billion base pairs. It’s still nowhere near as large as the genome of the Japanese flower Paris japonica, which, though it hasn’t been sequenced, is estimated at 149 billion base pairs.
The scale of the loblolly’s genome means this is quite the accomplishment. Here’s Victoria Turk, writing for Motherboard:
The problem is that in conventional genome sequencing you get small fragments of the letters in DNA, but they all have to be assembled in order. They used a new method developed by scientists at the University of Maryland which pre-processes some of this data to effectively compress the sequence data 100-fold. It’s the first time the novel approach has been tested.
As well as being a test subject for the new technique, the loblolly genome is of a lot of interest given the tree’s starring role in the US landscape, both natural and commercial. It’s the second most common tree species in the country, and it’s a major player in the lumber and paper industry.
An understanding of the loblolly’s genome could help biologists isolate certain genes associated with fighting pathogens, which could foster the development of disease-resistant trees. It could also fill in our knowledge of the reasons for genetic diversity in plants, especially conifers. Just last spring, another gymnosperm, the Norway spruce, topped the charts with a 20-gigabase genome.
Still, the question lingers: are there any evolutionary advantages to having such gargantuan genomes? In the case of the loblolly pine, size isn’t necessarily an indicator of complexity; 82% of the loblolly’s genome is just repeated information.