Shakespeare Stored in DNA Files

Researcher Nick Goldman holds the DNA that encodes all of Shakespeare's sonnets, a photograph, and an mp3 clip of the famous "I have a dream" speech (Image credit: European Molecular Biology Laboratory)

Floppy disks, jump drives, DNA? Scientists have developed a way to encode music and text files into DNA, the molecules that normally hold the instructions for life.

The new method, described today (Jan. 23) in the journal Nature, is extremely expensive right now, but eventually it could be used to store digital files without electricity for thousands of years. And since DNA is so compact, vast amounts of data could be stored in one test tube, said study author Nick Goldman, a geneticist at the European Bioinformatics Institute in the U.K.

"I've gone from being a skeptic to a believer," said David Haussler, a geneticist and computer scientist at the University of California, Santa Cruz, who was not involved in the study.

And because DNA is the script of life, crucial in medicine, agriculture and other endeavors, human beings will always be pushing for ways to improve the reading and writing of DNA, Haussler told LiveScience.

The team has even used the method to encode Shakespeare's sonnets.

Data deluge

From floppy disks to CDs to magnetic tapes, the technologies used to store, read and write digital data become obsolete rapidly. Digital archives take up a lot of space, and the files themselves, even those on archival magnetic tapes, need to be refreshed or rewritten every few years to prevent degradation.

Goldman and colleague Ewan Birney, also of the European Bioinformatics Institute, were discussing this problem over beers one day when they realized that DNA might actually be a feasible way to store vast amounts of data.

As the discovery of intact woolly mammoth DNA demonstrates, the molecule can last for tens of thousands of years as long as it's stored in a cool, dark place, they said. Unlike hard drives, DNA doesn't require electricity to maintain; it can also include built-in error checking, and it's incredibly compact, Goldman told LiveScience. (Earlier this year, another team demonstrated the feasibility of DNA storage, but stored a tiny amount of data and didn't include error checking.)

Storage solution

The researchers began to sketch out a way to encode the 0s and 1s of a computer file into the alphabet of letters that make up the genetic code. They then chose several digital files (a portion of Martin Luther King Jr.'s "I have a dream" speech, all the sonnets of Shakespeare and a photograph of their institution), encoded them into DNA letters, and had Agilent, a company in California, assemble short snippets of the DNA.
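To make the idea of turning 0s and 1s into DNA letters concrete, here is a minimal Python sketch. It is not the team's actual scheme, which this article does not describe; it simply assumes a hypothetical table in which every pair of bits maps to one of the four DNA letters (A, C, G, T). The function names and the table are illustrative choices.

```python
# Hypothetical 2-bits-per-letter table, chosen for illustration only;
# the article does not say which mapping the researchers used.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_bytes_to_dna(data: bytes) -> str:
    """Write each byte as 8 bits, then turn each pair of bits into one DNA letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_dna_to_bytes(dna: str) -> bytes:
    """Reverse the mapping: DNA letters back to bits, then bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Round trip on the opening line of Sonnet 18.
line = b"Shall I compare thee to a summer's day?"
dna = encode_bytes_to_dna(line)
assert decode_dna_to_bytes(dna) == line
print(len(line), "bytes ->", len(dna), "DNA letters")
```

Under this assumed mapping, each byte of a file becomes four DNA letters, so the capital letter "A" (binary 01000001) would be written as CAAC.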

Because the method creates multiple, overlapping copies of each DNA snippet, it also includes a built-in error-checking system. What they got back was a tiny amount of DNA, "an almost invisible fleck of dust in the bottom of a little test tube," Goldman said.
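The way overlapping snippets can double as error checking can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' implementation: the fragment length, the step between fragments, the idea of tagging each fragment with its start position, and the majority vote used to rebuild the sequence are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def split_into_overlapping_fragments(dna, length=8, step=2):
    """Return (start, fragment) pairs; neighbors overlap by length - step letters."""
    last_start = max(len(dna) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail end is covered
    return [(start, dna[start:start + length]) for start in starts]


def reconstruct(fragments, total_length):
    """Rebuild the sequence by majority vote at every position."""
    votes = defaultdict(Counter)
    for start, fragment in fragments:
        for offset, base in enumerate(fragment):
            votes[start + offset][base] += 1
    # "N" marks a position that no surviving fragment covers.
    return "".join(
        votes[i].most_common(1)[0][0] if votes[i] else "N"
        for i in range(total_length)
    )


original = "ACGTACGTTGCAACGTGGCC"
fragments = split_into_overlapping_fragments(original)

# Simulate losing one fragment entirely; its neighbors still cover the gap.
survivors = fragments[:2] + fragments[3:]
assert reconstruct(survivors, len(original)) == original
```

In this toy version most positions are covered by four fragments, so a single misread letter is outvoted by the other copies. The first and last few letters are covered by fewer fragments and are correspondingly less protected.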

They then read the DNA-based files using a gene-sequencing machine. Using current technology, reading the DNA took more than two weeks and cost more than $10,000, Birney said at a press briefing. To store the world's existing data would be "breathtakingly expensive, perhaps costing more money than is on the planet," he said.

But the technology to read and write DNA has improved 10,000-fold over the last eight years and is likely to continue improving even more rapidly, Haussler said. In 10 years DNA could start supplanting magnetic tapes, which are currently used to store government and other long-lasting, rarely accessed archives, he estimated.

"You can't get obsessed with the fact that it may not be practical today. If you do any reasonable projection of current trends five or 10 years into the future you see that this is in the sweet spot."


Tia Ghose
Managing Editor
