Library of Congress to House Entire Twitter Archive

The U.S. Library of Congress, which archives many forms of media for their cultural and historical significance, has announced it will keep a digital archive of every public tweet that has been broadcast on Twitter since its inception in March 2006.

It's only appropriate that the initial announcement of this project was made on the Library of Congress' Twitter account (@librarycongress) and was followed up by a Facebook message before the official press release was issued.

Even though tweets, as messages on Twitter are called, can only be 140 characters long, the amount of information to archive is significant. There are 50 million tweets per day, and the total number of tweets already runs well into the billions.

The Library of Congress plans to focus on the "scholarly and research implications of the acquisition." Certainly the daily thoughts of millions of people worldwide would make an excellent source of sociological information.

Recognizing that the inane tweets will certainly outnumber the significant ones, the Library of Congress plans to highlight the culturally and historically important tweets, such as the first-ever tweet sent by Twitter co-founder Jack Dorsey, President Obama's tweet announcing his win in the 2008 election and a set of tweets that helped a photojournalist get released from prison in Egypt.

This Twitter archive isn't evidence of a new focus for the Library of Congress; it has been collecting and archiving websites and online media for a decade now. The Library of Congress currently houses 167 terabytes (or 167,000 gigabytes — the largest iPod storage is only 64 gigabytes) of information pulled from the Internet during that time.

As the Library's Facebook announcement says, "If you want a place where important historical information in digital form should be preserved for the long haul, we're it!"