Listening to a twang in a person's voice can be a sure giveaway of where they live in the United States. It turns out those same dialects abound on Twitter.
Researchers at Carnegie Mellon University's (CMU) School of Computer Science have recently found that regional slang and dialects are as evident in tweets as they are in everyday conversations.
Previously, studies of regional dialects have been based on verbal interviews. Written communication is usually less reflective of regional influences because people tend to adopt a more formal tone when they write. Twitter, however, offers a new way of studying regional dialects, as tweets tend to be informal and conversational.
Jacob Eisenstein, a postdoctoral fellow in CMU's Machine Learning Department, said the automated method he and his colleagues have developed for analyzing Twitter word-use shows that regional dialects appear to be evolving within social media.
For their research, Eisenstein and his team collected a week's worth of Twitter messages in March 2010 and selected geotagged messages (those carrying geographic location data) from Twitter users who wrote at least 20 messages. That yielded a database of 9,500 users and 380,000 messages.
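As a rough illustration of that selection step, the filter could be sketched in Python as below. This is not the researchers' code. The message format (dictionaries with hypothetical `user` and `coords` fields) and the function name `select_messages` are assumptions made for the example, and the sketch assumes the 20-message threshold is counted over geotagged messages, which the description above does not spell out.

```python
from collections import Counter

def select_messages(messages, min_messages=20):
    """Keep geotagged messages from users with at least `min_messages` of them.

    Each message is assumed to be a dict with hypothetical keys:
    'user' (an account identifier) and 'coords' (a (lat, lon) tuple, or None
    when the message carries no location data).
    """
    # Step 1: drop messages that have no location attached.
    geotagged = [m for m in messages if m.get("coords") is not None]

    # Step 2: count the remaining messages per user.
    counts = Counter(m["user"] for m in geotagged)

    # Step 3: keep only messages from sufficiently active users.
    return [m for m in geotagged if counts[m["user"]] >= min_messages]
```

Applied to a week of collected tweets, a filter of this kind would produce the sort of reduced data set described above, though the exact counts depend on how the threshold is applied.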
They discovered certain regionalisms that are already well known and associated with specific areas of the country: for example, a Southerner's "y'all" and a Pittsburgher's "yinz," as well as the usual regional divides in references to soda, pop and Coke.
But other phrasing has evolved with social media itself.
In northern California, something that's cool is "koo" in tweets, while in southern California, it's "coo." In many cities, something is "sumthin," but tweets in New York City favor "suttin." While many of us might complain in tweets of being "very" tired, people in northern California tend to be "hella" tired, New Yorkers are "deadass" tired and Angelenos are simply tired "af," which stands for "as f***."
Eisenstein thinks some of this usage is shaped by the 140-character limit of Twitter messages, but geography's influence is also apparent. The statistical model the research team used to recognize regional variation in word-use and topics could predict the location of a tweeter in the continental United States with a median error of about 300 miles.
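To make the "median error" figure concrete, here is a minimal Python sketch of how such a number could be computed from predicted and actual coordinates. It is not the team's model or evaluation code. It assumes locations are (latitude, longitude) pairs in degrees and measures distance with the standard haversine great-circle formula; the function names are made up for the example.

```python
import math
from statistics import median

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def median_error_miles(predicted, actual):
    """Median distance between predicted and true (lat, lon) pairs."""
    errors = [
        haversine_miles(p[0], p[1], a[0], a[1])
        for p, a in zip(predicted, actual)
    ]
    return median(errors)
```

A median of about 300 miles would mean that for half of the users, the predicted location fell within roughly 300 miles of where they actually were.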
The automated analysis of Twitter message streams offers linguists an opportunity to watch regional dialects evolve in real time. "It will be interesting to see what happens," Eisenstein said. "Will 'suttin' remain a word we see primarily in New York City, or will it spread?"
Here's a list of some commonly used slang on Twitter.
- coo: cool – LA/Southern California
- fasho: for sure – LA/Southern California
- gna: going to – Boston
- iono: I don’t know – Northern California
- koo: cool – Northern California
- lames: lame people – Lake Erie Region
- lls: laughing like s*** – Washington D.C.
- od: overdone (very) – Lake Erie Region
- omw: on my way – LA/Southern California
- smh: shake my head – LA/Southern California
- suttin: something – New York/Boston
- wyd: what are you doing – LA/Southern California
Eisenstein will present the study on Jan. 8 at the Linguistic Society of America annual meeting in Pittsburgh.