Are companies and government authorities using online data?
Editor's Note: In this weekly series, LiveScience explores how technology drives scientific exploration and discovery.
A record of every Google search you do, every chat message you send and every item you buy may be saved on a computer.
Life in the information age means leaving an electronic trail of one's commerce and communication. Most people now seek information from search engines, rather than libraries; correspond via email and instant messaging, not letters; and increasingly make purchases online.
Yet how companies store and use consumers' personal data remains somewhat mysterious.
Many people see the storage and use of their data as an invasion of privacy. But by privacy, "most people actually don't really know what they mean," said physicist Andreas Weigend, a lecturer at the University of California, Berkeley, and Stanford University, as well as Tsinghua University in China, and the former chief scientist at Amazon.com.
Weigend was at a conference in New Zealand recently when a woman approached him to tell him she had booked a flight from Auckland, New Zealand, to Sydney, Australia, and received advertisements for hotels in Sydney. She told Weigend the ads were an invasion of her privacy.
The woman's complaint is a common one, but people need to be clear about what they want, Weigend told LiveScience. "If they get perfect ads, they are annoyed. If they get awful ads, they are also annoyed," he said. Most people don't want ads at all, but then free services like Google wouldn't be possible, he said.
"The price of privacy, as some want it, is extremely high," Weigend said.
Your Google footprint
For many, Google has become sort of like a Swiss Army knife, a multipronged tool for finding and using information. Google's technologies are used for Web searches, email, chatting, photos, YouTube videos and countless other services. The search giant introduced Google Dashboard in 2009 to allow users to view and control which data associated with their Google account gets stored.
"It's important for people to be aware of what data they have online, and to be able to manage that data — Google Dashboard should help to make this a reality," Google software engineer Alma Whitten said in a statement at the time.
The scope of the information, which is stored only while users are logged in to a Google account, might surprise you. For example, you can see every term you have Googled, as well as your top queries; every place you have looked up on Google Maps; every Gmail message you have sent or received; every chat in which you have participated (if you have chat logging enabled); and every YouTube video you have ever watched.
All it takes for someone to access your digital biography is your Google password, although users can turn off logging and delete stored data.
Still, could others, such as law-enforcement officials, gain access to this information?
Recently, police questioned an employee of a New York computer company whose work-computer search history included queries for "pressure cooker bombs" and "backpacks," the Associated Press reported on Aug. 1. (The Boston Marathon bombers are thought to have built their bombs using pressure cookers, and carried them in backpacks.)
The police questioned the man after receiving a tip from the company, but found no evidence of criminal activity.
Still, the case raises questions about how to balance privacy and national security, a theme spotlighted by the court order leaked by former National Security Agency contractor Edward Snowden.
But even when no national-security issues are suspected, others might want your personal data: advertisers.
Creating advertising and targeting it at consumers based on their behavior is part of what's known as "business intelligence." Companies collect large amounts of data about customers, and use it to home in on the products or services that might be most relevant to a particular demographic.
Many retailers keep track of customers' purchase histories to deliver customized advertisements. But some say that companies have taken this approach a bit too far. New York Times reporter Charles Duhigg wrote last year about how Target hired statisticians to determine when women were pregnant, in order to send them maternity ads.
Target was able to pinpoint when women were in their second trimester based on information such as a history of purchasing prenatal vitamins or maternity clothing. The idea was that by marketing to these women before their babies were born, Target could win their loyalty for years to come.
Target tracks customers using their Guest ID numbers. "If you use a credit card or a coupon, or fill out a survey, or mail in a refund, or call the customer help line, or open an email we've sent you or visit our website, we'll record it and link it to your Guest ID," Andrew Pole, a marketing manager at Target, told The New York Times.
Facebook allows advertisers to display their ads to a target audience based on demographic factors such as location, age, gender, education, work history or interests people list on their profiles. Advertisers don't know the identity of the individuals who see their ads, but rather only their basic descriptors.
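To make the idea of demographic targeting without identity concrete, here is a minimal Python sketch. It is not Facebook's actual system; the `Profile` fields, the `matches` and `audience_summary` functions and the sample data are all hypothetical, chosen only to show how a platform could match an ad to profiles while reporting back nothing but an aggregate count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical user profile; the field names are illustrative only."""
    user_id: str
    location: str
    age: int
    interests: frozenset


def matches(profile, location=None, min_age=None, max_age=None, interest=None):
    """Return True if the profile satisfies every criterion that was supplied."""
    if location is not None and profile.location != location:
        return False
    if min_age is not None and profile.age < min_age:
        return False
    if max_age is not None and profile.age > max_age:
        return False
    if interest is not None and interest not in profile.interests:
        return False
    return True


def audience_summary(profiles, **criteria):
    """Report only the size of the matching audience, never who is in it."""
    matched = [p for p in profiles if matches(p, **criteria)]
    return {"audience_size": len(matched)}


profiles = [
    Profile("u1", "Sydney", 34, frozenset({"travel", "hiking"})),
    Profile("u2", "Sydney", 22, frozenset({"music"})),
    Profile("u3", "Auckland", 41, frozenset({"travel"})),
]

print(audience_summary(profiles, location="Sydney", min_age=25))
```

In this sketch the advertiser supplies criteria and gets back a number; the `user_id` values never leave the platform side of the function.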
But Weigend said business intelligence is a thing of the past. The approach relied on something called segmentation — putting people into buckets like "Midwestern soccer mom," he said. Consultants thought that simply gathering data about people would provide "amazing actionable insights," but the actions were missing, he said.
Customer knows best
Now, business intelligence is being replaced by "customer intelligence" — intelligence not about the customer, but by the customer, Weigend said. Rather than passively bombarding customers, companies are letting customers voluntarily contribute information.
For example, at Amazon.com, one goal is to help customers make better decisions, Weigend said. Reviews, in which customers read and write feedback about products, are the most straightforward example.
Amazon also uses two approaches to suggest items that customers may want to buy. Data on clicks from hundreds of millions of purchases is used to determine which items are most similar to the one a customer is looking at, while suggestions for items "frequently bought together" come from information about actual purchases.
Importantly, these suggestions do not come from data about people who are similar to you personally. "It has nothing to do with whether that person's like you or not," and has everything to do with clicks, Weigend said.
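The "frequently bought together" idea can be illustrated with a small Python sketch. This is an assumption-laden toy, not Amazon's actual method: the function names and the sample orders are hypothetical, and the approach shown is plain co-purchase counting. Note that it looks only at which items appear in the same order, not at who the buyers are.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_purchase_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = defaultdict(Counter)
    for order in orders:
        # Deduplicate items within an order so a pair is counted once per order.
        for a, b in combinations(sorted(set(order)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def frequently_bought_together(counts, item, top_n=3):
    """Return up to top_n items most often purchased alongside the given item."""
    return [other for other, _ in counts[item].most_common(top_n)]


# Hypothetical order data, for illustration only.
orders = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["stove", "fuel"],
]

counts = co_purchase_counts(orders)
print(frequently_bought_together(counts, "tent", top_n=1))
```

No customer attributes enter the calculation at all, which matches Weigend's point that the suggestions have nothing to do with whether another shopper resembles you.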
Weigend said he thinks that most fears about how advertisers use personal information are unfounded.