Monday, August 31, 2009

The Privacy Implications and Challenges of Google Book Search

As I have written about fairly extensively on this blog, Google has developed a rather adversarial relationship with privacy advocates. The company's seeming disregard for privacy only becomes a more serious problem as its size and scope continue to expand and its hold on market after market keeps growing.

Now, I've posted a lot about Google's less than stellar record in the past, from its lobbying efforts in Congress, to cloud computing, to its increasing use and expansion of behavioral marketing techniques. My guess is that if you didn't already know about all the "problems" Google has in this area, you'll be rather shocked to find out.

Now, with the launch of Google Books just around the corner, privacy advocates have once again been forced to spring into action due to fears the company will give privacy short shrift.

Already, the ACLU, Electronic Frontier Foundation, and the Samuelson Clinic have launched a Google Book Search privacy campaign - one component of the ACLU-NC's dotrights project.

So what is "Google Book Search" and why are privacy advocates so concerned?

The ACLU does a good job framing the issue: What you choose to read says a lot about who you are, what you value, and what you believe. That’s why you should be able to learn about anything from politics to health without worrying that someone is looking over your shoulder.

The good news is that millions of books will be available for browsing and reading online. The bad news is that Google is leaving reader privacy behind. Under its current design, Google Book Search can monitor the books you browse, the pages you read, and even the notes you take in the "margins." Without strong privacy protections, all of your browsing and reading history could be collected, analyzed, and turned over to the government or third parties without your knowledge or consent.

In other words, without strong privacy protections, all of our browsing and reading history could be collected, analyzed, and turned over to the government or third parties without our knowledge or consent.

We're not talking about just another library, mind you - librarians follow different standards for dealing with user information than the online world does. Many libraries routinely delete borrower information, and organizations such as the American Library Association have fought hard to preserve the privacy of their patrons in the face of laws such as the USA PATRIOT Act.

The concerns of privacy advocates are not hypothetical - nor should they be discarded as paranoia. Our country has a long history of government efforts to compel libraries and booksellers to turn over customer records and information. Why would anyone believe, particularly after the warrantless wiretapping scandal, that the government won't ask a company like Google to turn over the treasure trove of private personal information it has on millions of Americans?

For these reasons and more, it is essential that Google Book Search incorporate strong privacy protections. Without such protections, we're talking about a virtual one-stop shop for government and third party "fishing expeditions into the personal details of our lives."

Again, these concerns are not hypothetical. Just three years ago the U.S. attorney subpoenaed Amazon for the used book purchase records of over 24,000 customers in the course of a grand jury probe investigating a single individual.

The good news was that a federal judge agreed that Amazon should not have to turn over this information about its customers, saying that if word spread over the Internet that the federal government was probing book purchase information, “the chilling effect on e-commerce would frost keyboards across America.”

If there was ever a time to make sure that Google doesn't put an end to reader privacy as we know it, it is now. At present, all Google has done is make a lot of informal statements about privacy, while failing to provide an actual privacy policy with specific promises to consumers.

Criticisms of Google aside, there is reason to believe that citizen and organization pressure on the company can and will pay dividends. After all, it has taken extra steps to preserve privacy with other offerings, from blurring faces on Google Maps Street View to keeping records for Google Health users separate from other Google services to not keeping a log of user locations with Google Latitude (not to say there aren't still concerns with these products).

After the article I'm featuring today, I'll also provide a brief list of (and link to) the specific recommendations advocated by the ACLU and sent to Google in hopes they will be adopted by the company.

Now to the editorial by Leslie Harris, president and CEO of the Center for Democracy & Technology:

...offline, the right to read anonymously enjoys strong constitutional protection. For decades, libraries have protected the rights of readers to remain anonymous. Such anonymity is protected by the First Amendment and is a cornerstone of intellectual and political freedom. Almost all states have library confidentiality laws. The question is whether and how Google will honor these protections as it designs and builds Google Book Search and develops policies to guide its use of customer data.


Under the proposed settlement, Google will be required to collect a substantial amount of information about the people who use Google Book Search. Google will need certain information to control how much content users access electronically (in most cases, users will have access to about 20 percent of a book's content before they must pay) and to track royalties due authors and publishers, among other things.

Even taken in a vacuum, the idea of a massive database of readers, cross-referenced by their reading preferences, choices and activities, raises serious privacy concerns. But those concerns are magnified when considered in the context of the sensitive personal information that Google already collects and controls. Through its broad array of applications and services, Google has access to a great deal of user information.


Combining reader information with its existing database of user information would allow Google to add a rich and intimately personal dimension to its profiles that could become very attractive to marketers, litigators, the government and others with an interest in profiting from sensitive personal data.

It's easy to see how such an environment could easily lead to significant privacy exposures, especially given the absence of a comprehensive federal consumer privacy law.

Taking thoughtful steps to protect privacy now will help to ensure that Google Book Search lives up to its promise as a powerful social good, rather than becoming the next lightning rod in our ongoing national debate over privacy on the Internet.


First and foremost, Google must make absolutely clear to its users what information it is collecting, and how that information will be used. While such notice is a linchpin of all privacy policies, Google Book Search should strive to set a new bar for clarity and conspicuousness. Readers should know exactly what they're getting, and exactly what they're giving up in return.

The recommendations also call on Google to establish limits so that it collects only the information it needs to complete Google Book Search transactions. For instance, Google shouldn't have to collect or store significant information about how users are accessing books online (what pages they read, their annotations, etc.). Google's default position must be, "if we don't need it, we won't collect it."

It is also critical that Google limit how it uses the information it is required to collect about users. If such information is needed to calculate payments to publishers, then it should be used for that purpose and no other. Reader data is simply too sensitive to be lumped indiscriminately into online marketing dossiers.

Most importantly, Google should commit to take strong steps when others, including the government, demand reader information.


With the settlement hearing fast approaching, Google has an opportunity to set a high standard for online reader privacy that will set a precedent for all who follow: first, by publishing a strong privacy policy for the service that covers the full range of issues raised by privacy advocates and, second, by pledging adherence to that commitment in its filing with the court. With so many issues likely to be raised before the settlement judge, taking the privacy concerns off the table now is good for Google, as well as for readers everywhere.

Click here to read the rest of the editorial.

For a bit more detail on what a strong privacy policy would look like, I suggest you check out the ACLU's recommendations.

Here's a brief summary of what they have urged:

Protection Against Disclosure: Readers should be able to use Google books without worrying that the government or a third party is reading over their shoulder. Google must promise that it will protect reader records by responding only to properly-issued warrants from law enforcement and court orders from third parties.

Limited Tracking: Just as readers can anonymously browse books in a library or bookstore, they should be able to anonymously browse, search, and preview books using Google Book Search. Google must allow users to browse, search, and preview books without being forced to register or provide any personal information.

User Control: Readers should have complete control of their purchases and purchasing data. Readers must be able to review and delete their records and have extensive permissions controls for their "bookshelves" or any other reading displays.

User Transparency: Readers should know what information is being collected and maintained about them and when and why reader information has been disclosed. Google must develop a robust privacy policy and publish annually the number and type of demands for reader information that are received.

Stay tuned... I'll be reporting on developments as they come to light.
