Digital Transformation

Facebook and self-censorship: How much of what you never publish is being monitored?

By Mark Leiser

December 18, 2013 | 7 min read

A controversy involving Facebook is brewing after a report was published claiming that the social network tracks its users’ own self-censorship.

Report: How much of what people 'self-censor' is monitored?

The report, written by Sauvik Das, a Ph.D. student and summer software engineering intern at Facebook, and Facebook data scientist Adam Kramer, claims that Facebook tracks posts that are written but never actually posted, and even suggests that they are monitored.

The pair carried out a study looking at the ‘self-censorship behaviour’ of five million English-speaking Facebook users – that is, anything written on the site that users either decide not to publish or later delete after publishing – by analysing text entered into any text box that reports metadata back to Facebook.

Storing text as you type is routine on other web services. Gmail, for example, stores the text you enter into an email in order to save it as a draft, so the technology itself is nothing new.

In an article about the report published in Slate, University of Maryland professor Jennifer Golbeck highlighted three big concerns: first, whether Facebook’s privacy policy allows it to capture information that you type but ultimately don’t post; secondly, whether Facebook should be able to collect information about thoughts that you never post; and thirdly, how we should regulate the collection of information that is thought but never published.

The Slate story explained that Facebook wasn’t tracking the words per se in the study, nor was it collecting the text of self-censored posts, but it was collecting details of the data about the posts – also known as metadata.

How this works is quite technical, so I’ll try to explain it as best I can. When you talk to someone on Facebook’s chat service, Facebook knows (and tells you) when the other person is typing. This is possible because of code running inside the browser.

Facebook is using this in-browser technology to report metadata back to its servers on how you self-censor a post. The code is evidently designed to let a user know when the other person is typing a reply in their browser. Golbeck’s article in Slate suggests that Facebook collects metadata only on whether you self-censored, not on what you typed.
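To make the distinction between text and metadata concrete, here is a minimal sketch of how code in a browser could record whether a draft was self-censored without ever keeping what was typed. It is purely illustrative: the names `ComposerTracker`, `ComposerMetadata` and `MIN_CHARS`, and the five-character threshold, are assumptions made for this example and are not taken from Facebook’s code.

```typescript
// Hypothetical illustration only: not Facebook's actual implementation.

// The only information that would leave the browser in this sketch.
interface ComposerMetadata {
  startedTyping: boolean; // the user typed at least MIN_CHARS characters
  submitted: boolean;     // the user actually posted
  selfCensored: boolean;  // typed something, but never posted it
}

// Assumed threshold for counting a draft as "started".
const MIN_CHARS = 5;

class ComposerTracker {
  private maxLength = 0;
  private submitted = false;

  // Called on every change to the text box. Only the length is kept;
  // the text itself is discarded straight away.
  onInput(currentText: string): void {
    this.maxLength = Math.max(this.maxLength, currentText.length);
  }

  // Called when the user presses "post".
  onSubmit(): void {
    this.submitted = true;
  }

  // The summary that could be reported back to a server.
  summarize(): ComposerMetadata {
    const startedTyping = this.maxLength >= MIN_CHARS;
    return {
      startedTyping,
      submitted: this.submitted,
      selfCensored: startedTyping && !this.submitted,
    };
  }
}

// Usage: the user types a reply, then deletes it without posting.
const tracker = new ComposerTracker();
tracker.onInput("This is outrageous");
tracker.onInput("");
console.log(tracker.summarize()); // selfCensored is true; no text is stored
```

Note that `onInput` still receives the full text on every keystroke. Keeping only its length is a choice made by whoever writes the code, not a limit imposed by the browser.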

However, the same process means that Facebook could theoretically send more than “X started typing” back to its servers. This raises the question: does the privacy policy make your thoughts available to Facebook for collection?

It is clear from the privacy policy that Facebook collects information on things users choose to share, or when they “view or otherwise interact with things”. Arguably, viewing a post and typing a response before self-censoring it could be considered a form of interaction and would fall within the ambit of the policy.

However, most users would not expect the policy to permit data collection for the purpose of monitoring thoughts that are never published.

Should Facebook be able to collect any type of data about how we self-censor? Google is clearly doing us a service by storing our draft emails so that we don’t lose them after accidentally closing the browser window or running out of battery, but users are fully aware that it is happening.

While this story will likely sound like a privacy advocate’s nightmare, from a different perspective this technology could be used to help drive advertising sales.

For example, text analysis might bring more tailored and responsive advertisements based on our spur-of-the-moment response to a friend’s outrageous political post. According to the self-censorship report, 71 per cent of users self-censored what they wrote (more men than women).

Facebook already has much more sophisticated and specialised analytics than an average business site. It pays lots of people lots of money to figure out what ads to display to its users, and any technology that can advance that objective will no doubt be of interest to a commercial team.

Finally, how do we regulate the privacy of things that we write, and that can be collected and analysed, but that we ultimately self-censor? Is there any real difference between the data related to what we post and what we self-censor?

The report shows that people who self-censor more frequently tend to apply more controls over the audience that has access to their posts. When Facebook’s privacy settings have been used to create groups, self-censorship happens a lot less frequently.

This suggests that privacy tools, when in use, result in less self-censorship. As more users start applying privacy tools to their posts on Facebook, self-censorship occurs less often, resulting in less metadata being sent to Facebook’s servers.

So where does this leave us? As so often with Facebook, questions remain unanswered. The technology exists for Facebook to monitor what people write without publishing, although Facebook insists it isn't looking at the text itself, just the metadata.

Interestingly, Facebook's privacy policy seems to cover this type of data grab - unless you are in the camp arguing that terms and conditions need to be clear and concise enough for every single contingency.

There is no doubt that Facebook, privacy and self-censorship will be in the headlines in 2014.
