Monday 19 September 2011

Research Ethics and the Web's Private & Public Spaces

In a paper (Six Provocations for Big Data, section 5) related to her forthcoming keynote at the Oxford Internet Institute's "Decade in Internet Time" conference, danah boyd talks about "being in public" on the web, bringing metaphors about one's own public presence in a physical environment to bear on the accessibility of digital writings on a computer server. While we can all intuit what is meant by this (the conscious felt experience of being engaged with the web), are metaphors such as "being in public" helpful when thinking about ethical issues raised by the Web?

"Being in public" means that one's presence and actions can be seen and heard by other people, where we have no choice about who those "other people" are, nor control over what they do. Of course, on the Web "we ourselves" are not in public, but the record of our words (or audio, videos, photographs, artwork) is. Or may be: sites may hide their content behind user accounts and secure browsing protocols. We may debate whether our social networking activities are public, but we are rarely tempted to debate the public nature of our bank account transactions.

What is the difference between the following:

Being in a public space vs having one's statements made public
Making a statement in a public space vs making a public statement
Being in a public space vs being on a global stage
Being in a public space vs being in a particular space for a particular purpose that other people could observe now or in the future
Being in a public space vs being made aware of other people's scrutiny
Making a statement in a public space vs having one's statements publicly analysed & criticised by observers

"Being in public" on the Web means that one's activities, memberships, engagements, writings, videos can be seen/heard by other people, where we have no choice about who those "other people" are, nor control over what they do. "Being in public" on the Web is useful on occasions when we want a global audience, and also on occasions when we are pontificating to the aether.

But "being in public on the Web" is also useful when we are expecting to speak to only a few individuals because for practical reasons it would be hugely inconvenient to create a specific channel for those people only. This is how we are "in public" normally: in parks, on the street, in coffee shops. One might refer to this as an expectation of "privacy by obscurity" - people could eavesdrop, but why would they bother? And when we are in those situations we are used to social norms that preclude people gathering around and gawping at our discussions. (As we are taught as children "don't stare", "don't be nosy", "that's none of your business".)

There are two phenomena that intrude on the unconsciously public: Google and the wily researcher. Search engines exist to expose things and make them findable (more effectively public). Those inhabiting the "self-conscious public" will often go to great SEO lengths to make sure that their public utterances are prominently positioned. Although they do not occupy key marketing positions on the top page of a Google search, the unconsciously public may still find that their words are more accessible than they would have liked.

However, acting in an "unconsciously public" fashion does not necessarily imply being completely oblivious to the lack of privacy. Individuals may adjust to the emerging social norms and in doing so create new norms and establish new boundaries of behaviour. You may consider it acceptable for like-minded individuals (friendly observers, benign lurkers) to search for your online presence on a discussion forum; you may be unhappy about work colleagues, reporters, government agents and university researchers actively examining your opinions.

So perhaps it is small wonder that Google reports almost half a million Web pages using the following boilerplate text, which threatens sociologists with legal action if they dare make use of those pages:
WARNING Any institutions or individuals using this site or any of its associated sites for studies or projects - You DO NOT have permission to use any of my profile, pictures, or other material posted on this site (including discussion thread posts and blogs) in any form or forum both current and future. If you have or do, it will be considered a violation of my privacy and will be subject to legal ramifications. It is recommended that other members post a notice similar to this or you may copy and paste this one into your profile
From a technical and legal point of view, I'm not convinced that this carries any weight (although I'm looking into it), but it certainly telegraphs a preference and intent. On the one hand, we should feel a very strong pull towards respecting and honouring an individual's wishes; on the other hand, we have clear social and legal boundaries precisely to curb our individual requests.

Should web mining of personal information stop? Should ethics committees come down hard on this practice? Is it right to broaden the principle of "informed consent" to the Web, and to severely prune the availability of "big data"? I don't know, but I do know that my engineer's default position of "do what you want with public web pages" has been severely challenged.