WWW2010 : danah boyd

danah boyd is with Microsoft Research New England and is a research fellow at the Berkman Center for Internet and Society at Harvard University; she was one of the first people to research social technical networks of teens.

Privacy and Publicity In The Context of Big Data

Privacy

  • What are teens doing
  • Technical (phishing, etc.)
  • Concerns are not new – long before the Net
  • What is new has to do with big data

Big Data

  • Remix, aggregate – average people are producing data
  • Specifically referring to “social data” – people, their activities, their interactions, their behaviors [FB, Twitter]

Sense-Making

  • Data is cheap; analysis is not
  • What does it mean to engage with this data? Ethics, privacy, publicity

Methodology

  • WWW is one of the places where we see big data being researched
  • Ethics : ethnography – map cultures, figure out what people do. I started out in computer science, visualizing social networks. I wanted to understand why.
  • George Homans: the methods of social science are dear in time and money and getting dearer every day
  • Vint Cerf: We never, ever in the history of mankind have had access to so much info so quickly and easily

* Bigger data are not always better data
* Not all data are created equal
* What and why are different questions
* Be careful of your interpretations

SAMPLING

  • Quality is more important than quantity
  • big-ness != whole-ness | Twitter has all of Twitter’s data, but most researchers only have a sample. If you are trying to understand topicalness and the sample is not representative, your analysis will be wrong no matter how large your sample is
  • know your biases – if you have all tweets and can pull a random sample, it is a random sample of tweets not users. some users have multiple accounts, some consume without accounts, some tweet more than others
  • not all data are created equal
  • “better” networks
  • different kinds of social networks : articulated, behavioral, personal … articulated (people you list on FB etc.; many of them are not your “closest” friends, so you can’t take a FB friends list and say you’ve analyzed a person’s social network), behavioral (same room, email, cellphone). Neither is the same as a personal network
  • Homophily has been shown to play out in all of these in interesting ways
  • “Tie” strength — the person I list as my top friend may be there because of politics; just because I talk to my collaborators more than my mom does not mean that they
  • No one loves big data more than marketers and no one misunderstands big data more than marketers [coke/coca-cola and teens example – linked for a different reason than what coca-cola thought]
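
The sampling point above (a random sample of tweets is not a random sample of users) can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the population size, the share of heavy tweeters, and the tweet counts per user are hypothetical numbers chosen for illustration, not figures from the talk.

```python
import random


def heavy_user_share_in_tweet_sample(
    num_users=1000,       # hypothetical population size
    heavy_fraction=0.1,   # hypothetical share of heavy tweeters
    heavy_tweets=100,     # hypothetical tweets per heavy user
    light_tweets=1,       # hypothetical tweets per light user
    sample_size=2000,
    seed=0,
):
    """Return (heavy share among users, heavy share among sampled tweets)."""
    rng = random.Random(seed)
    num_heavy = int(num_users * heavy_fraction)

    # One list entry per tweet; the entry is the author's user id.
    # Users 0..num_heavy-1 are the heavy tweeters.
    tweets = []
    for user_id in range(num_users):
        count = heavy_tweets if user_id < num_heavy else light_tweets
        tweets.extend([user_id] * count)

    # Uniform random sample of tweets, not of users.
    sampled_authors = rng.sample(tweets, sample_size)
    heavy_in_sample = sum(1 for author in sampled_authors if author < num_heavy)

    return num_heavy / num_users, heavy_in_sample / sample_size


if __name__ == "__main__":
    user_share, tweet_share = heavy_user_share_in_tweet_sample()
    print(f"heavy users as a share of all users:      {user_share:.2f}")
    print(f"heavy users as a share of sampled tweets: {tweet_share:.2f}")
```

Under these assumptions the heavy tweeters are a small minority of users but write most of the sampled tweets, so conclusions about "users" drawn from the tweet sample would be skewed toward them. The sketch ignores the other biases listed above (multiple accounts, people who read without an account).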

What and Why are not the same thing!

  • Cobot + LambdaMOO
  • Frequency is not tie strength (drama over misinterpretation)
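
A toy sketch of why frequency is not tie strength: ranking contacts by how often they appear in an interaction log can give a different order than ranking them by how close a person says they are. The log entries and closeness scores below are hypothetical, invented for illustration only.

```python
from collections import Counter


def rank_by_frequency(interactions):
    """Rank contacts by how often they appear in an interaction log."""
    counts = Counter(interactions)
    return [name for name, _ in counts.most_common()]


def rank_by_reported_closeness(closeness):
    """Rank contacts by a self-reported closeness score (higher = closer)."""
    return sorted(closeness, key=closeness.get, reverse=True)


# Hypothetical behavioral data: who this person exchanged messages with.
interaction_log = [
    "collaborator", "collaborator", "collaborator", "collaborator",
    "collaborator", "friend", "friend", "mom",
]

# Hypothetical self-reported closeness on a 1-5 scale.
reported_closeness = {"mom": 5, "friend": 4, "collaborator": 2}

if __name__ == "__main__":
    print("by frequency:", rank_by_frequency(interaction_log))
    print("by closeness:", rank_by_reported_closeness(reported_closeness))
```

The behavioral log says what happened (who was contacted most); it does not say why, and it does not recover the ranking the person would give.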

Interpretation

  • Fallacy that qualitative research is interpretation and quantitative research is about producing facts
  • Robin Dunbar’s work -> you could only keep up with gossip of 150 people max. Friendster interpreted this as saying people would have no more than 150 friends and capped it there
  • Hardest part of research is interpretation – it’s why social scientists are focused on methodology

The #1 threat to privacy today is our focus on big data

We have perverted the notion of public … [find this quote]

Just because data is accessible doesn’t mean that using it is ethical — privacy is contextual — people trust each other to maintain context and they feel violated when this is broken

“These walls have ears” — Chaucer (1387)

People believe that they understand the context in which they are operating and they get upset when this changes – FB – technology DOES have eyes and mouths

Big data isn’t arbitrary data – it is data about people’s lives — the process of sharing and using

* security through obscurity is reasonable

  • how we act in public spaces depends on context
  • often ephemeral
  • surveillance cameras capture cartwheels
  • people change behavior when they know that they are being recorded
  • mediated situations like FB but amplifying is different — people are developing skills
  • but when we keep changing the context people get confused — people’s encounters with social systems
  • civil inattention [get this quote] you may be able to stare at everyone who walks by, but you don’t. why is it ok to demand the right to stare at everyone online just because you can?

* not all publicly accessible data is meant to be publicized

  • using a sense of obscurity for context
  • paparazzi make lives of celebrities hell
  • when we argue for the right to publicize any data that are accessible, we are arguing that everyone should live the life of a celebrity in this sense
  • psychological consequences?
  • PII – personally identifiable information
  • PEI – personally embarrassing information
  • aggregating and distributing data out of context is a privacy violation

* Privacy is not access control

  • limiting access can be one tool but it’s not the same as privacy

* challenging questions

  • “should we?”

* publicity

  • hard to distinguish between content meant to be aggregated and that which isn’t (context)

* data are people

FACEBOOK

  • Built its reputation on being a closed system
  • Interpreted by the public as “anti-myspace” – narrated as closed, intimate – “safer” because it was more private
  • First impressions matter
  • To this day, many of the average FB users I talk to believe it is about privacy
  • Students Against Facebook News Feed -> the content was already accessible, but News Feed took implicit content and published it as explicit content. It changed the context and people wanted to opt out. It changed people’s behaviors; they thought differently about how … those who joined after 2006 have different norms
  • Beacon – 2007 advertising messages; made individuals advertisers for their friends; opted in by default; not as visible as News Feed, so people only learned about it when something went wrong; dismantled, and the lawsuit settled
  • Last year – “invite” to change privacy settings — the default choice was /everyone/ — for most people, they just clicked through. FTC challenged. 65% of users made their content publicly accessible (researchers know most people don’t change defaults)
  • I asked people to describe their privacy settings and then we went through their settings. Not a single person I interviewed had a mental map that
  • clueless, confused and outright screwed (proverbial boiling frog)
  • last week’s announcement – it relies on the changes from FB (a form of trickery)
  • you have to opt-out of each individual partner site on FB and the partner site
  • I spent six hours trying to figure out how to turn off like for politicians

CONTEXT

  • opt-out is in the better interest of companies not people
  • people are engaging with FB for personal reasons — huge ethical issue/challenge

NOT ABOUT HIDING

Regulation – Lessig – Code

  • Changes are coming about because of changes in architecture, changes in code
  • Technology’s role as regulator has rapidly changed
  • Social norms haven’t radically changed
  • Law is an interesting player in all of this

GET INVOLVED
