code data development free speech security

Ethics in technology

This is an extension of a twitter thread I wrote in response to this tweet, and thread about the Cambridge Analytica revelations.

One of the key modern problems is how easy it is to access these tools. You don’t need professional training to string these together.

It’s as dangerous as if someone invented a weapon that could kill 10s or 100s of people, light enough to carry anywhere, and available in any store, without training. And expecting owners to police themselves.

People are terrified of AI. We know we don’t need AI to disable hospitals. We don’t need AI to intercept Facebook logins (although FireSheep and the pineapple are less effective now). We don’t need AI to send a drone into a crowded market.

Make a website the only place for government applications, such as medicare or millennials railcards and it’s easy to remove access for all citizens.

But combine all that with data and you can fuck up someone’s life without trying. You can give 2 people the same national insurance number or other id. You can flag them on the no fly list.

You can encode prejudice into the algorithm and incarcerate someone because they grew up in a black neighborhood.

The algorithm is God. The algorithm is infallible. Trust the algorithm.

Even when it tells you someone is more capable than the humans says she is, and punishes them.

(unless you’re under GDPR where you have the right to question the algorithm)

But tell anyone that people will use data for purposes they hadn’t considered (like using RIPA anti-terror legislation to see if someone’s in the school catchment area) then you’re paranoid.

Be paranoid. People will always stick crowbars in the seams. Whatever your worst case scenario for your code is, you’re probably not even close.

You can see my original tweet, and the repies, here:

The Guardian has a great interview on AI, existential threats and ethics on their podcast here.

data security

Privacy is not your only currency 

If you’re not paying, you’re the product.

But you’re not. In security, we talk about 2-factor authentication, where 2 factor is 2 out of 3 : who you are, what do you know, and what do you have. Who you are is the product, a subset of a target market for advertising, or a data point in a data collection scoop. The former requires giving up privacy, the latter less so.

Advertising is about segmenting audiences and focusing campaigns, so views and clicks both matter, to feed into demographics and success measures. Ad blocking is a double whammy – no ads displayed, and no data on you. Websites tend to argue that the former deprives them of revenue, many users argue that the latter deprives them of privacy.

What you have is money, and who you are is part of a demographic than can be monetised in order to advertise to you to get your money.

But what else do you have? If you’re on the web you have a CPU that can be used to compute something, whether it’s looking for aliens or looking for cancerous cells. If you’re happy to give up your CPU time.

Who else are you? You might be an influencer. You might be a data point in a freemium model that makes the premium model more valuable (hello, LinkedIn).

What do you know? If you’re a human you know how to read a CAPTCHA (maybe), you could know multiple languages. Maybe you know everything about porpoises and you can tell Wikipedia.

Your worth to a website isn’t always about the money you give them, or the money they can make from selling your data. It’s the way we’ve been trained to think, but there’s so much else we can do for value.

code data programming ux

#dunddd Analyse This : The dangers of big data

Thanks to everyone who came to my DunDDD talk. Lots of interesting questions, although I’m not a lawyer so couldn’t answer them all.

If you want the slides, with references in the notes, you’ll find them here. All the images are creative commons, and you can use the sides yourself under CC by Attribution. Link to slides : Dunddd Analyse This – The Dangers Of Big Data (Google Drive)

If you missed the talk, the arguments I made and the references, apart from the privacy sections, are in this

Link to previous post

If you want the references for the Personal Data and anonymisation parts, have a look at these :

AOL searches are not private

IBM privacy-preserving data mining