data free speech programming security

The uncrackable back door : The intersection of mathematics, AI, politics and ethics

The following is a lightly edited conversation I had with a tech-savvy friend who is not in IT. It was about the FBI trying to break the encryption on an iPhone so they could access potential information on criminal activity, but in light of the UK government seeking to add backdoors to all messaging platforms, for much the same reason, I thought it was a good time to revisit the arguments.

My friend’s comments are quoted, and the unquoted text is mine.

Imagine a technology existed that let you send things via post and to you and everyone else it looked like an envelope, but to the NSA it looked like a postcard, and they could read everything.

How does the NSA prove it’s them? How can we trust them? What if the FBI or your local police force can pretend to be the NSA? Couldn’t criminals, or your stalker ex do it too?

Maths doesn’t deal with legal balance. Either you let everyone in, or you let no one in. That’s the political choice. Is getting access to this phone more important than keeping other governments, such as China or North Korea out of phones they are interested in?

I don’t know if it’s an all or nothing situation though… are we saying that the courts shouldn’t be able to force entry into criminals’ data? Or are we saying that all data should be accessible to all outside existing privacy laws?

Think of the Enigma code. Once it was broken, Bletchley Park knew most of what the military was doing. If the Nazis knew it was broken, they’d have stopped using it, and all the work would have been for nought.

Enigma is a great example of why the code needed to be broken in the first place. That’s a chicken and egg scenario. But also a really interesting point! What if an iPhone is enigma, and say GCHQ cracked it. Would the evidence be allowed in court?

Is it not a case of Apple granting access to specific phones, rather than the FBI being given the technique to do so?

What I’m worried about is the fact that big companies could hold justice and common law to ransom: that to me is just as worrying as big brother, if not more so. We can “elect” governments, and they can pass legislation to create international privacy agreements (which is what Snowden’s revelations led to). We can’t elect Apple, and I detest how Apple seem to be influencing justice; that is a very, very bad sign of things to come.

Don’t even get me started over how data protection doesn’t exist between companies any more. Logon via Facebook anyone?

Is it not the case that Apple can access all this data anyway? So does Apple not have an ethical responsibility to disclose evidence for an individual case that has a court request attached to it? Guess not. Is that an appropriate level of power a company should have? To dictate what can and can’t be shared with courts?

Corporations already have too much power in the world. By not establishing a legal framework for when it is appropriate for a court order to be issued and access granted (e.g. to break and enter), we are basically letting sometimes serious criminals have a get out of jail free card. And that includes tax dodgers like Apple.

Apple can’t access the data at the moment, that’s the point. It only exists on the phone, encrypted with a key that’s password protected with a password only known to a dead guy.

Interesting. So none of his data was stored on Apple’s / 3rd party servers and it was all encrypted on the phone? What about all his comms traffic?
If I encrypt my (ah hem) Google Android phone, does that mean that my emails can’t be viewed by Google?

A lot of this comes down to trust. I don’t trust our govt nor the govt of others, but equally I don’t trust Google or Apple.

He switched off iCloud sync so it was all on his phone. However, as it was government issue, they could have changed that via policy if the FBI hadn’t tried to change the iCloud password, and hence locked the phone out of the government domain.

So they got locked out. That’s hilarious.

What I tend to do these days is try to remove my mind from the broader political implications and think about things at a ground level. Then I thought… what if a phone contained information related to the death of my loved one? Then I realised there should be a controlled process in place to retrieve data legally and transparently.

I think the broader implications are important. If they can do it here, where else would it apply?

We have to think of real world scenarios : a murder in Glasgow, a child missing, that type of thing

Look at councils using anti-terror legislation to catch petty criminals, or DSS using it to follow people on benefits.

Imagine an encrypted padlock to a cabinet containing murder weapons.

Who watches the watchmen?

That’s conspiracy speak Craig. If we don’t trust the courts… then who can we trust?

It’s recorded activity. It’s not conspiracy if it actually happened.

Courts are separate from government. They have been in Scotland since 1748.

I trust the courts. The problem is that many of these powers bypass the courts.

DSS is rarely a court matter.

Yes, but they are doing so illegally and that’s why new laws are coming in

And a backdoor for one is a backdoor for all. If the FBI have a post-it note with the pin for that murder weapon safe, it only takes one photo for everyone to have access.

The FBI is not the UK. We cannot control what Israel does but what we can do is create controls for the UK. so… if my loved one is killed, and there are photos on the phone.. then of course the police should have access! It’s a no brainer

True, so why would we want a situation that increases the risk of Israel, or North Korea, having the means to access something that sensitive?

What’s sensitive exactly? They don’t care about normal users!

Even if it means Journalists at News Of The World can also gain access to those photos?

That’s illegal! As is breaking and entering.

It didn’t stop them last time.

Yes.. and look what’s happened.

They renamed it to the Sun on Sunday, and carried on as normal?

Come on…. I’m saying that only the courts can have access.

Being illegal doesn’t stop things from happening. That’s why we lock our doors and fit burglar alarms.

and besides… they cracked the iPhone anyway!

That’s not how maths works.

Life isn’t maths. Life is ethics. Ethics are not maths

Yeah, there’s an Israeli company that will break into iPhones for anyone who pays.

What Israel does is up to them.

No, but encryption is maths.

But retrieving data is an ethical issue. It’s not black and white. It’s about appropriate use of powers

Like knowing when to put someone away for life, or releasing them in 10 years

It would not be acceptable for police to hack my phone without just cause, but it would be acceptable if they suspect me of plotting a terrorist act.

I agree, but when access to the data cannot be done without compromising everyone’s security, we have to ask where to draw the line?

We draw the line through the law.

CCTV inhibits crime in those areas, but we accept that it’s creepy to allow it in bathrooms.

Exactly. …There are laws regarding the use of CCTV

And many offices do not have CCTV inside because the risk of losing sensitive data is higher than the risk of crime.

You can only film in your property. That’s the law. But.. of course there is a difference between private companies and local government. And that’s where PFI come in….

Plenty of public CCTV as well

Not here there isn’t

Depends where you are, agreed.

There’s a camera on the bus.. I think, and at the primary school, maybe one in the shop…. but I don’t think big brother is watching when they can’t find muggings taking place at the Broomielaw!

That’s about effectiveness though.

Google is the one to watch

And Facebook

Yeah… but Facebook has countless terrorist pages funnily enough. So they can’t even monitor effectively. Let alone GCHQ.

Depends who has the most effective Algorithms. We don’t know what GCHQ is capable of. Just ask Snowden.

You know fine well it’s not about monitoring – it’s about textual analysis – patterns – heuristics. GCHQ is trustworthy. I have no problem with them whatsoever.

That’s cos you’re not Harriet Harman, or a union activist.

I really don’t, maybe I am naive, but I’m not scared. If I want to disconnect all I have to do is switch off the router and remove my sim
oh and stop using my bank card
and then become a missing person…

Not GCHQ, but …the police faced hard questions about covert monitoring of Jeremy Corbyn and other MPs

Well that’s not surprising. This has nothing to do with encrypted phones.

That security services were monitoring privileged conversations of individuals not suspected of criminal activity?

Does that come as a surprise? They may as well just have attended a meeting.

No. But it shows trusting the courts is naive when it comes to backdoors

Attending a meeting is enough to put you on a watchlist.

This is not the same as getting access to evidence for a crime that has taken place. If you want secrecy, you can meet in the woods. It’s very simple…

Sorry, but I do trust our system of justice.. I don’t necessarily trust the government and I certainly believe that there should be watertight controls that allow for breaking and entering into criminals’ data. And that includes data from corrupt politicians. It works both ways.

Digital forensics is a thing… with impossible encryption the whole thing falls down

Now… I like encryption… especially for B2B, but Apple are not gods! And private companies should never be above the law. If we let private companies rise above the law, we will be in a much worse situation than we are now… it’s already bad enough with tax avoidance.

It’s not about being above the law. It’s about a clear standard, and if police don’t have evidence to hand, they have to collect it. Sometimes cameras are broken. Sometimes weapons are lost, and sometimes you can’t get access to encrypted data.

They can only legally collect evidence if they have sufficient knowledge of a criminal activity.

And they have ways to set up intercepts in those cases, without physical access to the phone

Further Reading

Bill Gates says Apple should unlock the iPhone for the FBI

Feds ordered Google’s help unlocking nine Android phones since 2012

Troy Hunt: Everything you need to know about the Apple versus FBI case

Apple’s FBI Battle Is Complicated. Here’s What’s Really Going On

Continuing the Conversation About Encryption and Apple: A New Video From Mozilla

Encryption keeps us safe. It must not be compromised with ‘backdoors’ | Robby Mook

Open rights group: who’s checking on your chats in private online spaces?

development security

Flatter Data

I was watching The Verge summary of The Selfish Ledger, Google X’s thought experiment on what your personal data could do in the future. I started to think about Flatland.

Flatland is a book by Edwin A Abbott about dimensions. In the book, A Square lives in a 2D world, with other 2D shapes, and tries to comprehend the universe when 3D shapes start turning up, but A Square can only comprehend them in slices or shadows/projections.

See this video by Carl Sagan if you want to know more.

The personal data organisations see of us is like the circles projected in Flatland. Google sees the videos I like and the technologies I search for help on. HMRC sees my income, savings, and charitable giving. NHS sees my health.

Companies make decisions on this data, and, like the flatlanders, generalise from the pink circles they see. Sometimes that accurately reflects the brown circles, oftentimes, not. Sometimes what looks like 2 circles is a pair of legs, and what looks like one circle is actually a group hug.

I don’t want companies to disambiguate that. I endorse the spirit of GDPR, that data should only be given up with informed consent (absent the usual rights exemptions for criminals who violate the rights of others).

For those of us who work in tech, we need to embrace the ambiguity, and help users and other data subjects understand how they have been categorised. Let them embrace anonymity via randomisation, such as number variance data masking.
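As a rough sketch of what number variance masking can look like, assuming a plain list of numeric values: each figure is shifted by a random percentage, so individual records are blurred while the overall shape of the data survives. The helper name `mask_with_variance`, the sample salaries and the 10% range are illustrative assumptions, not a standard.

```python
import random


def mask_with_variance(values, max_variance=0.10, rng=None):
    """Return a copy of values, each shifted by a random amount of up to
    +/- max_variance, expressed as a fraction of the original value."""
    rng = rng or random.Random()
    return [v * (1 + rng.uniform(-max_variance, max_variance)) for v in values]


# Hypothetical sample data, purely for illustration.
salaries = [24000, 31000, 45000, 52000]
masked = mask_with_variance(salaries, max_variance=0.10, rng=random.Random(42))

# Individual figures no longer match the originals...
for original, shown in zip(salaries, masked):
    print(f"{original} -> {shown:.0f}")

# ...but the averages stay within the same variance range of each other.
print(sum(salaries) / len(salaries), sum(masked) / len(masked))
```

How much variance is enough depends on the data. An unusual value shifted by a few percent may still be recognisable, so this is a starting point rather than a guarantee of anonymity.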

You never own someone else’s data, you merely look after it for as long as they let you. It’s not about privacy. It’s not about data. It’s about trust. It’s about ethics.

development security

How much data can you lose before you’re in trouble?

Ransomware is a very aggressive attack. Whilst many espionage operations are about sneaking in and copying data without your knowledge, ransomware hits you over the head with a hammer to let you know you’ve lost your data. It’s not theft, it’s extortion.

The big pro is that at least you know you’ve been breached, and the form of attack means that whilst you might not have access to your data, the bad guys might not either.

But you’ve got a good backup strategy, right? You can roll back the data to a known good point in history, and maybe even roll forward your changes from there.

But maybe it doesn’t matter. Maybe you can run your business just as effectively without that data, or those templates. Maybe you shouldn’t be keeping that data at all?

If you have data you need, distribute it. Secure it, but decide if the greater risk is you losing access to the data, or someone else gaining access.

If you have data you don’t need, Don’t store it.

development security

Primer : A tech view of GDPR

I was fortunate enough to attend an event at The Data Lab in Edinburgh today on the new General Data Protection Regulation, coming to the EU and the UK. There were 4 talks from a variety of angles, but for me the key takeaways were that the primary thrust of the regulation is about prevention rather than cure, and auditing and control rather than additional technical implementations, aside from the Data Portability clause.

Best practice still applies. Collect only the minimum data required, and don’t collect personal data unless you have to. Encrypt your data, in transit and at rest. Privacy should be the default, and only extended by informed choice.

But you need a data breach policy. An email to Troy Hunt might be OK if it’s a hobby project that was breached, but you need to notify data subjects and users if there is a breach, and you need the security policies and audits to protect you if the lawsuits start flying.

I’m not a lawyer, so I won’t offer advice there. But as you’re designing your systems, now’s the chance to audit, prepare and secure. Don’t be the first high-profile fine under the new rules.

[Images: february 14 2017 at 0237pm, dsc 0437, dsc 0438, dsc 0439]

data security

Privacy is not your only currency 

If you’re not paying, you’re the product.

But you’re not. In security, we talk about 2-factor authentication, where 2 factor is 2 out of 3: who you are, what you know, and what you have. Who you are is the product, a subset of a target market for advertising, or a data point in a data collection scoop. The former requires giving up privacy, the latter less so.

Advertising is about segmenting audiences and focusing campaigns, so views and clicks both matter, to feed into demographics and success measures. Ad blocking is a double whammy – no ads displayed, and no data on you. Websites tend to argue that the former deprives them of revenue, many users argue that the latter deprives them of privacy.

What you have is money, and who you are is part of a demographic that can be monetised in order to advertise to you to get your money.

But what else do you have? If you’re on the web you have a CPU that can be used to compute something, whether it’s looking for aliens or looking for cancerous cells. If you’re happy to give up your CPU time.

Who else are you? You might be an influencer. You might be a data point in a freemium model that makes the premium model more valuable (hello, LinkedIn).

What do you know? If you’re a human you know how to read a CAPTCHA (maybe), you could know multiple languages. Maybe you know everything about porpoises and you can tell Wikipedia.

Your worth to a website isn’t always about the money you give them, or the money they can make from selling your data. It’s the way we’ve been trained to think, but there’s so much else we can do for value.

code data development programming security

Enforcing ethics

I was reading IOT: Code of Ethics for Software Developers and Engineers – Secret Microsoft Communications – Site Home – MSDN Blogs today and it got me thinking about the Botnet of Things, but more importantly, about ethics in Professional Development, as covered in the DunDDD open discussions.

The MSDN blog covers an ethical scenario well, so I don’t want to go over that again, but it got me thinking about something that I’ve been asked to do a few times, that takes the idea one step further.

I’ve been involved in a number of projects that handle sensitive data, particularly data on children, data on prisoners and sensitive financial data, so data protection is key to much of what I have built. In order to illustrate some of the additional ethical considerations when dealing with data, I’m going to discuss a scenario that doesn’t relate to a specific client, but covers many of the decisions that I have had to deal with, and I hope is a scenario familiar to many of you.

The ethical workflow

Consider an accountancy firm, with many clients. As a result of this, time tracking is very important to their business, so that they can bill clients appropriately. The scenario I want to present considers the timesheet software in use. At a basic level, there is a client code, a number of hours per day booked to that client, and an approval system so that the hours are checked following submission, before any invoices are sent out.

In addition, the timesheet software records overtime, and each user’s financial details, so that it can correctly pay each employee each month.

The software solution

The data entry portion validates that as a user, I only have access to a subset of the client codes, and that I can only book my contracted hours to standard codes. The workflow ensures that a manager, as someone authorised to check work for a given client code, can authorise my time. The workflow also ensures that invoices cannot be generated until the time has been authorised.

This workflow is similar to many in systems I have designed. There is a validated data entry, which prevents the workflow from starting if the data entered is obviously incorrect, and a workflow that ensures the data is checked before it is used in a process with financial impact.
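A minimal sketch of that shape, assuming one timesheet line per client code and hypothetical names throughout (`User`, `Timesheet`, `submit`, `authorise`, `generate_invoice`); it is not the design of any real system. Validation stops the workflow starting on obviously bad data, and invoicing refuses to run until the time has been authorised.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    AUTHORISED = "authorised"


@dataclass
class User:
    name: str
    contracted_hours: float                      # hours per day
    client_codes: set = field(default_factory=set)


@dataclass
class Timesheet:
    owner: User
    client_code: str
    hours: float
    status: Status = Status.DRAFT


def submit(sheet):
    """Validated data entry: refuse to start the workflow on bad data."""
    if sheet.client_code not in sheet.owner.client_codes:
        raise ValueError("user may not book time to this client code")
    if not 0 < sheet.hours <= sheet.owner.contracted_hours:
        raise ValueError("hours must be within contracted hours")
    sheet.status = Status.SUBMITTED


def authorise(sheet, manager):
    """Workflow step: a manager with access to the client code signs off."""
    if sheet.status is not Status.SUBMITTED:
        raise ValueError("only submitted timesheets can be authorised")
    if sheet.client_code not in manager.client_codes:
        raise PermissionError("manager is not authorised for this client code")
    sheet.status = Status.AUTHORISED


def generate_invoice(sheet, hourly_rate):
    """Financial impact: only allowed once the time has been checked."""
    if sheet.status is not Status.AUTHORISED:
        raise ValueError("time must be authorised before invoicing")
    return sheet.hours * hourly_rate


# Hypothetical usage.
alice = User("alice", 7.5, {"ACME"})
bob = User("bob", 7.5, {"ACME"})
sheet = Timesheet(alice, "ACME", 6)
submit(sheet)
authorise(sheet, bob)
invoice_total = generate_invoice(sheet, 100)
```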

Ethical trapdoors

To truly be an ethical developer, you need to consider both the implicit and the explicit ethical considerations within the requirements, and the behaviour of the less ethical users, who may attempt to subvert the ethical process either due to malice, or laziness, or a myriad of other reasons.

Manager, authorise thyself?

Hopefully, the first potential ethical problem with this workflow is obvious to you: I have yet to mention any restriction on who can authorise a timesheet. Should the user entering the timesheet also be a user authorised to access the client codes on the timesheet, they will be authorising themselves, offering no additional protection.

It might be the case that the user has been given authorisation because they have proven that they maintain high ethical standards, and would therefore be less likely to cook the books. If you believe in people over process, this might lead you to think this way. If, however, time pressures on individuals are such that the authorisation time is limited, there may be scenarios where a user would limit their diligence, increasing the chance of deviation between the recorded and actual figures. There may also be unethical figures who are able to provide the facade of ethical competence in order to get the authorisation required.
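If you favour process over people here, the rule is small enough to write down. This is only a sketch with a hypothetical `can_authorise` function and made-up identifiers: the approver needs access to the client code, and must not be the person who entered the time.

```python
def can_authorise(submitter_id, approver_id, approver_client_codes, client_code):
    """Separation of duties: nobody signs off their own timesheet,
    even if they hold access to the client code."""
    if approver_id == submitter_id:
        return False
    return client_code in approver_client_codes


# Hypothetical usage: alice has access to ACME but entered the time herself.
print(can_authorise("alice", "alice", {"ACME"}, "ACME"))  # False
print(can_authorise("alice", "bob", {"ACME"}, "ACME"))    # True
```

Whether a rule like this belongs in the software or in the surrounding business process is a judgement call; the code only makes the choice explicit.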

Data leaks

Certain clients will be sensitive, either by means of celebrity, or association with staff, such as ex-husbands. Whilst their records will be recorded with sensitive data, satellite systems, such as invoicing and time tracking, may not be aware of their sensitivity. So, to ensure anonymity and enforce ethics via obscurity, the client codes should never leak information, either directly or indirectly (i.e. direct the user to an external resource that might contain sensitive information that can be exploited), and should only be visible to users with a valid reason to see them.
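One way to keep client codes from leaking anything is to make them random rather than derived from the client's name. The sketch below assumes an in-memory registry and a hypothetical `ClientCodeRegistry` class; a real system would sit on its own storage and access control.

```python
import secrets


class ClientCodeRegistry:
    """Maps opaque codes to client records; only the registry knows the link."""

    def __init__(self):
        self._clients = {}

    def register(self, client_name):
        # Random code: carries no initials, dates or other hints.
        code = "C-" + secrets.token_hex(4)
        while code in self._clients:
            code = "C-" + secrets.token_hex(4)
        self._clients[code] = client_name
        return code

    def lookup(self, code, user_codes):
        """Reveal the client only to users with a valid reason to see the code."""
        if code not in user_codes:
            raise PermissionError("no access to this client code")
        return self._clients[code]


# Hypothetical usage.
registry = ClientCodeRegistry()
code = registry.register("Famous Person Ltd")
print(code)                            # something like C-9f1c2ab4
print(registry.lookup(code, {code}))   # Famous Person Ltd
```

Satellite systems such as invoicing and time tracking then only ever see the opaque code.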

Software supports the business

Ultimately, the software exists because the business needs it. So the ethical decisions sit within those guidelines. The software can’t do everything, so the external processes have to be considered, and questioned where they allow ethical breaches that the software cannot counter. We have a duty to recognise the limits of where our software can enforce ethical behaviour and document these limits so that our customers can adapt or strengthen their processes appropriately. We also have a duty to challenge requirements and requests that violate our ethics, or the ethics our clients declare they follow.

code data development programming security

Personal Identity in a digital world

In the light of more data breaches, especially highly personal data of the form held by a certain affair website, I wanted to revisit the 4th Rule of network security. If you don’t trust encryption, you store as little as possible which implies YAGNI for data storage. There’s a few other benefits you get as well, in simplifying your storage. In this post I want to focus on the simplest and most obvious personal identifier, your name. And I’ll assume you’ve all read Falsehoods Developers Believe About Names, so if you haven’t yet, read it now.

What’s in a name, or a person?

At its basic level, a name is a poor identifier. Going by Google and some misplaced emails, I know of at least 3 other people in Scotland, 1 in Australia and 2 in USA who share my name. So my name is not sufficient to uniquely identify me, which is why I have a number of tracking references for the NHS, National Insurance (for Americans, think Social Security Number), my employer and others, and why it’s hard to get my name on social networks or email providers unless I get in early.

Do you need to know anyone at all? Is a tracking id enough?

Given that your name is not sufficient to uniquely identify you (and therefore is also harder to verify), is it even necessary? For many sites, and apps, not even a login is necessary or desirable to users, so advertisers, content providers and others often just use a tracking cookie or similar to identify you again the next time they see you.

If they need a login is a username and password enough?

OK, but your software needs to verify users and store data about them. You’ve got a good authorisation and authentication story to allow users access to their own stuff and no one else’s, without permission.

Do you still need their name? If you are a music site tracking my favourite videos, why do you care who I am? And if you’re built on a shoestring budget, why would you want the hassle of securing passwords when you can grab an OAuth library and not have to worry about it at all? Some of them will give you a name, some won’t. What purpose does storing a name, real or otherwise, serve?

Do you actually need to split names into first and last?

OK, so you actually need someone’s name, how are you going to store it? A name, a first and last name, middle names? If you are splitting a name and then littering your code (or at least your CPU time, if you’re resharing code) with concatenations to display those names, then you may be doing it wrong. And who’s to say that all your users have a first and a last name? (see common misconceptions about names)
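A small sketch of the alternative, assuming you only ever need to display or greet: store the name once, exactly as entered, with an optional preferred name. The `Person` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    user_id: str                            # the real identifier
    full_name: str                          # stored exactly as the user typed it
    preferred_name: Optional[str] = None    # what they want to be called, if given

    def greeting(self):
        # No splitting and no re-joining of first/last parts.
        return f"Hello, {self.preferred_name or self.full_name}"


print(Person("u1", "Björk").greeting())
print(Person("u2", "María José Carreño Quiñones", "María José").greeting())
```

If a downstream system demands separate first and last names, that is a requirement to question before it shapes your whole data model.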

What about titles?

So you need a name, and you are sure you can split it. What else do you need? Do you need a title? Does it matter if your drop-down asks for Mr or Mrs or Miss? And by the way, what about Ms and Dr, and Prof, and can you handle Mx now, or do you discriminate? And by the way, how easy is it for the user to change those fields, as required by law?

What other details are you storing that you don’t need? Addresses? Country? Phone number? Email address? Credit Card number? Mother’s maiden name? Name of first pet?

Why does free train WiFi need my gender and age?

Or any website? And why are those fields mandatory?

As a user, what is my motivation for giving you those details? The more information I give you, the more I have to trust you. And if I don’t trust you, you don’t get my data. If I know why you need it, there’s more chance of you getting my data. If I don’t know why you need it, I will assume you’re selling it, and I may think that even if I know why you need it.

Don’t ask me to trust you, and you won’t be disappointed.

What are you storing because you can, rather than because you need to?

Is your data a security risk, a performance risk, a trust risk? Or can you justify everything, and point to the requirement that details that justification?
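One way to make that justification concrete is to refuse to store any field that has no documented reason to exist. This is a sketch only; the field names and requirement IDs in `FIELD_JUSTIFICATIONS` are invented for illustration.

```python
# Every stored field must point at the requirement that justifies it.
# The fields and requirement IDs below are hypothetical.
FIELD_JUSTIFICATIONS = {
    "email": "REQ-12: needed to send password resets",
    "display_name": "REQ-3: shown on the user's own profile",
}


def minimise(submitted):
    """Keep only the fields we can justify; silently drop the rest."""
    return {k: v for k, v in submitted.items() if k in FIELD_JUSTIFICATIONS}


form = {
    "email": "someone@example.com",
    "display_name": "Sam",
    "gender": "prefer not to say",
    "age": 42,
}
stored = minimise(form)
print(stored)                                   # only email and display_name remain
print(sorted(set(form) - set(stored)))          # fields that were never stored
```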

If you lost your data to hackers, what would your users be most concerned about being disclosed? Can you stop storing that data?

development programming

Botnet of things


The Internet of Things is the new hotness. It’s the source of Big Data, it’s the future of clothing and wearables and retail and your kitchen. It’s going to be everywhere. Says the hype. Smart watches. Smart fridges. Smart cars. Smart cities.

Part of me is excited, there’s a lot of possibilities, especially once you start hooking them together either with code, or via services such as if-this-then-that.

Stop for a minute though. Consider that we are talking about a heterogeneous collection of internet-connected devices in your house, on your body, on your commute, gathering a lot of data on you and controlling things around you so you don’t have to.

So what happens when they’re not controlled on your behalf?

These devices have access to:

  • Your WiFi password
  • Your connected services
  • Whatever their sensors pick up (audio, visual, etc)
  • Other devices

Some of them happily connect on unsecured channels.

They are updated according to manufacturer policy (see Android fragmentation and the WebView vulnerability to see how well that works out).

If you accept the 3 rules of network security, and choose not to trust the manufacturer, the cloud services and the network, and want to protect yourself, how do you isolate your threat but still allow the benefits of these devices? How do you isolate the rest of your devices or services if one gets compromised? How do you protect your future data if the services get compromised? How do you protect yourself if your network gets compromised?

Possible solutions:

  • IoT DMZ for WiFi – allow devices to access your WiFi via an authentication key rather than password (similar to one-time passwords for 2FA enabled sites), which only allows them to access an authorised list of sites, and not other nearby devices, managed by your phone/companion app?
  • Direct network connection (Ethernet over power) rather than WiFi
  • Non-personal connection (built-in 4G)
  • local data hub that relays the collected information across your local network to a service you choose
  • Bluetooth, or other close-range set up (or see ChromeCast, which broadcasts an SSID for phone to pick up, then switches to the WiFi you set up)
  • Quick list/disabling of connected services?
  • Token auth rather than password auth
  • Forced updates
  • Non-network updates (my TV allows USB or OTAerial firmware upgrades)
  • Don’t connect your smart device to the network
  • Decide you don’t need internet access on your car, or your fridge.
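To make the DMZ and token ideas from the list above a little more concrete, here is a sketch of the policy check such a gateway might apply. Everything in it is an assumption for illustration: the `IoTGateway` class, the device IDs and the hostnames are hypothetical, and a real implementation would live in the router or firewall rather than in application code.

```python
import hmac
import secrets
from dataclasses import dataclass


@dataclass
class DevicePolicy:
    token: str                 # per-device key, so the WiFi password is never shared
    allowed_hosts: frozenset   # the authorised list of sites for this device


class IoTGateway:
    def __init__(self):
        self._policies = {}

    def enrol(self, device_id, allowed_hosts):
        """Issue a per-device token tied to an allowlist of hosts."""
        token = secrets.token_urlsafe(16)
        self._policies[device_id] = DevicePolicy(token, frozenset(allowed_hosts))
        return token

    def revoke(self, device_id):
        """Quick disabling of one device without touching the others."""
        self._policies.pop(device_id, None)

    def may_connect(self, device_id, token, host):
        policy = self._policies.get(device_id)
        if policy is None:
            return False
        # Constant-time comparison of the presented token.
        if not hmac.compare_digest(policy.token.encode(), token.encode()):
            return False
        return host in policy.allowed_hosts


# Hypothetical usage.
gateway = IoTGateway()
fridge_token = gateway.enrol("fridge", {"updates.fridge.example"})
print(gateway.may_connect("fridge", fridge_token, "updates.fridge.example"))
print(gateway.may_connect("fridge", fridge_token, "laptop.local"))
gateway.revoke("fridge")
print(gateway.may_connect("fridge", fridge_token, "updates.fridge.example"))
```

Because each device has its own token and allowlist, a compromised device can be cut off on its own, and it never learns anything that lets it reach its neighbours.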

If you aren’t scared enough yet –

Cybercrime, the security of things :

And don’t forget to patch your car :

data development NMandelbrot

Is your CPU time there to be stolen?

Or can it be bartered? If you were given the choice between giving up cpu time, giving up privacy, or giving up money, to reimburse a developer for their time, which would you choose?

If you don’t feed us, do we not starve?

RT @ppinternational: uTorrent client is stealing your CPU cycles to mine #bitcoin

code data programming ux

#dunddd Analyse This : The dangers of big data

Thanks to everyone who came to my DunDDD talk. Lots of interesting questions, although I’m not a lawyer so couldn’t answer them all.

If you want the slides, with references in the notes, you’ll find them here. All the images are creative commons, and you can use the slides yourself under CC by Attribution. Link to slides : Dunddd Analyse This – The Dangers Of Big Data (Google Drive)

If you missed the talk, the arguments I made and the references, apart from the privacy sections, are in this

Link to previous post

If you want the references for the Personal Data and anonymisation parts, have a look at these :

AOL searches are not private

IBM privacy-preserving data mining