Documentation is just comments in a separate file. At least developers can see the comments when they change code. Tests are better comments. Tests know when they don’t match the code.
Infrastructure is the same. I can write a checklist to set up an environment, and write an architecture diagram to define it, but as soon as I change something in production, it’s out of date, unless it’s very high level, and therefore only useful to provide an outline, not detail.
Unit tests document code, acceptance tests document requirements, code analysis documents style and readability, and desired-state-configuration documents infrastructure. All of them can be checked automatically every time you commit.
Documentation as code means the documentation is executable. It doesn’t always mean it’s human readable; ARM templates in particular can be impenetrable at times. If machines understand it, the documentation can be tested continuously, repeated endlessly across multiple environments, reconfigured and redeployed at the stroke of a keyboard.
The more human you have in a process, the more opportunities for human error. It doesn’t remove mistakes, but it’s much easier to stop the same mistake happening twice.
Meltdown and Spectre shouldn’t change any of this. But now is a good time to think about it. Make 2018 the year you become paranoid about users, 3rd parties and other threats. The year is still young, but the exploits are piling up.
We’ve all experienced the performance implications of saving files. We use buffers, and we use them asynchronously.
Azure storage has Blobs. They look like files, and they fail like files if you write a line at a time rather than buffering. But they’re not files, they’re data structures, and you need to understand them as you need to understand O(N) when looking at Arrays and Linked Lists. Are you optimising for inserts, appends or reads, or cost?
I know it’s tempting to ignore the performance and just blame “the network” but you own your dependencies and you can do better. Understand your tools, and embrace data structures as persistent storage. After all, in the new serverless world, that’s all you’ve got.
Your cloud provider is a dependency. That makes it your responsibility. Each will give you features you can’t get on your own. They give you an ecosystem you can’t get from your desktop, and a platform to collaborate with others. They give you federated logins, global backups and recovery, content delivery networks, load balancing on a vast scale. But if the worst happens, know how to recover. “It’s in the cloud” is not a disaster recovery strategy, just ask the GitLab customers (although well played to them on their honesty so the rest of us can learn). Have your own backup. And remember, it’s not a backup unless you’ve verified you can restore.
It takes you 60 seconds to deploy to your current provider. How long does it take to deploy if that service goes dark?
In cloud thinking, it’s less about the O(N), it’s relatively easy to scale to the input, as long as you’re not exponential. In the cloud, it’s about O($) – how well does your code scale to the amount of money you throw at your servers (or inversely, what’s the code increase per user).
Fixed costs are vanishingly small in the cloud, but incremental costs can change quickly, depending on your base platform. Not the provider, as costs between them are racing to the bottom, but the platform of your architecture.
Quite simply, the more control you ask for, the more you pay in time, and the bigger the ramp-up steps. Get metal, and you’ll pay for everything you don’t use, get a VM and you can scale quicker to match demand, get a container, get an Electric Beanstalk or an Azure website and give up the OS, get a Lambda or a Function.
I can’t recommend to you how much you should abstract. I don’t know how big your ops team are, or how much computation you need to do for each user. I suspect you might not know either. Stop optimising for things you don’t care about. Optimise for user experience and optimise for cost per user, and measure everything.
But you’re not. In security, we talk about 2-factor authentication, where 2 factor is 2 out of 3 : who you are, what do you know, and what do you have. Who you are is the product, a subset of a target market for advertising, or a data point in a data collection scoop. The former requires giving up privacy, the latter less so.
Advertising is about segmenting audiences and focusing campaigns, so views and clicks both matter, to feed into demographics and success measures. Ad blocking is a double whammy – no ads displayed, and no data on you. Websites tend to argue that the former deprives them of revenue, many users argue that the latter deprives them of privacy.
What you have is money, and who you are is part of a demographic than can be monetised in order to advertise to you to get your money.
But what else do you have? If you’re on the web you have a CPU that can be used to compute something, whether it’s looking for aliens or looking for cancerous cells. If you’re happy to give up your CPU time.
Who else are you? You might be an influencer. You might be a data point in a freemium model that makes the premium model more valuable (hello, LinkedIn).
What do you know? If you’re a human you know how to read a CAPTCHA (maybe), you could know multiple languages. Maybe you know everything about porpoises and you can tell Wikipedia.
Your worth to a website isn’t always about the money you give them, or the money they can make from selling your data. It’s the way we’ve been trained to think, but there’s so much else we can do for value.
The Internet of Things is the new hotness. It’s the source of Big Data, it’s the future of clothing and wearables and retail and your kitchen. It’s going to be everywhere. Says the hype. Smart watches. Smart fridges. Smart cars. Smart cities.
Part of my is excited, there’s a lot of possibilities, especially once you start hooking them together either with code, or via services such as if-this-then-that.
Stop for a minute though. Consider that we are talking about a heterogeneous collection of internet-connected devices in your house, on your body, on your commute, gathering a lot of data on you and controlling things around you so you don’t have to.
Do what happens when they’re not controlled on your behalf?
These devices have access to:
Your WiFi password
Your connected services
Whatever their sensors pick up (audio, visual, etc)
If you accept the 3 rules of network security, and choose not to trust the manufacturer, the cloud services and the network, and want to protect yourself, how do you isolate your threat but still allow the benefits of these devices? How do you isolate the rest of your devices or services if one gets compromised? How do you protect your future data if the services get compromised? How do you protect yourself if your network gets compromised?
IoT DMZ for WiFi – allow devices to access your WiFI via an authentication key rather than password (similar to one-time passwords for 2FA enabled sites), which only allows them to access an authorised list of sites, and not other nearby devices, managed by your phone/companion app?
Direct network connection (Ethernet over power) rather than WiFi
Non-personal connection (built-in 4G)
local data hub that relays the collected information across your local network to a service you choose
Bluetooth, or other close-range set up (or see ChromeCast, which broadcasts an SSID for phone to pick up, then switches to the WiFi you set up)
Quick list/disabling of connected services?
Token auth rather than password auth
Non-network updates (my TV allows USB or OTAerial firmware upgrades)
Don’t connect your smart device to the network
Decide you don’t need internet access on your car, or your fridge.
In my post about developing in the cloud, I started promising a few nuggets about the project I was working on, and following my diversion onto talks and security, I’m ready to start discussing it again. The project itself is fairly straightforward. It was an excuse for me to try a realistic project in Node.js using cloud development tools, to see what was possible, and to decide if I wanted to use Node.js for anything more substantial.
Partly, I wanted to immerse myself in a completely callback-driven, “non-blocking” model, so that I can see how it affected the way I thought about the software I was writing, and also to see if I could see the benefits. I’ll expand on some of these thoughts as I talk more about it, but I wanted to start with my initial impressions.
What Node.JS does well
Firstly, from looking at a test example based on graphing data from this great read about green energy, it was clear to me that the promises of Node.js lie very much in the I/O performance side, rather than anything else. If you’re trying to do anything more complex than throwing as many bytes at the response buffer as possible, Node.js is probably the wrong tool. It’s not built for complex business logic. It is good if your server is a thin layer between a JS front-end and something specialised on the backend (or a series of specialist pipes on the backend – the web services support is top notch, as you’d expect from the language that gave us AJAX and JSON).
In the end, I got a proof of concept up over a weekend, which is about as long as I’d normally spend on The Mandelbrot Set, which is a nice quick result, and I got multiple users up on the site, which is more of a testament to Cloud 9 as it is to Node, but the resultant code had fewer features and messier code than the alternatives I’d written in other languages. It certainly felt more as if I was fighting against the flow than in previous incarnations, despite the refactoring tools and examples I had available, and it was a lot harder to keep the program flow in my head, since I had to ensure I understood the context that the callback was operating in : which paths could lead to the current code being executed and what I could rely on being set. I tried to trim the code back as much as possible, but I was still debugging a lot more than I was used to in other incarnations, despite more testing.
What Node.JS does to make you grind your teeth
At this point, I’ve written two proofs-of-concept in Node.JS, and whilst I think the 2 projects I tried weren’t the optimal projects for Node.JS, I’m getting a feel for what is can and cannot do. I can see places where I would use it if I was doing streaming or similar low-latency, high-throughput tasks that were just about routing bits, and I have watched it updating several clients at once, but it’s very easy to write blocking code that will slow the server, and has an instant impact on all clients, making them all hang until the blocking operation is completed. That type of error is not unique to Node.JS, but I found the chain of callbacks increasingly more difficult to reason about, making that type of error more likely. It feels like writing GOTOs.
At this point, I can’t see myself using Node.JS for anything other than a very thin layer to some business logic, and at that point, it seems odd to use a thin layer of web services just to call other services in another language. That’s what I’d do to start migrating out a legacy app, but I wouldn’t start a design that way. If I wanted to build a web service backend, there’s no benefit in Node.JS that I couldn’t get in WebAPI. However, I’m wondering whether my next project should be to re-write the backend in Go, to see if that’s any easier.
After my Post PC post, and with an interest in node.js I decided to see if it was now possible to develop a reasonably complex bit of software, with structure and tests, having nothing more than a Web browser installed. I looked at a few options but decided on cloud9 http://c9.io because it has a Chrome app, supports GitHub, BitBucket and Azure, and they are the custodians of the open source Web text editor formally known as Mozilla Skywriter, and all their server code is available on GitHub. They also give you a bash terminal, which makes git and mercurial feel at home far more than on a DOS prompt. As I will be making this code openly available, I have no privacy concerns with using the cloud, but if this was a commercial project, I may have different concerns, although, since Cloud9 is an open source project, I would be able to create a private install so I could use a netbook or tablet to write, compile and run code on a server.
My first view was that this was a pleasant surprise. I think with software like this, it is entirely possible to do web development on the web, with full support for most of what I do in my day job, up to deployment. Writing native software is still a few steps behind, although with projects like PhoneGap Build, there’s not much of the loop left to close.
As a UNIX developer by default (and a Windows developer by day), I found Cloud9 very familiar, and despite a few refresh bugs, I felt very productive, and was able to quickly code, build, unit test, and deploy to a temporary staging environment without having to learn anything new, creating shell scripts to help me out along the way, which was a great bonus as I was learning node.js. Unlike my laptop, it also has auto-save and hibernate, so if my connection fails, I don’t lose my edit, and can easily pick up where I left off.
Compared to my usual workday environment of Visual Studio + CodeRush, there’s a lot of features that I’m used to missing, such as many of the code templates and refactorings, but node.js needs a lot less typing than c#, so it’s less of a problem than it would otherwise be. It’s not a showstopper, but I do feel slightly at a loss when switching between them.
Going on this experience, I would say that the cloud is ready for developers, at least if you’re developing for the web, and you’re developing in the open. The usual caveats about cloud security and potential loss of services apply (keep a local copy of your repo if you want to guarantee you’ll always have it, for example), but the web definitely is now powerful enough to develop for itself, and that makes it a powerful platform. Hat tip to the Cloud9 team, and I’ll tell you more about my project next time.
Having submitted a talk on html5, and calling it “The Language of Cloud Computing” (please vote for it if you are interested), I thought I should take this opportunity to discuss how I see the possibilities of the cloud. I do web development in my day job and we use some of the technology that is now discussed as cloud technologies, but this is my perspective as an end user. I will put a warning here, that there is a bit of blue-sky thinking in this post.
My original germ of this post came after reading this post by Gary Short: Cloud Computing – It’s that New Old Thing – Gary’s Blog and as I’ve been writing this post, I think that he’s the angel sitting on one shoulder telling you to be careful, to watch out for snake oil and to mind the gap, which makes me the devil on the other shoulder telling you to go on, jump in, the water’s lovely, and what could possibly go wrong? With that in mind, I recommend you read his post and keep it in mind.
There are two distinct shifts I have seen that are classed as “Cloud Computing”. On the server side, machines are becoming virtualised which allows for greater flexibility and a more efficient use of resources (go green rangers). I use VMs all the time, and I think they’re great. Of course, mine aren’t hosted across the world, like Amazon Elastic Compute Cloud (Amazon EC2) or Windows Azure. On the client side, it’s about the trend to save your data “somewhere else” so you don’t need local storage and you can access your data from everywhere. That seems to be the underlying vision of Google Chrome OS and the Apple iPad and it has a certain appeal when you don’t have to worry about backups, running low on disk space, and other maintenance tasks. Why not give it to someone else to worry about? And if that’s not enough, put your photos on Flickr!, edit them on Picnik, and then share them on Facebook, giving a lot more options than you could have on your own, and allowing others to manipulate, improve, and remix your data.
But what if you pick the wrong service? Picnik currently doesn’t work with Windows Live Photos so you miss out on that service unless you upload the photos from your computer. And what happens when you’ve been using Yahoo 360 for blogging and it disappears from the web?
What we realy need is some proper data portability. You don’t want “Facebook compatible” or “Works with Flickr”. Google Reader is great because it works with RSS, and there’s a lot of RSS out there to work with. We want something as simple as “Yes, it’s a JPG, I can work with that”, just like I can on my computer. And more than that, we need to be able to move our data from one cloud provider to another.
My ideal model for how “The Cloud” should be working, for me, would be that every cloud provider would support standard data transfer protocols, whether it’s ftp, imap for email, or something new, that supports open authentication to transfer directly between providers as well as to my local storage. The key is making it easy to synchronise my data. Yes, some of it will be world-accessable, with all the privacy issues that presents, and yes, some providers will be better at some things than others, but making it easy to transfer my data between providers means that I can move around to get the best of all worlds, instead of choosing this place for greater storage instead of that one that lets me share, or the other one with better editing. There’s loads of people who make it easy to get your data in, they support multiple clients, have a nice web uploader, annd so on, but it’s often a lot harder to get your data out. The Data Liberation Front (the Data Liberation Front) is a nice step in the right direction from Google, and I think is a good, basic philosophy for Cloud providers to follow, but that should be the bare minimum I should expect from a provider. If they don’t follow that model, they are not “The Cloud”, they are just silos full of quicksand, and the more you struggle to get you data out, the more it seems to be lost forever.
So, will html5 help with any of this? AJAX has become a reason for providers to become more open, to encourage traffic back to their own sites, and adverts, but with html5, we get, for example:
microdata, to help define areas of pages that other sites can understand, such as calendar events and contact details;
structural elements, like nav, to help parsers find the interesting data;
Cross-document messaging to allow pages from one domain to communicate with pages on another domain, with a security model to prevent XSS attacks; and
protocol handler registration to allow a website to declare that it can handle fax: or mailto: links, or jpg or apk files so I can grab a link or file from one site and sent it directly to another site that I trust.
I don’t know how many cloud providers will start to use these features, but until they do, the cloud is stunted.