Thoughts on Node.js

Focus on the flow

In my post about developing in the cloud, I promised a few nuggets about the project I was working on, and after my diversion into talks and security, I’m ready to start discussing it again. The project itself is fairly straightforward: it was an excuse for me to try a realistic project in Node.js using cloud development tools, to see what was possible, and to decide if I wanted to use Node.js for anything more substantial.

Partly, I wanted to immerse myself in a completely callback-driven, “non-blocking” model, to see how it affected the way I thought about the software I was writing, and whether the promised benefits were real. I’ll expand on some of these thoughts as I talk more about it, but I wanted to start with my initial impressions.
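
To make the style concrete before I go any further, here’s a minimal sketch of what callback-driven I/O looks like (the file name is just an illustration, not from the project):

```javascript
// Node.js callback style: the call returns immediately, and the
// result arrives later in a callback, instead of blocking the thread.
var fs = require('fs');

fs.readFile('config.json', 'utf8', function (err, contents) {
  if (err) {
    console.error('read failed:', err);
    return;
  }
  console.log('loaded', contents.length, 'characters');
});

console.log('this line runs before the file has been read');
```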

What Node.js does well

Firstly, from looking at a test example based on graphing data from this great read about green energy, it was clear to me that the promise of Node.js lies very much in I/O performance, rather than anything else. If you’re trying to do anything more complex than throwing as many bytes at the response buffer as possible, Node.js is probably the wrong tool. It’s not built for complex business logic. It is good if your server is a thin layer between a JS front-end and something specialised on the backend (or a series of specialist pipes on the backend – the web services support is top notch, as you’d expect from the language that gave us AJAX and JSON).
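
As a sketch of the kind of thin layer I mean (the host, port, and routing are placeholders, not the project’s actual code), piping a request straight through to a specialised backend plays to exactly those strengths:

```javascript
// A thin proxy layer: bytes in, bytes out, no business logic.
var http = require('http');

http.createServer(function (req, res) {
  // Forward the request to a specialised backend service.
  var backend = http.request({
    host: 'localhost',   // placeholder backend address
    port: 8081,
    path: req.url,
    method: req.method,
    headers: req.headers
  }, function (backendRes) {
    res.writeHead(backendRes.statusCode, backendRes.headers);
    backendRes.pipe(res); // stream the response body straight through
  });

  req.pipe(backend); // stream the request body to the backend
}).listen(8080);
```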

One of the reasons I started to enjoy using Node.js was the ability to run the same code on client and server. I could write tests for my JavaScript logic that ran as unit tests in the server context, then ship the exact same JavaScript to the client, knowing that the internal logic was correct, even if the performance wouldn’t be. This was greatly helped by Require.JS, which bridges the gap between the client and Node’s extensive NPM package repository. That repository isn’t as easy to search as the apt-get and NuGet repositories I’m more used to, but it is fairly comprehensive, although it suffers from the apt-get problem of having a lot of packages that do the same thing, which makes it hard to choose which one to use. There are definitely popular options that bubble to the surface, but I get the feeling I’d need to try a few of them to really feel which was the right one, especially as some have APIs that feel quite alien to the core libraries, like unthinking ports from other languages, or at least from non-callback philosophies.
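
The pattern that makes this sharing work looks roughly like the following UMD-style wrapper. The module and its contents are a made-up example, not the project’s code:

```javascript
// validation.js - the same logic, loadable by Node (for unit tests)
// and by Require.JS in the browser.
(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    define([], factory);            // Require.JS / AMD in the browser
  } else if (typeof module === 'object' && module.exports) {
    module.exports = factory();     // Node.js
  } else {
    root.validation = factory();    // plain <script> tag fallback
  }
}(this, function () {
  return {
    // illustrative logic that can be unit tested on the server
    isValidEmail: function (s) {
      return /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(s);
    }
  };
}));
```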

In the end, I got a proof of concept up over a weekend, which is about as long as I’d normally spend on The Mandelbrot Set, and I got multiple users up on the site, which is more a testament to Cloud 9 than to Node. The resultant code, though, had fewer features and was messier than the alternatives I’d written in other languages. It certainly felt more as if I was fighting against the flow than in previous incarnations, despite the refactoring tools and examples I had available, and it was a lot harder to keep the program flow in my head, since I had to understand the context each callback was operating in: which paths could lead to the current code being executed, and what I could rely on being set. I tried to trim the code back as much as possible, but I was still debugging a lot more than I was used to, despite more testing.

What Node.js does to make you grind your teeth

At this point, I’ve written two proofs-of-concept in Node.JS, and whilst I think the 2 projects I tried weren’t the optimal projects for Node.JS, I’m getting a feel for what is can and cannot do. I can see places where I would use it if I was doing streaming or similar low-latency, high-throughput tasks that were just about routing bits, and I have watched it updating several clients at once, but it’s very easy to write blocking code that will slow the server, and has an instant impact on all clients, making them all hang until the blocking operation is completed. That type of error is not unique to Node.JS, but I found the chain of callbacks increasingly more difficult to reason about, making that type of error more likely. It feels like writing GOTOs.

At this point, I can’t see myself using Node.JS for anything other than a very thin layer to some business logic, and at that point, it seems odd to use a thin layer of web services just to call other services in another language. That’s what I’d do to start migrating out a legacy app, but I wouldn’t start a design that way. If I wanted to build a web service backend, there’s no benefit in Node.JS that I couldn’t get in WebAPI. However, I’m wondering whether my next project should be to re-write the backend in Go, to see if that’s any easier.

Summary

Node.js is fun to play with as a proof of concept of callback-driven code, but I’ve seen the basic idea of a stripped-back, high-performance web server delivered far more elegantly in Erlang, Go, and other places. Node.js throws out threads, and doesn’t offer enough in return to justify the switch for me. It’s fast, but not the fastest, and my productivity is definitely lower in Node.js than it was in my first program in C#, or Java, or Python, or even my first work with client-side JavaScript. It’s an interesting experiment, but I’ve got better things to do with my time than reason about which callback failed to trigger. For the right project, I’d give it another run, but the right projects for Node.js are very specialised.

The three rules of network security

who’s watching you?

I realise that for most of the audience this will be stating the obvious, but I want to cover these rules now so I can refer to them later in the series.

I’m going to do this as a series of 3 posts, so I can refer to each rule separately.

The three rules of network security:

  1. Don’t trust the client;
  2. Don’t trust the server(s);
  3. Don’t trust the network.

In short, don’t trust anything you don’t fully control. I list them separately here since the way we mitigate each is very different.

Troy Hunt covers most of the mitigation strategies and the mechanics of this better than I can, though, so if you’re interested in this topic, go check him out – Hack Yourself First: http://www.troyhunt.com/2013/05/hack-yourself-first-how-to-go-on.html (or listen to the .NET Rocks podcast: http://www.dotnetrocks.com/default.aspx?showNum=914 )

Don’t trust the client

If you’re running a server, and you don’t validate any user supplied content, please shut down your server now before you put the rest of the Internet at risk. Depending on what you’re processing, that includes any POSTed content, any query string, HTTP headers, the content hosted at any provided URL if you retrieve it, and many other possible inputs.

Even if you trust that the content is not harmful to your IT security, you still can’t necessarily trust it. Your survey results will contain untrue data; none of your IE11 users will show up as IE users; and if you’re doing any calculations on the client, they may give the wrong answer, whether through misguided assumptions (like hard-coding the pixel density of an iPhone just before the retina display was announced) or malice.

One way to adjust for the effects of wrong answers is to aggregate results across many inputs, such as the majority-voting system employed by the Apollo computers to minimise the effects of computer failure. You can also check for inappropriate behaviour, such as a rate of submissions high enough to indicate gaming or a DoS-style attack; a sketch of such a check follows below. There are so many possible attacks that I can’t list them all here.
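
As one sketch of that kind of check, a naive in-memory rate limiter (the window and threshold are arbitrary, and a real deployment would need shared storage across servers):

```javascript
// Flag clients submitting implausibly fast. In-memory only, so
// this is a sketch rather than production code.
var submissions = {}; // ip -> array of recent submission timestamps

function allowSubmission(ip, now) {
  var windowMs = 60 * 1000; // look at the last minute
  var maxPerWindow = 10;    // arbitrary threshold

  var recent = (submissions[ip] || []).filter(function (t) {
    return now - t < windowMs;
  });
  recent.push(now);
  submissions[ip] = recent;

  return recent.length <= maxPerWindow;
}

// usage inside a request handler:
// if (!allowSubmission(req.connection.remoteAddress, Date.now())) {
//   res.writeHead(429); res.end('slow down'); return;
// }
```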

Don’t trust the server

As a client, you also need to validate what you receive. Any recent browser will sandbox and restrict incoming code by default, and the recent web standards also include well-defined Chinese walls to prevent code from one site intercepting data from another (see, for example, CORS, and compare it to the old method of JSONP in terms of validation and verification of incoming requests). Of course, you should also be checking that what you are receiving is from the right source (mybank.com, rather than mybank.com.some.compromised.server.ru).
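
To illustrate the difference, here’s a sketch of the server side of the CORS contract (the allowed origin is a placeholder):

```javascript
// CORS: the server explicitly names which origins may read its
// responses, and the browser enforces that. Contrast with JSONP,
// where the server had no standard way to restrict callers.
var http = require('http');

http.createServer(function (req, res) {
  var allowedOrigin = 'https://app.example.com'; // placeholder

  if (req.headers.origin === allowedOrigin) {
    res.setHeader('Access-Control-Allow-Origin', allowedOrigin);
  }
  // Without the header, the browser refuses to expose the response
  // to cross-origin script.
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify({ balance: 42 }));
}).listen(8080);
```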

Beyond that, you also need to trust what the server will do with the data you send it. Will the owners respect your privacy (and remember, if they’re outside the EU, the Data Protection Act does not apply), or will they sell your data? Will they protect your account (by hashing passwords, and only storing what they need, rather than keeping your credit card details on file long after they need them)? If they receive a government request for your data, will they honour it, and will they let you know?

Don’t trust the network

Even if you write both server and client, the data can be changed or lost in the middle. Any public WiFi can be compromised and your traffic intercepted, and there’s only so much HttpOnly and Secure cookies can protect you from when your attacker controls your DNS server. Beyond WiFi, agencies such as the NSA and GCHQ are watching endpoints and can intercept some SSL traffic; the padlock is only as secure as the lockmaker. If you’re Google, you can’t even trust your “internal” network between sites. Expect everything that you do not own or cannot physically trace to be compromised, and secure your data and communications appropriately.
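
For what limited protection those cookie flags do offer, here’s a sketch of setting them (the cookie value is a placeholder):

```javascript
// HttpOnly stops client-side script reading the cookie (limiting
// theft via XSS); Secure stops it being sent over plain HTTP
// (limiting interception). Neither helps if the attacker controls
// DNS and can present a certificate the client accepts.
var http = require('http');

http.createServer(function (req, res) {
  res.setHeader('Set-Cookie',
    'session=abc123; HttpOnly; Secure; Path=/');
  res.end('cookie set');
}).listen(8080);
```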

2013 in review: agile, dynamic and dangerous data

The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 1,600 times in 2013. If it were a cable car, it would take about 27 trips to carry that many people.

Click here to see the complete report.

#dunddd Analyse This: The dangers of big data

Thanks to everyone who came to my DunDDD talk. Lots of interesting questions, although I’m not a lawyer, so I couldn’t answer them all.

If you want the slides, with references in the notes, you’ll find them here. All the images are creative commons, and you can use the slides yourself under CC by Attribution. Link to slides: DunDDD Analyse This – The Dangers Of Big Data (Google Drive)

If you missed the talk, the arguments I made and the references, apart from the privacy sections, are in this previous post.

If you want the references for the Personal Data and anonymisation parts, have a look at these:

AOL searches are not private

IBM privacy-preserving data mining

Announcing DunDDD 2013

craignicol:

I hope to make it this year. Are there any of my blog posts you want to hear in a talk format?

Originally posted on Scottish Developers:

Scottish Developers are pleased to announce that DunDDD 2013 will take place on Saturday 23rd November in the Queen Mother Building of the University of Dundee.

http://dun.dddscotland.com

This will be the 3rd DunDDD built on the popular foundation of the Developer! Developer! Developer! conference series, which has spread to all corners of the UK and the international arena. DDD conferences are community-run days where passionate and enthusiastic people come together to learn, share ideas, and network within the many hubs of the development community. Best of all, DDD events are free to everyone.

This year DunDDD will be featuring an entire track dedicated to Data Science.

Call for speakers

We are looking for sessions relating to all aspects of development from the code and technology level through to methodology and theory. DDD conferences are about sharing experience regardless of skill level so even if you have never spoken publicly…

View original (100 more words)

So Long Hangouts, And Thanks for all the fish

I’ve been re-evaluating the time I spend on the developer hangouts, and despite the interest I get from the invites, the turnouts have not been high enough to sustain these meetings. As a result, I will not be hosting any more.

If there is interest, I may think about a podcast-like Hangout-On-Air, but I would need some speakers for that, and I think that outlet would be better served by the plans +TechUp Inverness has for streaming their talks. I might also see if I can persuade other Scottish developer groups to make more content available online for streaming and reviewing (although I know from experience at work that there is significant extra effort involved in setting such a thing up).

It’s been an interesting experiment, but it’s time to try something else. In the short term, I will be using the time I spent on this to update my blog as I’ve got a backlog of posts about node.js that I want to complete.

Many thanks to everyone who attended any of the hangouts this year. I hope to catch up with you all again in person at other events.

Users or consumers?

I’ve just been reading this article about how capitalism has changed our language in the last 200 years, and the move from talking about people as users of products to consumers. “How capitalism has changed our language” http://feedly.com/k/18naQTK

There is an old computer joke about programmers referring to their customers as users, but this article highlighted to me something deeper about how technology is shaping our world.

Software development, when it works best, happens with active users rather than passive consumers, people who have an active interest in the problem the software is designed to solve, and sometimes ideas on how to solve it.

The biggest red flag I have to indicate the success or failure of a software project is the lack of user engagement. If no-one on the project team is a user, or is listening to the users, the project will almost certainly fail.

Software customers are not consumers, and often the user is not even the customer. Users will only use the software they are forced to use by management, or that they are comfortable with, and they will likely abandon it if they know a better alternative. If we can’t write software that engages users and makes their lives easier, they will not use it.

This may not be a problem unique to software development, but at least the language of software recognises that there is a problem to solve, and that the user experience is the difference between software that delights and software that frustrates.
