ai artificialintelligence data development search

The UX of Big Data

Following on from my Dangers of Big Data talk at DunDDD, I’ve been thinking about what a good user experience for data analytics would look like, imagining the business user presented with useful, actionable information rather than Notepad and a copy of the R or Python cookbook. I want something deceptively simple like the Google search box, rather than deceptively complex like Excel.

Excel, R and Python put a lot of tools at your disposal, and you could use any of them to construct an answer, but the secret to analytics lies in getting an answer that is both valid and useful. Validity is a matter of restricting the answer space to what the data can support (for example, disallowing multiplication of time-based input streams, or aggregation where there is no statistical basis for it). Usefulness is a matter of letting the user explore that space so they can determine (and, where appropriate, train the system to recognise) which factors are most important, how they affect the desired outcome, and how changes to the environment affect those factors.
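As a minimal sketch of what restricting the answer space could look like in code, the hypothetical TimeSeries wrapper below only exposes operations the data can support. The class name, the Mean method and the minimum sample size of 5 are all assumptions made for illustration, not a recommendation for a real threshold.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var readings = new TimeSeries(new[] { 1.0, 2.0, 3.0, 4.0, 5.0 });
Console.WriteLine(readings.Mean()); // 3

// Too few readings: the aggregate is refused with an explanation.
var tooFew = new TimeSeries(new[] { 1.0, 2.0 });
try
{
    tooFew.Mean();
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);
}

// var product = readings * readings; // does not compile: no operator * is defined

// Hypothetical wrapper around a stream of time-based readings.
public sealed class TimeSeries
{
    // Arbitrary, illustrative threshold below which aggregation is refused.
    public const int MinimumSampleSize = 5;

    private readonly IReadOnlyList<double> _values;

    public TimeSeries(IEnumerable<double> values) => _values = values.ToList();

    public double Mean()
    {
        if (_values.Count < MinimumSampleSize)
        {
            throw new InvalidOperationException(
                $"Need at least {MinimumSampleSize} readings to aggregate, have {_values.Count}.");
        }

        return _values.Average();
    }
}
```

Leaving the multiplication operator undefined is a roadblock at compile time, while the exception message is closer to a nudge, since it tells the user why the aggregate was refused.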

Then the question becomes: how much should the software take over? Do we have a duty to protect users from themselves by preventing invalid analysis where we can detect it, or do we have to accept that the frustration this causes leads to alienation, making users less likely to respond well to further corrections? Even nudging has its problems, as anyone who has been frustrated by grammar checkers can attest. But at least nudging helps the user to understand, rather than putting up roadblocks. Nudging encourages learning; roadblocks encourage switching to another way.

How would you encourage users to handle analysis appropriately?


Paperless and warning free

there's a dog driving that car?
Do you have a licence

This is a quick follow-up to my post on the new process for endorsements following the demise of the paper counterpart driving licence.

First, a clarification: the change at the DVLA applies to the paper counterpart to the photo ID licence, not the paper licence that existed before the photo ID licences. Many people will have been switched to photo ID by moving house though, so it’s only the hardcore who won’t be included.

I got confirmation from the car hire firm 24 hours in advance that I needed to print out my endorsements sheet, by which point I no longer had access to a working printer, so I was glad I’d tried it beforehand. The guy at the desk noted that it was a new scheme, and also mentioned that if I hadn’t printed it out, they would need to call a DVLA verification phone number which is very busy when it’s not shut. So still a number of teething problems to sort out.

So if you are hiring a car, get yourself over to the DVLA website in advance (any time within 21 days) and get your endorsement sheet printed. It might just save you from long queues and grumpy car hire staff.


Google Code Migration : Genetic Algorithm Templates

With the closure of Google Code, I’ve moved some projects to GitHub. All personal projects so far, but related to talks or blog posts from the past, so may still be of interest.

The first project I want to highlight again is written in C++ and implements genetic algorithms using mainly C++ templates, just to see how powerful they were. It taught me a lot about generic code, and how a poor type system can interfere with the clarity of your code. It also prepared me for one of my first talks, about Genetic Algorithms at a Beauty of Code techmeetup.

I’d like to look at a Python port, to see whether dynamic typing answers the concerns I have about code clarity, as I expect it would. For now though, it’s available for reference. It’s not production tested, and there are parts that are embarrassing, but it might be interesting if you want to know what genetic algorithms might look like.

development security

Offline OAuth and the end of paper driving licences

there's a dog driving that car?
Do you have a licence

As I will shortly be hiring a car, I had the opportunity to try out the new process to replace the paper driving licence.

For those not in the UK, the driving licence consists of 2 parts : a photo ID that gets renewed every 10 years and details what types of vehicles you can drive along with your name and address, and a paper counterpart that details any endorsements or convictions. The main people who care about the paper part are the insurance companies, and by extension, the car hire companies who have to include insurance in their rates.

Prior to its abolition a few weeks ago, the paper counterpart had to be sent back to the DVLA to have endorsements or convictions added, and again to have them removed after 3 years. I’ll leave the possible opportunities for fraud and disruption as an exercise for the security minded reader.

The new process

In order to view your endorsements now, the DVLA have a page on their website, secured by your driving licence number, your national insurance number (for US readers, think social security number), and your postcode, which is printed on your licence. So they don’t quite consider it public information, but it effectively is. If you ever use your payslip and your driving licence to prove your ID, someone can see your endorsements, which is only slightly more secure than handing over the old paper licence.

However, in order to provide some security theatre and allow you not to disclose your national insurance number, the page does provide a printout and a time-limited key which, when paired with your licence number, allows the recipient to verify your endorsement record directly with the DVLA.
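To make the idea of a time-limited key concrete, here is a minimal sketch of how such a check could work. This is not how the DVLA implements it: CheckCodeRegistry, Issue and Verify are hypothetical names, the in-memory dictionary stands in for whatever store a real service uses, and the 21 day lifetime is borrowed from the follow-up post.

```csharp
using System;
using System.Collections.Generic;

var registry = new CheckCodeRegistry(TimeSpan.FromDays(21));
var issuedAt = new DateTime(2015, 7, 1, 9, 0, 0, DateTimeKind.Utc);

// The licence holder requests a code and hands it to the hire company.
string code = registry.Issue("EXAMPLE123", issuedAt);

Console.WriteLine(registry.Verify("EXAMPLE123", code, issuedAt.AddDays(1)));  // True
Console.WriteLine(registry.Verify("OTHER456", code, issuedAt.AddDays(1)));    // False: wrong licence
Console.WriteLine(registry.Verify("EXAMPLE123", code, issuedAt.AddDays(22))); // False: expired

// Hypothetical issuer and verifier of time-limited check codes.
public sealed class CheckCodeRegistry
{
    private readonly TimeSpan _lifetime;
    private readonly Dictionary<string, (string Licence, DateTime Expires)> _codes = new();

    public CheckCodeRegistry(TimeSpan lifetime) => _lifetime = lifetime;

    public string Issue(string licenceNumber, DateTime now)
    {
        // Illustration only: a real service would use a cryptographically secure generator.
        string code = Guid.NewGuid().ToString("N").Substring(0, 8);
        _codes[code] = (licenceNumber, now + _lifetime);
        return code;
    }

    // The code is only valid with the licence number it was issued for, and only until it expires.
    public bool Verify(string licenceNumber, string code, DateTime now) =>
        _codes.TryGetValue(code, out var entry)
        && entry.Licence == licenceNumber
        && now <= entry.Expires;
}
```

The recipient never sees the national insurance number or the postcode; all they hold is a code that is useless without the licence number and stops working once it expires.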

I have not had to renew my insurance since the changes, but I notice that the car hire companies hide the new DVLA page deep in their terms and conditions, so it’s not given the same prominence as the requirement to provide the photo ID part of the licence. Other than the fact the paper licence is no longer valid, I’m not sure of the use cases for this. Maybe it will make more sense when I renew my insurance and I can do all this online.

At the moment it feels like digital for the sake of it, and too many bridges back to the old way.

code development

Cognitive load : fluent interfaces and friendly apis

a friendly robot, sitting down
Is your interface friendly?

To continue my mini-series on cognitive load, and following my previous post on static vs extension methods, here are a couple more examples to consider. Extension methods are often used as an entry point into a fluent interface, allowing a style which can be easier to read and which eliminates the need for confusing, overly long parameter lists, multiple overloads or parameter objects.

They can also be used to humanize the interface, based on the understanding that the user experience of an API should be as important as the user experience of a website or application.

In the first example, I put together a quick concept for date comparison, to see if a fluent style is more readable than arithmetic operations on dates. Decide for yourself if that’s true. I discovered that naming fluent operators is often harder than the usual naming decisions, especially when the same word can be used to mean subtly different things.
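For readers who want something concrete to compare, here is a minimal sketch of the two styles side by side. It is not the concept linked above: FluentDates, DateGap, Days, IsMoreThan and Before are hypothetical names chosen for illustration.

```csharp
using System;

var created = new DateTime(2015, 6, 1);
var deadline = new DateTime(2015, 6, 8);

// Arithmetic style: subtract the dates and compare the difference.
bool arithmetic = (deadline - created) > TimeSpan.FromDays(5);

// Fluent style: reads closer to a sentence.
bool fluent = created.IsMoreThan(5.Days()).Before(deadline);

Console.WriteLine(arithmetic); // True
Console.WriteLine(fluent);     // True

// Hypothetical entry points into the fluent interface.
public static class FluentDates
{
    public static TimeSpan Days(this int count) => TimeSpan.FromDays(count);

    public static DateGap IsMoreThan(this DateTime start, TimeSpan gap) => new DateGap(start, gap);
}

// Hypothetical intermediate type that carries the comparison to the next call in the chain.
public readonly struct DateGap
{
    private readonly DateTime _start;
    private readonly TimeSpan _gap;

    public DateGap(DateTime start, TimeSpan gap)
    {
        _start = start;
        _gap = gap;
    }

    // True when the start date is earlier than the other date by more than the gap.
    public bool Before(DateTime other) => other - _start > _gap;
}
```

Even in this small sketch the naming problem shows up: "Before" could plausibly mean "earlier than" or "earlier than by at least this much", and only the intermediate type makes the second meaning explicit.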

In the second example, which I found fascinating, chaining is used to turn a conventional C# method into something more akin to a functional language, with the intent of making the operations clearer and reducing unnecessary interim storage steps. I’m not sure this is always the best approach, but I’ll definitely be adding these helper methods to my toolkit.
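A minimal sketch of that kind of helper is shown below, assuming a single generic extension method. The name Pipe is hypothetical and is not necessarily one of the helpers from the linked example.

```csharp
using System;

// Conventional style: every step parks its result in a local variable.
string input = "42";
int parsed = int.Parse(input);
int doubled = parsed * 2;
string conventional = $"Result: {doubled}";

// Chained style: each step feeds the next, with no interim storage.
string chained = "42"
    .Pipe(s => int.Parse(s))
    .Pipe(n => n * 2)
    .Pipe(n => $"Result: {n}");

Console.WriteLine(conventional); // Result: 84
Console.WriteLine(chained);      // Result: 84

public static class PipeExtensions
{
    // Hypothetical helper: applies a function to a value so that calls can be chained.
    public static TResult Pipe<TSource, TResult>(this TSource value, Func<TSource, TResult> step) =>
        step(value);
}
```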

code development

Cognitive load : static vs extension methods

Cheese board sitting on cans of tomatoes
Extending tools beyond their original purpose

Here is a C#-specific example to follow up my blog post on cognitive load, since it coincidentally came up at work.

What are extension methods?

The example concerns C# extension methods. For those of you who aren’t C# developers, they allow you to create a static method that can be called as if it were an instance method of a class, without needing to modify the original class. They’re mainly useful for extending classes or basic types with new functionality, such as business-specific rules. For example, the LINQ library extends IEnumerable<T> types with a number of methods, including Where(), which provides the filter functionality found in functional programming, allowing users to write code like:

int[] array = {2, 3, 4, 5, 1};
return array.Where(x => x % 2 == 0); 
// returns {2, 4}

The x => x % 2 == 0 in the above snippet is a lambda expression that takes an input x (for the Where method, each member of the array) and returns the result of the expression on the right of the arrow.

LINQ example

The discussion centred around whether extension methods like the above were preferable to code using static methods such as :

int[] array = {2, 3, 4, 5, 1};
return Where(array, x => x % 2 == 0);
// returns {2, 4}

As you can see, here array is passed as the first argument (which is what extension methods do behind the scenes).
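To show what "behind the scenes" means, here is a minimal sketch of how such a method is declared and called both ways. Filter and SequenceExtensions are hypothetical names, used as a stand-in for LINQ's Where so the two do not clash.

```csharp
using System;
using System.Collections.Generic;

int[] array = { 2, 3, 4, 5, 1 };

// Extension-method call and plain static call: the compiler turns the first into the second.
IEnumerable<int> viaExtension = array.Filter(x => x % 2 == 0);
IEnumerable<int> viaStatic = SequenceExtensions.Filter(array, x => x % 2 == 0);

Console.WriteLine(string.Join(", ", viaExtension)); // 2, 4
Console.WriteLine(string.Join(", ", viaStatic));    // 2, 4

public static class SequenceExtensions
{
    // The "this" modifier on the first parameter is what makes it an extension method.
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
}
```

Nothing about the method body changes between the two calls; the only difference is where the reader finds the collection being operated on.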

The example I raised looked like the following, where a number of these methods are chained together. Note that all the methods return a new result, as they are built to allow immutable collections, such as input buffers.

Static methods

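// Enumerable is the static class in System.Linq that defines these methods
var filteredthing = Enumerable.Where(thing, x => x.Deleted == false);
var orderedthing = Enumerable.OrderBy(filteredthing, x => x.CreatedDate);
var result = Enumerable.Select(orderedthing, x => new y(x.Name, x.Id));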

Chained extension methods

thing.Where(x => x.Deleted == false)
    .OrderBy(x => x.CreatedDate)
    .Select(x => new y(x.Name, x.Id)); 

I argued that the former requires more thought, and therefore imposes a higher cognitive load, to work out which variable is used where, and that the density of the code makes it less readable, which makes the next developer’s job harder.

String example

The most interesting example however, was the one that started the discussion. Which of the following forms do you prefer, and why?




In particular, note that we don’t have the variable or code proliferation as in the LINQ example, and the return value is an explicitly different type.

The most persuasive argument I heard against the extension method version is that it explicitly works when mystring is null, which leads to a counterintuitive situation where, as developers, we are trained to spot potential NullReferenceException problems, and will immediately view the first form with suspicion, until we realise it is an extension method.
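The null behaviour is easy to demonstrate. The sketch below is a hypothetical example, not the pair of forms discussed above: IsBlank and StringExtensions are assumed names, chosen only to show an extension method being called on a null reference.

```csharp
using System;

string mystring = null;

// Looks like an instance call on a null reference, but it compiles to
// StringExtensions.IsBlank(mystring), so nothing is dereferenced.
Console.WriteLine(mystring.IsBlank()); // True

// A real instance member on the same null reference does throw.
try
{
    Console.WriteLine(mystring.Length);
}
catch (NullReferenceException)
{
    Console.WriteLine("NullReferenceException");
}

public static class StringExtensions
{
    // Hypothetical extension method that deliberately accepts null.
    public static bool IsBlank(this string value) => string.IsNullOrWhiteSpace(value);
}
```

Both calls use the same dot syntax, which is exactly why a reader scanning for null dereferences has to stop and check which kind of method they are looking at.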


Which form do you prefer, and would either one stop you in your tracks as you were scanning code and break your train of thought?

development leadership programming

CodeCraftConf : technical leadership


I have asked to be a guide for the CodeCraft conference. The tickets for CodeCraftConf are now on sale.

One topic I would like to discuss is technical leadership. The format of the conference states that I will have 12 prompt cards to help guide the discussion. My suggestions for these are shown below, and are also available on my GitHub repo. Any comments and suggestions welcome.

Technical Leadership

  1. What is a technical lead?
  2. What most inspired you about your previous technical leads?
  3. What behaviours would you change in technical leads you have worked for?
  4. Why do you want to be a technical lead?
  5. What scares you most about being a technical lead?
  6. How do you measure success as a technical lead?
  7. What one thing would make your life as a technical lead easier?
  8. What responsibilities are you happy to delegate, and what do you want to control?
  9. How do you plan for your own absence, so you can rest on holiday?
  10. What qualities do you want your team to have, and how do you help them get there?
  11. How do you deal with conflicts in the team?
  12. How do you deal with external pressures on the team?
code development leadership

Developer cognitive load

Anything a developer does that doesn’t add business value is wasted cognitive load. That’s why we use abstractions.

I know my developers are smart enough to handle memory management if they needed to, but developers just as smart in the past were less productive and introduced buffer overflow bugs because they had to think about that as well as the business problem they were trying to solve.

I want my team to fix one novel problem a day, rather than 10 papercuts that have already been fixed and get in the way of business logic.

When I see a developer make a mistake, I always need to ask myself if that mistake is a papercut that other developers will suffer from, and if there’s a way to fix it before the next developer encounters it. Code that makes my team more productive is always useful code.