Categories
code cosmosdb data development

Cosmosdb and Heterogeneous data

A selection of different watches. They all tell the time, but some are analogues, some are digital, some are branded, and some are not.
Same, but different

CosmosDb, in common with other NoSQL databases, is schema-free. In other words, it doesn’t validate incoming data by default. This is a feature, not a bug. But it’s a dramatic change in thinking, akin to moving to a dynamically typed language from a statically typed one (and not, as it might first appear, moving from a strongly typed to a weakly typed one).

For those of us coming from a SQL or OO background, it’s tempting to use objects, possibly nested, to represent and validate the data, and hence encourage all the data within a collection to have the same structure (give or take some optional fields). This works, but it doesn’t provide all the benefits of moving away from a structured database. And it inherits from classic ORMs the migration problem when the objects and schema need to change. It can very easily lead to a fragile big-bang deployment.

For those of us used to dynamic languages and are comfortable with Python’s duck typing or the optional-by-default sparse mapping required to use continuously-versioned JSON-based RESTful services, there’s an obvious alternative. Be generous in what you accept.

If I have a smart home, packed with sensors, I could create a subset of core data with time, sensor identifier and a warning flag. So long as the website knows if that identifier is a smoke alarm or a thermostat, it can alert the user appropriately. But on top of that, the smoke alarm can store particle count, battery level, mains power status, a flag for test mode enabled, and the thermostat can have a temperature value, current programme state, boiler status, etc, both tied into the same stream.

Why would I want to do this?

Versioning

Have historic and current data from a device/user in one place, recorded accurately as how it was delivered (so that you can tweak the algorithm to fix that timedrift bug) rather than having to reformat all your historical data when you know only a small subset will ever be read again.

Data siblings

Take all the similar data together for unified analysis – such as multiple thermostat models with the same base properties but different configurations. This allows you to generate a temperature trend across devices, even as the sensors change, if sensors are all from different manufacturers, and across anything with a temperature sensor.

Co-location

If you’re making good use of cosmosdb partitions you may want to keep certain data within a partition to optimise queries. For example, a customer, all of their devices, and aggregated summaries of their activity. You can do this by partitioning on the customer id, and collecting the different types of data into one collection.

Conclusion

NoSQL is not 3NF, so throw put those textbooks and start thinking of data as more dynamic and freeform. You can still enforce structure if you want to, but think about if you’re causing yourself pain further down the road.

Check out @craignicol’s Tweet: https://twitter.com/craignicol/status/1122224379658633217?s=09

Categories
.net code development

Thoughts on Global Azure Bootcamp 2019

I’ve got a collaborative post coming up on the talks themselves on my employer’s blog but as a speaker and tech enthusiast, I wanted to share a few thoughts on the bootcamp as a whole.

Firstly, I’d like to thank Gregor Suttie who organised the Glasgow chapter under the Glasgow Azure User Group banner.

It’s an impressive feat getting so many speakers on so many topics around the world. Each city is going to be limited in the talks they can offer, but I was impressed by the distances some of the speakers at the Glasgow event traveled.

I would have liked to have been more of a global feel. I know the challenges of live video, especially interactive, but this is an event that would benefit from some of that (or indeed, more speakers participating the the online bootcamp via pre-recorded YouTube videos). I realize an event like Google I/O or Microsoft Build is different in focus, being company rather than community driven, but it felt like a set of parallel events rather than one, so it also feels as if some of the content is going to be lost, and there were a lot of interesting looking talks in other cities that turned up on the Twitter hashtags.

There’s obviously a lot to cover under the Azure umbrella, so it’s going to be hard to find talks that interest all the audience all the time, and it was hard to know where to pitch the content. I aimed for an overview for beginners which I think was the right CosmosDb pitch for the Glasgow audience, but I was helped by the serendipity of coming after an event sourcing talk so I could stand on the shoulders of that talk for some of my content.

I would maybe have liked to see more “virtual tracks” so that it was easier to track themes within the hashtags, whether it’s general themes like “data” or “serverless” or technology/tool focused like “CosmosDB”, “Azure DevOps” or “Office 365”, to help me connect with the other channels and see what content is must interesting to follow up. Although Twitter was a good basis for it, I think there’s scope to build a conversational overview on top, into which YouTube videos, Twitter content, Github links, blog posts, official documentation and slide content could be fed.

As a speaker, the biggest challenge was keeping my knowledge up to date with all the updates that are happening, and events like this do help, but as I’ve been on a project that’s Docker and SQL focused recently, it’s a lot of work on top of my day job to keep in touch with the latest updates to CosmosDb, especially as a few tickets on the project were picked up and moved to “In Progress” between submitting and delivering my talk, and a new C# SDK was released.

Categories
.net code data development

CosmosDb in The Real World : Azure Global Bootcamp 2019 (Glasgow)

Thank you to those who came to my talk today about CosmosDb. I hope you found it useful.

If you’d like to review the slides, you’ll find the presentation online here :

CosmosDb In The Real World – GitPitch

If you have any further questions please ask below and I’ll do my best to answer.

Categories
code

Live coding during a presentation

I’ve done a few talks in my time, and some of them have involved live coding to help take the audience on a journey to understand the lesson I want to teach. It’s a technique I use rarely however as it can easily get problematic.

Live coding for a presentation is not the same thing as coding for your job or a mob session. Live coding for a presentation has to be as slick and as polished as the rest of the talk. No googling the API docs, no stumbling over syntax errors (although don’t worry if you don’t spot them, someone will.)

If you’re nervous with slides, live coding can help relieve some of the pressure. It certainly helped me when I was starting out, as I could pretend I was having a conversation with my computer for some of the talk. Fortunately, it was a conference with microphones.

Live coding is also great for presenting something novel (or at least novel for most of the audience) as it gets to demonstrate that it’s real and it works. I used it, in one presentation, to demonstrate TDD.

Be careful though. Technology can let you down, so have a think about how you might demonstrate another way. Do you have a video you can show on someone else’s laptop? Can you demonstrate with screenshots? Could you whiteboard a solution?

If you don’t feel comfortable writing code in front of an audience, don’t. Can the code (I use VS code snippets, but there’s plenty of macro choices), so you can reveal it gradually. Or pre-build it and uncomment as you go.

If you are preparing code in advance, to distribute with the slides, heavily comment the code so that when people download it afterwards they understand the context.

I do love to see live coding done well in a presentation, but unless you’ve prepared it meticulously, it won’t go well. Think about the story, and how and when you’re introducing new concepts. Don’t be afraid to go back to something later (“Here’s how to connect, but I want to show you search first. I’ll come back to the parameters here once you’ve seen the search results.”) It’s OK to refer to your notes, especially if you’ve just thrown up a screenful of code for the audience to read. Make every interaction useful.

It’s OK to backtrack. Show the mistakes you made, the error messages you saw, and how you corrected them. You won’t be the last person to make that mistake.

Relax and enjoy it. If code is what you do, be comfortable with it, even if the nerves are trying to turn you inside out from your stomach.

Categories
code development

Things I don’t know as of 2018

At the end of last year, Dan Abramov talked about the things he doesn’t know. It’s a useful framework to reflect, and as a privileged, experienced developer, I am comfortable talking about this, in the hope it might help others understand that experienced developers don’t know everything.

I’ll echo what he says about Imposter Syndrome and Dunning-Kruger and how it absolutely sucks to see truly smart developers I know, or that I follow, having to constantly prove their credentials just because they’re not cis, straight, able-bodied white men. If I’m in a position to support you or amplify you, I will try to do so. Your experience is definitely a thing I don’t know, but I can listen.

I covered a few things I know enough to be confident in for my 2018 in Review post. I’ve had to care about DevOps and containers and pipelines because they solved real problems my team was having. I love graphics because trigonometry and fractals are why I got into programming, but I work in the network, looking at architecture, security, interfaces and data. That leaves a big wide world out there, and this doesn’t include the unknown unknowns.

Ruby

I’ve tried to get into it a few times, but it still just looks like old-school Perl to me. Lots of punctuation occasionally interrupted by domain language. This actually sounds like a good idea to me – an almost complete separation of concerns – but I keep finding the learning curve too steep and so I retreat back to Python and C#.

CSS

I’ve been building websites since Mosaic was a browser. I can style things with CSS. It’s not hard to put red text on a green background (you monster), or fix a navigation bar to the top of the screen, but look at the fancy grid layouts, or parallax scrolling effects, or CSS icons, and I couldn’t tell you where to start. I don’t know how to unit test it, so my tools don’t work here. Just as well I work with some very smart CSS developers.

Markdown

I use Markdown a lot. For README, for Architecture Decisions, and for drafting blog posts, but I always need to google how to insert a link, or a table, or an image. And I feel like an idiot each time.

gulp, npm and webpack

I know these are essential tools when I’m building React. I know they turn my code into something the the browser runs. I know how dependency management works (I can bore you about HORN over a drink if you like), but I’ve only ever used these JavaScript tools preconfigured in a template or an existing codebase, so I could not create a JS or TypeScript build pipeline from scratch, or fix a broken one.

I really couldn’t tell you anything intelligent about how those fit into the ecosystem alongside Grunt, Yarn, bower, sass, scss or anything else. So long as I can write a script to install and run them, I’m happy not to have to dig into them. All I care about is if it runs and if the dependencies are secure.

I suspect I’ll need to learn more about these tools this year though.

You don’t need to know everything

You’ll have your resolutions to learn more, whether weekly or yearly. There’s more to learn than you will ever get the time to – that’s why we work in terms. Pick one thing to learn next because it’s important to you, because it’s interesting, because your project requires it, or because thats where the money is. Forget the rest until you’re over the Dunning-Kruger curve and you know enough about that new thing to see all the possible next things to learn.

Categories
code development programming

Software Development as a Christmas Story

Gingerbread Christmas Tree
Oh Tanenbaum

We decorated the house for Christmas. A smaller project than most of the software I’ve worked on, but a useful reflection.

The Cost of context switching

Sometimes, in the middle of one task, you need to do another. Either because you were interrupted, or because you weren’t prepared. Consider the tree, a tall tree, one you need a ladder to decorate. But you left the 🌟 in the attic.

You don’t just have the cost of picking up the star, you need to get down the ladder, fold it up, carry it upstairs, open the attic, unfold the ladder, climb to the attic, find the star, then pick it up and unwind all the other steps just to get back where you stated before you can continue the task of putting the Star on the tree.

It’s just a minute to get the star…

Yak shaving

Sometimes it’s more than just one thing. Sometimes to get to where you need to go, there’s a cascade of other tasks.

You want to put the tree up, but it’s the first time you’ve had a real tree, so you need a new base. It’s bigger than last year’s tree, so your lights don’t fit. So you need to go to the shops. But you need to fill up your car. And the tyre pressure warning light comes on so you need to top them up. And you need to check the tread depth whilst you’re there, and so on.

Programming in a nutshell

Performance at scale

Our tree stood up fine when it was delivered. But as it scaled out and the branches widened, it pushed against the wall, making it unstable in the condensed space. It fell over.

Luckily no lights or baubles were using it at the time, but it’s an interesting challenge holding up a heavy tree in one hand, trying to adjust it’s position with the other hand, as I avoided the puddles of water in the floor as my wife mopped them up. If you’ve ever worked support, this may sound familiar.

Turns out that it was harder to stabilise than I anticipated.

The branches were unevenly partitioned, providing more load on one side, so I had to stabilise it against the wall. And the tree was almost a foot taller than expected, which turned it out to be 2 foot taller than the base was rated for.

We upgraded the base to handle the load. It’s bigger than we need.

Technical debt

I have some old pre-LED tree lights and they slowly fail so each year I replace the unlit ones from the packet of spares but I haven’t been replacing the spares. Eventually, they will run out, and the spares will no longer be available. I’ll have to throw them all out and start again.

The new led ones don’t let me replace them individually. But they last longer and are cheaper to run.

Which debt is easier to live with?

With a big tree, those old lights aren’t long enough. So, do I buy another set for the rest of the tree that doesn’t match them, or throw out the existing ones and buy a longer set? The latter looks better, but throws out the sunk cost along with the lights.

You know computers, can you look at this for me?

No, I can’t fix your tree. I can navigate a garden centre. I know enough physics to keep a tree upright, eventually, and safely put the angel on, but I’ve no Idea how to grow one, or what that weird stand with 17 gears you bought does, or how to assemble your plastic one. And I’ve no idea what that bug in the tree is.

Merry Christmas. Catch you all in the New Year.

Categories
code development programming

How I approach coding challenges

This time last year I was doing a lots of tests for job interviews, and I was practicing with https://adventofcode.com

I know there’s various data structures and algorithms you can learn to make these easier (and I hope to come back to some of my favourites in a later post), but there’s a general approach I take to these that helps me, and might help some of you too.

Use TDD

I’ve always been a fan of Test-Driven Development, and these type of puzzles lend themselves well to TDD thinking. Often examples are provided that will form the initial test suite. The algorithm is often broken into steps (sometimes explicitly “On every clock tick…”)

Get into the habit of checking as you go to make the most of the information provided to you. Think about the problem first, and the mapping between the input and the output, before you write any code.

Check everything on simple solutions.

Have templates

Most challenges involve a few common tasks. Read input from a file (ignoring blank lines), create a list of…, output as CSV, etc.

Some of these will have standardised patterns in your chosen language, and you should use them. Some will likely require some error handling. Some will require loading from standard or 3rd party libraries. And all will require a test framework.

As you go, build up a utilities list of functions, classes and packages that you know how to use, are robust enough and have minimal clutter. They’re not meant to be usable by anyone but you, but you will use them a lot. Collect and cherish them.

Think simple

Technical tests and coding challenges, in general, are designed to have short, easy solutions. If you need to solve P == NP to make your code fast, it’s likely you’re using an algorithm that’s a poor fit. Find 2 other ways to solve the problem, and pick the simpler one.

Simpler than that

Think about the simplest solution that could possibly work and write it simply.
Make the code readable. This is not the time for clever solutions. When it fails on an input, make it easy to change.

But not that simple

You still need to think about edge cases, and memory usage, and a myriad of other resource constraints. Think about what could go wrong, within the confines of the puzzle. Advent of Code has a particular knack for detecting edge cases in your code for Part 2 of each day’s challenge.

Think fast

Beware of premature optimisation, but expect long inputs and many loops if you use the naive implementation. Learn new algorithms and data structures.

Categories
code development programming security

This is raw

This is raw chicken : 🐤

If you eat it like that, you may get hurt immediately, by its beak, or its claws. It may grab your money and run off with it.

If you want to eat it, better to kill it first. 💀

If you eat it like that, you may get hurt or die, in a few hours, or days. Washing it won’t help.

Cook it. Cook it well. If there’s any sign of pink, cook it some more. 🔥

It might still kill you, but at least you’re a lot safer than when you started.


This is raw data : 🐤

If you display it, users will get hurt immediately, whether by cross-site scripting, cookie sniffing, crypt-currency mining, or something else. If you’re lucky, it will be something your user’s see immediately and leave your site never to return. Otherwise they may get infected.

If you want to use it, better to validate it first. 💀

If you save it like that, your users are still vulnerable. It might appear on the front end in a different form. It might be a string of unicode characters that crashes your phone. It might be a link to somewhere they can’t trust.

Encapsulate it. Sandbox it. Never trust it, in or out . HTML encode, whitelist the output as well as the input.

And if you need to avoid spam, or incitement, or solicitation, maybe you need editors. Computers can’t fix all the social problems. 🔥

Categories
code development leadership

Tribe of Mentors – 11 questions

I’ve been listening to the Tribe of Mentors podcast by Timothy Ferris and in the related book, he talks about the 11 questions he sent out to get people to respond. He’s got a great discussion about how he refined the questions, and why he chose those questions in that order, and I’d recommend checking out the book and the podcast. I thought it would be interesting for me to answer them. If I’m going to write a blog, I should give you some insight into who I am.

The questions he asks all his interviewees are:

What is the book (or books) you’ve given most as a gift, and why? Or what are one to three books that have greatly influenced your life?

What purchase of $100 or less has most positively impacted your life in the last six months (or in recent memory)?

To give me a significantly more productive commute:

  • I have a small notebook that I can carry in my pocket, from my local supermarket;
  • A cheap bluetooth earphone for listening to podcasts, which is good for voice, but not great for music;
  • Pocket casts for managing my podcasts. I also set it up to auto-play whenever I turn by earphone on.

How has a failure, or apparent failure, set you up for later success? Do you have a “favorite failure” of yours?

Deleting a database

If you could have a gigantic billboard anywhere with anything on it — metaphorically speaking, getting a message out to millions or billions — what would it say and why? It could be a few words or a paragraph. (If helpful, it can be someone else’s quote: Are there any quotes you think of often or live your life by?)

“Do not compare yourself to others for you may become vain or bitter. There will always be greater and lesser persons than yourself.”

Desiderata, Max Ehrmann

What is one of the best or most worthwhile investments you’ve ever made? (Could be an investment of money, time, energy, etc.)

My PhD. I learned discipline, I took the opportunity to learn lots of new technologies, and to sit in on some random lectures.

What is an unusual habit or an absurd thing that you love?

I love weird films and music, such as Trout Mask Replica, Eraserhead. Things that I don’t choose all the time, but reset my expectations about what art can be. It’s always good to get a refresh to help my problem-solving and avoid getting stuck in a rut.

In the last five years, what new belief, behavior, or habit has most improved your life?

Regular blogging. It has helped me to crystallise my thoughts; share ideas with others, and my future self; and led to some great conversations.

What advice would you give to a smart, driven college student about to enter the “real world”? What advice should they ignore?

It’s OK if it takes you a few years to find the right job for you. It’s ok to have a few jobs on your CV if you have a reason to move. Treat them as a chance to find out who you are, and don’t let golden handcuffs make that decision. Bank the cheque in case you need a parachute.

Don’t work on your free time to get a job. Do what you enjoy. Climb hills. Practice the guitar. Volunteer at the hospice. Spend time with friends and family. Jobs that require you to learn on your own time are jobs that don’t respect you.

What are bad recommendations you hear in your profession or area of expertise?

April Wensel covers this better than me. I felt an unease about things, but April Wensel managed to capture it much better than I could. Don’t be a hero. Be a team player. Don’t follow these toxic tips.

In the last five years, what have you become better at saying no to (distractions, invitations, etc.)? What new realizations and/or approaches helped? Any other tips?

I became a dad during this period, and time has become far more precious as a result. Family comes first. So less TV in the evenings, more time with my daughter, more together time with my wife. Less news, less shouting at the TV. If it’s not something that I’m passionate about, or will help my family, me or my friends, I can’t justify the time.

When you feel overwhelmed or unfocused, or have lost your focus temporarily, what do you do? (If helpful: What questions do you ask yourself?)

Go back to my task list. What’s the best thing to do now that my future self will thank me for?

Categories
.net code

Self-testing configuration – the example.config story

Some halloween cakes reflected in a mirror
Mirror image

We’re all good developers and never store production secrets in our source code. We have to keep them somewhere though.

If you’re an ASP.Net developer and you store your secrets in a separate file, how do you check them on release?

In a previous job, we were building an application in AWS EC2, which meant we couldn’t sit on top of the lovely Azure Key Vault/secrets.json workflow, and instead used a secure file drop as part of the deployment. We had a couple of releases that failed because secrets that were available locally weren’t included in the release package, causing the website to fail to start. We didn’t have the nice logging provided by the .Net core configuration objects, so what we did was create configuration documentation in the form on an example.config file that contained all the keys we expected, and populated dummy values. On startup, we’d compare the list of keys in the example.config which did not contain any secrets, with the actual config, and fail the startup with a report of which keys were different so we had a very quick way of checking the config was in sync with expectations.