Categories
.net code data development

CosmosDb in The Real World : Azure Global Bootcamp 2019 (Glasgow)

Thank you to those who came to my talk today about CosmosDb. I hope you found it useful.

If you’d like to review the slides, you’ll find the presentation online here :

CosmosDb In The Real World – GitPitch

If you have any further questions please ask below and I’ll do my best to answer.

Categories
development programming security

Litterboxing

Sandboxing is a great idea. Isolate your processes. Isolate your data. Chinese Walls everywhere to ensure everything is kept separate, independent and secure, whether that’s with containers or SELinux.

Litterbox – it’s like a sandbox but when you look closer it’s full of shit

“There’s the inkling of a good idea in there. He’s almost describing a sandbox (without, of course, any of the isolation required) and front-end microservices (without a pipeline, or components). Can we call this pattern litterboxing? It’s a sandbox until you look closer and see it’s full of catnip.”

https://thedailywtf.com/articles/comments/classic-wtf-i-am-right-and-the-entire-industry-is-wrong/1#comment-506416

Categories
cloud development programming

Cloud thinking : Executable documentation.

Documentation is just comments in a separate file. At least developers can see the comments when they change code. Tests are better comments. Tests know when they don’t match the code.

Infrastructure is the same. I can write a checklist to set up an environment, and write an architecture diagram to define it, but as soon as I change something in production, it’s out of date, unless it’s very high level, and therefore only useful to provide an outline, not detail.

Unit tests document code, acceptance tests document requirements, code analysis documents style and readability, and desired-state-configuration documents infrastructure. All of them can be checked automatically every time you commit.
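As a minimal sketch of a test acting as documentation, here is a Python docstring whose examples are run by the standard library’s doctest module. The function celsius_to_fahrenheit and the helper check_documentation are hypothetical names made up for this illustration, not part of any real project.

```python
import doctest


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit.

    These examples are documentation that a machine can check:

    >>> celsius_to_fahrenheit(0)
    32.0
    >>> celsius_to_fahrenheit(100)
    212.0
    """
    return celsius * 9 / 5 + 32


def check_documentation(func):
    """Run the examples in func's docstring and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Hypothetical helper: the examples only need the function itself in scope.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


if __name__ == "__main__":
    failed, attempted = check_documentation(celsius_to_fahrenheit)
    print(f"{attempted} documented examples checked, {failed} out of date")
```

If someone changes the formula without updating the docstring, the check fails on the next commit, which is exactly what a comment in a separate file can’t do.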

Documentation as code means the documentation is executable. It doesn’t always mean it’s human readable; ARM templates in particular can be impenetrable at times. If machines understand it, the documentation can be tested continuously, repeated endlessly across multiple environments, reconfigured and redeployed at the stroke of a keyboard.
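To make the idea of testing the documentation concrete, here is a rough sketch of a drift check: the desired state is the documentation, and the comparison tells you when production no longer matches it. The setting names and values are invented for the example, and a real tool would read the actual state from the environment rather than from a dictionary.

```python
def find_drift(desired, actual):
    """Return {setting: (desired_value, actual_value)} for every mismatch."""
    drift = {}
    # Look at settings from both sides so undocumented extras are reported too.
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical settings, purely for illustration.
desired_state = {"instances": 2, "tls": "1.2", "region": "uksouth"}
actual_state = {"instances": 3, "tls": "1.2", "region": "uksouth"}

if __name__ == "__main__":
    for setting, (want, have) in find_drift(desired_state, actual_state).items():
        print(f"{setting}: documented {want}, found {have}")
```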

The more humans you have in a process, the more opportunities for human error. Automation doesn’t remove mistakes, but it makes it much easier to stop the same mistake happening twice.

Categories
development programming

Modular doesn’t mean interchangeable

Microservices are great. Containers are great. Packages are great. Objects are great. Functions are great. At each level they encapsulate functionality into re-usable modules. One place to make changes. One place to optimise. One abstraction to help you move to the best layer.

For the sake of argument though, I’m going to focus on microservices, specifically those accessed via an API, as that’s what I know best.

Interfaces and APIs are good for test harnesses, but isolating systems for your own benefit (to make modules open for extension but closed for change) bears no relation to making that module useful to other systems.

Sure, I could rip out the modules that contain business logic, and package them as a framework, but that doesn’t mean they’re useful for anyone else. The result could well be an inner system. It may still be coupled to domain/architectural knowledge that makes it too valuable to swap out. There could be an implicit dependency on just that one version of that one tool.

It is of course possible to make modules re-usable, but they have to be designed that way, whether up front, or evolved. And anything interchangeable has to be tested, in at least 2 conditions, to make sure it’s suitably robust.
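One way to test an interface in at least 2 conditions is a shared contract check that every implementation has to pass. This is only a sketch: KeyValueStore, InMemoryStore and JsonFileStore are hypothetical names, and a real contract would cover far more behaviour (concurrency, failures, size limits) than these few assertions.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """The interface both implementations claim to satisfy."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)


class JsonFileStore:
    """Keeps everything in one JSON file; simple, not fast."""

    def __init__(self, path):
        self._path = Path(path)

    def _load(self):
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text())

    def put(self, key, value):
        items = self._load()
        items[key] = value
        self._path.write_text(json.dumps(items))

    def get(self, key):
        return self._load().get(key)


def check_store_contract(store: KeyValueStore) -> None:
    """The same expectations, whichever implementation is plugged in."""
    assert store.get("missing") is None
    store.put("a", "1")
    assert store.get("a") == "1"
    store.put("a", "2")  # overwriting must replace, not append
    assert store.get("a") == "2"


def run_contract_against_all():
    """Run the contract against each store and return the names checked."""
    checked = []
    with tempfile.TemporaryDirectory() as folder:
        stores = [InMemoryStore(), JsonFileStore(Path(folder) / "store.json")]
        for store in stores:
            check_store_contract(store)
            checked.append(type(store).__name__)
    return checked
```

Passing the contract shows the two stores agree on the behaviour you thought to write down. It says nothing about the behaviour you didn’t, which is where the implicit dependencies hide.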

Design an interface. Evolve the design. But don’t mistake solving one problem in one place for a general solution. And don’t think that adding an interface is all it takes to turn a local SQLite data store into a cloud file system repository.

Categories
development

Blank slate

I’m a fan of the original Sherlock Holmes books, and there are a few things in them that talk about an enlightenment way of thinking that’s useful for a developer when gathering requirements (whether tomes, user stories, backlog ideas, etc.), writing code or approaching debugging.

Today I want to talk a bit about A Study in Scarlet, and the early conversations between Holmes and Watson before their first case together, because they capture the essence of his enlightenment thinking very well.

Ignore Irrelevant detail

Watson tells Holmes that the earth revolves around the sun, as he is surprised that an educated man does not know this. In reply, Holmes says,
“Now that I do know it I shall do my best to forget it.”

It’s not an important factor in any of his cases, it doesn’t affect his work, so it’s not a fact he needs to record. He accepts it but has no way to act on that information, so he dismisses it.

It’s something that is often hard to do in requirements gathering and may need to be done retrospectively after you find out whether Andy from finance’s love for Richard Branson is just infatuation, or it means that every call to action across your site will need to be red.

It’s also something to apply more generally: choose what to ignore, and don’t learn every new JS framework. Don’t expect to be an expert in everything. Be content with being T-shaped, and become an expert in solving problems and focusing on the right details.

The Book of Life

“it attempted to show how much an observant man might learn by an accurate and systematic examination of all that came in his way.”

The truth is never simple. Whatever you build will be part of an ecosystem comprised of other software, of manual processes, of fallible humans, and the winds of fate.

Gather whatever information you can, in whatever detail you can. Organise it and understand the bigger context.

The Phoenix Project has a great understanding of this in the discussion of the SOX-404 audit, where the IT department are busy worrying about controls on software without understanding the manual processes surrounding the software and how that fits into the compliance picture.

Speculation

On their journey to their first case together, Watson is keen to speculate about the motives and means that led to the crime, extrapolating from what little information they got from the initial introduction.

Holmes quickly and sternly cut Watson off from that line of thinking. He was clear that he would only base hypotheses on the causes and consequences once he had evidence before him to narrow down the possibilities, removing the impossible.

By all means speculate and make hypotheses on why the system failed, on what will improve sales, on what the performance bottleneck is really going to be. But without data or a way to test them, they are useless for reaching understanding. Sometimes the beauty of a forest can only be appreciated by the birds.

Remove your preconceptions and bias from judgement by understanding what they are. Follow the data instead of your gut.

Categories
development

When the old guard are not heroes

When I started programming, there were some key figures identified as the experts. They wrote the books that laid out how my code should look, how I should behave, and what the purpose of a software developer was. And there’s a new guard, also steeped in that mythos, publishing their hot takes and simple books.

I looked to my peers and seniors for guidance, and they looked to the agile manifesto, the gang of four and others. Those who packaged up existing ideas and best practices, set against a growing trend of Taylorism, and wrapped themselves in a punk ethos fighting the power, wielding keyboards like axes.

And there was value in what they were selling, small pieces that fed my hunger for “something better”.

But for all the bravado, the facade of Woody Guthrie and Bruce Springsteen they presented, deep down they were the establishment, tweaking the system slightly for their own benefit, re-engaging the bro culture of the “frontier spirit” of the early internet. Rules and processes are dictated by “the man” to restrict our freedom to innovate, and to work in the best way.

Companies formed and rose on the counter-culture ethic, buoyed by the vacuum left as union power fell, persuading us all that tech was the new utopia, and that it brought a meritocracy that would bring equality.

And the gods saw agile, and it was good.

We moved fast and broke stuff, re-wrote the rules. The industry didn’t take time to consider why those rules existed, just that they were roadblocks.

So when women and people of colour, and anyone else who wasn’t a bro, stepped into the arena, they were vulnerable. Because there were still rules, just no process. And the rules aren’t made for them. And the gatekeepers, clear in their self-image that they weren’t racist, or sexist, and only discriminated on merit, could not comprehend that processes and written rules exist to limit privilege, because equality means nothing without equity.

They asked all of us, the squirrels, the rhinos, the fish, and the peacocks, to climb trees and collect nuts to demonstrate our worthiness to join their club, without regard for our skills or our backgrounds. They bully, they fight, they protect their own, because rules weigh us down, man.


  • People over process, so long as the people are the right people.
  • Working software, so long as it works for me. Anything else is externalities and not my problem.
  • Collaboration over contracts, we’re all on the same team, so forgive me as I forgive you.
  • Respond to change over following a plan, but is there a vision that informs what changes are acceptable?

Agile still has its place, but what would it look like if the manifesto and the guiding principles had been laid out by a more representative group? The antecedents, stretching right back through NASA, Bletchley Park and The Difference Engine, remain the same, and the key lessons are independent of those teachers, but what did we miss by not having others at the table?

  • What does process that protects people look like? Process that keeps people safe and secure to deliver the best solution?
  • What does software that works for everyone look like? Beyond documentation on accessibility and anti-discrimination.
  • What does proper, honest, compassionate collaboration look like? Within teams, between teams and across disciplines?
  • What does meaningful, people-centered change look like?

Software is too important to be left to the swamp that is this libertarian mess. The current ways are driving great developers out of tech, and users are not fairly represented. Some of the old guard recognise this, but many don’t, and they have plenty of followers who still believe the meritocracy myth.

There is no level playing field in tech.

Not everyone has spare time for technical exams outside work hours. Not everyone is comfortable pair programming. Not everyone can follow a spoken stand-up. Not everyone feels safe to bring their full self to work. Everyone is not treated the same. Not everyone can have alcohol and pizza. Not everyone has the experience of standing up to authority figures. Not everyone can be themselves without being judged or mocked by the cis straight white male gatekeepers, and their supporters.

There are precious few unions, and the biggest tech companies are all struggling with human rights, fair employment and treating their users fairly. And it’s not just them.

Unlike the 20th century, it’s not a battle between management and worker. Agile has cast management and worker on the same side, but some workers are far more equal than others.

People are not supported by the process. Software doesn’t work for everyone. Collaboration is limited because contracts don’t protect and promote safety. The plan for meritocracy to end discrimination isn’t working. How are you going to change it?

Categories
development leadership

Get uncomfortable

This blog is about technical topics and being a technical lead. Understanding architecture is part of it, but if you don’t reflect on understanding people then you will never be a leader. I don’t have all the answers but it starts with accepting you can learn from others, and some of what they share will not fit with your current world view. Be ready to tear down and rebuild some foundations if you’re a technical person willing to become a leader. In the spirit of learning I’ve included the original tweets at the bottom for context so you can see the rest of the conversation.

Data needs the right questions

I trust science, so data beats anecdotes, but once I started to listen to people’s stories, I realized the people collecting the data weren’t asking important questions. “How many women applied for this job?” “How many targets of violent crime are gay?” “Are black men targeted by stop and search?” Some of these questions now have better answers, but there’s still a lot of questions that don’t even occur to people to ask.

I learned a lot from my wife in that respect – the importance of qualitative research to make sure you’re asking the right questions. But qualitative is “soft science” so isn’t as well respected, despite being fundamental to getting it right. Sound familiar?

And then you have to ask why the data isn’t collected. Is it a blind spot, or do people earnestly not want to know because then they’d have to face up to uncomfortable truths that their image of themselves does not match how others see them? Our egos are fragile, which is why I have to work hard and compassionately with new developers to understand ego-less code and collective ownership. Vulnerability is hard, especially for men in tech, and that manifests itself in many defensive micro-aggressions.

I’m not going to talk about toxic masculinity here, but please go watch The Mask You Live In to hear a US perspective on how “manning up” is creating a toxic environment.

Code needs the right questions

Yeah, code is for solving the problems you know about, but how do you solve the problems you don’t know about?

If someone calls you a snowflake, or an sjw, just for asking a question, there’s a very important reason they don’t want you to ask that question : they’re scared of the answer.

Let’s be clear, science and data matter, otherwise the opinion of some white guy who can’t keep his job at Google is worth as much as someone who has collected, reviewed and summarised all the data. But we all need to be sure that the right data is collected and the right questions are asked.

To be honest, I started down this path because I saw my friends were hurting, and they helped me understand homophobia, and then I started seeing where everyone else was disadvantaged. Having a friend to guide you is the best way to open your eyes.

Why should you change? Because “we’ve always done it this way” is the worst justification for anything. Because if you find out something is broken, it should make you uncomfortable. And if you think nothing is broken then you shouldn’t be writing software and fixing problems.

Make space

This isn’t about feelings, or political correctness or any of that. This is about you doing your job, understanding the domain that the technology you create sits within. It’s about bringing your full self to work, and making sure everyone else on the team has that opportunity too. And if they don’t want to talk about what they did at the weekend, that’s their choice too.

If you can’t make space to accept that other points of view are valid, that technology mediates access and knowledge, and your code will directly impact someone’s ability to access that, you should not be in this industry. Make space for someone who gives a shit about the users, and the wider community affected by every decision you make in every line of code you write and review, and every interaction associated with that code.

Is it exhausting? It can be. Especially at first. But you know what’s really exhausting? Fighting… technology…

Every

Step

Of

The

….

Way.

Don’t accept the status quo

Don’t be the developer that makes their colleagues rage quit. Or the one whose users curse every day they’re stuck with your software because you didn’t ask the right questions.

Did you test your facial recognition on black faces?

Can blind people order pizza on your website?

Is every woman on your team made of regular polygons and has regular periods?

Question everything. The truth is out there if you care to look. Other people should not be alien to you.


Categories
code cosmosdb data development

Cosmosdb and Heterogeneous data

[Image: a selection of different watches. They all tell the time, but some are analogue, some are digital, some are branded, and some are not. Caption: Same, but different]

CosmosDb, in common with other NoSQL databases, is schema-free. In other words, it doesn’t validate incoming data by default. This is a feature, not a bug. But it’s a dramatic change in thinking, akin to moving to a dynamically typed language from a statically typed one (and not, as it might first appear, moving from a strongly typed to a weakly typed one).

For those of us coming from a SQL or OO background, it’s tempting to use objects, possibly nested, to represent and validate the data, and hence encourage all the data within a collection to have the same structure (give or take some optional fields). This works, but it doesn’t provide all the benefits of moving away from a structured database. And it inherits from classic ORMs the migration problem when the objects and schema need to change. It can very easily lead to a fragile big-bang deployment.

For those of us who are used to dynamic languages and comfortable with Python’s duck typing or the optional-by-default sparse mapping required to use continuously-versioned JSON-based RESTful services, there’s an obvious alternative. Be generous in what you accept.

If I have a smart home, packed with sensors, I could create a subset of core data with time, sensor identifier and a warning flag. So long as the website knows if that identifier is a smoke alarm or a thermostat, it can alert the user appropriately. But on top of that, the smoke alarm can store particle count, battery level, mains power status, a flag for test mode enabled, and the thermostat can have a temperature value, current programme state, boiler status, etc, both tied into the same stream.
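Here is a small sketch of what that stream might look like, using plain Python dictionaries as stand-ins for the JSON documents. The field names (sensorId, particleCount and so on), the identifiers and the timestamps are all my own assumptions for illustration, not a schema CosmosDb requires.

```python
# Every reading shares the core fields: time, sensorId and warning.
# Everything else is specific to the kind of sensor that sent it.
readings = [
    {
        "time": "2019-04-27T10:00:00Z",
        "sensorId": "smoke-hall",
        "warning": True,
        "particleCount": 420,
        "batteryLevel": 0.8,
        "mainsPower": True,
        "testMode": False,
    },
    {
        "time": "2019-04-27T10:00:05Z",
        "sensorId": "thermo-lounge",
        "warning": False,
        "temperature": 19.5,
        "programme": "comfort",
        "boilerOn": True,
    },
]

# The website only needs to know what each identifier is.
SENSOR_TYPES = {"smoke-hall": "smoke alarm", "thermo-lounge": "thermostat"}


def alerts(documents):
    """Build alert messages using only the shared core fields."""
    messages = []
    for doc in documents:
        if doc.get("warning"):
            kind = SENSOR_TYPES.get(doc["sensorId"], "unknown sensor")
            messages.append(f"{kind} {doc['sensorId']} raised a warning at {doc['time']}")
    return messages


if __name__ == "__main__":
    for message in alerts(readings):
        print(message)
```

The alerting code never touches particleCount or temperature, so a new kind of sensor can join the stream without changing it.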

Why would I want to do this?

Versioning

Have historic and current data from a device/user in one place, recorded exactly as it was delivered (so that you can tweak the algorithm to fix that time-drift bug) rather than having to reformat all your historical data when you know only a small subset will ever be read again.
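One way to live with several versions in the same collection is a tolerant reader that normalises at read time and leaves the stored documents untouched. This sketch assumes a hypothetical older firmware that wrote a temp field and a newer one that writes temperature; neither name comes from a real device.

```python
from typing import Optional


def read_temperature(document: dict) -> Optional[float]:
    """Return the temperature from any known document version, or None."""
    # Newest field name first, then the (hypothetical) legacy one.
    for field in ("temperature", "temp"):
        if field in document:
            return float(document[field])
    return None


# Stored exactly as delivered; nothing is migrated.
history = [
    {"sensorId": "thermo-lounge", "time": "2018-11-02T08:00:00Z", "temp": 18},
    {"sensorId": "thermo-lounge", "time": "2019-04-27T08:00:00Z", "temperature": 19.5},
    {"sensorId": "smoke-hall", "time": "2019-04-27T08:00:00Z", "particleCount": 12},
]

if __name__ == "__main__":
    for doc in history:
        print(doc["time"], read_temperature(doc))
```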

Data siblings

Take all the similar data together for unified analysis – such as multiple thermostat models with the same base properties but different configurations. This allows you to generate a temperature trend across devices, even as the sensors change, even if the sensors are all from different manufacturers, and across anything with a temperature sensor.
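As a rough illustration, a trend can be built from whichever documents happen to carry a temperature, ignoring everything else about them. The model names, field names and values here are invented, and in practice you would probably push this aggregation into a query rather than do it in application code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical documents from different devices in the same collection.
documents = [
    {"time": "2019-04-26T09:00:00Z", "model": "thermo-a", "temperature": 19.0, "programme": "eco"},
    {"time": "2019-04-26T21:00:00Z", "model": "thermo-b", "temperature": 21.0, "boilerOn": True},
    {"time": "2019-04-26T21:00:00Z", "model": "smoke-x", "particleCount": 12},
    {"time": "2019-04-27T09:00:00Z", "model": "thermo-b", "temperature": 20.5, "boilerOn": False},
]


def daily_mean_temperature(docs):
    """Average temperature per day across every document that has one."""
    by_day = defaultdict(list)
    for doc in docs:
        if "temperature" in doc:
            day = doc["time"][:10]  # the YYYY-MM-DD part of the timestamp
            by_day[day].append(doc["temperature"])
    return {day: mean(values) for day, values in sorted(by_day.items())}


if __name__ == "__main__":
    for day, value in daily_mean_temperature(documents).items():
        print(day, value)
```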

Co-location

If you’re making good use of CosmosDb partitions you may want to keep certain data within a partition to optimise queries. For example, a customer, all of their devices, and aggregated summaries of their activity. You can do this by partitioning on the customer id, and collecting the different types of data into one collection.
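The shape of that data might look something like the sketch below. It is an in-memory stand-in and not the CosmosDb SDK: the list plays the role of the collection, customerId is an assumed partition key, and type is an assumed discriminator field that tells the different kinds of document apart.

```python
# One collection, several kinds of document, all carrying the partition key.
collection = [
    {"id": "customer-42", "customerId": "42", "type": "customer", "name": "Ada"},
    {"id": "device-1", "customerId": "42", "type": "device", "model": "thermostat"},
    {"id": "device-2", "customerId": "42", "type": "device", "model": "smoke alarm"},
    {"id": "summary-2019-04", "customerId": "42", "type": "activitySummary", "alerts": 3},
    {"id": "customer-7", "customerId": "7", "type": "customer", "name": "Grace"},
]


def read_partition(documents, customer_id):
    """Everything for one customer: the equivalent of a single-partition read."""
    return [doc for doc in documents if doc["customerId"] == customer_id]


def of_type(documents, doc_type):
    """Pick out one kind of document using the discriminator field."""
    return [doc for doc in documents if doc["type"] == doc_type]


if __name__ == "__main__":
    everything_for_42 = read_partition(collection, "42")
    print(len(everything_for_42), "documents in the partition")
    print([d["model"] for d in of_type(everything_for_42, "device")])
```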

Conclusion

NoSQL is not 3NF, so throw out those textbooks and start thinking of data as more dynamic and freeform. You can still enforce structure if you want to, but think about whether you’re causing yourself pain further down the road.

Check out @craignicol’s Tweet: https://twitter.com/craignicol/status/1122224379658633217?s=09

Categories
development

CodeCraftConf 2019 : What is data anyway? (Answers)

Here are my thoughts on data following my guided conversation at CodeCraftConf 2019. The questions I asked are also available on GitHub if you would like to fork and modify them for your own use.

Most developers are data driven: they start with the data structure, not the algorithm. Either data driven design, or the Merise Methodology.

Data, whilst often divided by microservice, is often stored on the same server/cluster, creating a monolith behind the microservices.

Not all data access is secured and audited, although there does appear to be a trend to on-behalf-of flows through the microservice, allowing user-centered access control. Strict data access design is prevalent, although the efficacy was less clear, and strict design applies to all data, including publicly available data.

Keeping sight of data in distributed systems is hard. Jepsen was suggested as one resource to help, but I’m happy to hear of others.

As well as data that can be used to discriminate by collecting gender, name, postcode etc., we also discussed how missing data can be used to discriminate, such as when Glasgow accents aren’t included in voice training data, or when women aren’t used in medical trials.

There’s also the big and growing problem of data collected by people who do not consider the discrimination or privacy implications. For a biologist, DNA is a puzzle that helps them decode cancer, and more examples make the puzzle easier to solve. But for others, DNA is a tool to map insurance risk, to find criminals, and to track down family members whether or not they want to be found. How do we train everyone else to understand?

And the takeaway question : what questions aren’t you asking about your data?

Categories
data development

CodeCraftConf 2019 : What is data anyway? (Questions)

Here are the questions I asked during my guided conversation at CodeCraftConf 2019. They are also available on GitHub if you would like to fork and modify them for your own use. Thank you to everyone who came to the discussion. I will post a follow-up to discuss some of the interesting answers.

What is data anyway?

Navigating SQL, NoSQL, JSON and how to work with data in a post-RDBMS, big-data world

Questions

Data modelling

  1. When designing a system, do you start with the data or the code?
  2. Has the rise of cloud based or non relational data stores changed how we model our data?
  3. Do you need to update your data when the models in the code change? How do you do it?
  4. Does all your data have to have the same shape?
  5. Should the data you expose to the outside world broadly match the data at rest?

Data security

  1. How do you secure your data?
  2. In light of GDPR, how do you ensure you aren’t collecting too much data?
  3. Who has access to your data?
     • Do you know if anyone unauthorised has accessed it?
  4. How do you protect yourself against bad data and trojan data?
     • Bad data = data that is fake, or is used for real world attacks
     • Trojan data = data that can compromise your or your customer’s systems

Ethical data

  1. Can your data be used to discriminate?
     • Can you prove it?
     • Is your data biased?
     • Are you recording hidden correlations? (ZIP code suggests race)
  2. Who owns your data?
  3. What questions aren’t you asking?

Unused questions

  1. What makes data big?
  2. Are you collecting the right data?
  3. Is the data you’re collecting right?
  4. Where is your data?

Technology choices

  1. Do you still have a place for traditional RDBMS?

Categories
development leadership

Unsuccessful Teams

Are you making space?

Game Outcomes

In a previous post, I looked at how to create successful teams, and looked at the Game Outcomes project as a useful formulation.

Some of these points are about avoiding negatives and that’s what I want to focus on here.

The most important indicators for success from the Game Outcomes project are:

  1. Great game development teams have a clear, shared vision of the game design and the development plan and an infectious enthusiasm for that vision.
  2. Great game development teams carefully manage the risks to the design vision and the development plan.
  3. Members of great game development teams buy into the decisions that are made.
  4. Great game development teams avoid crunch (overtime).
  5. Great gamedev teams build an environment where it’s safe to take a risk and stick your neck out to say what needs to be said.
  6. Great gamedev teams do everything they can to minimize turnover and avoid changing the team composition except for growing it when needed. This includes avoiding disruptive re-organizations as much as possible.
  7. Great gamedev teams resolve interpersonal conflicts swiftly and professionally.
  8. Great gamedev teams have a clearly-defined mission statement and/or set of values, which they genuinely buy into and believe in. This matters FAR more than you might think.
  9. Great gamedev teams keep the feedback loop going strong. No one should go too long without receiving feedback on their work.
  10. Great gamedev teams celebrate novel ideas, even if they don’t achieve their intended result. All team members need the freedom to fail, especially creative ones.

Confounding factors

Overtime and crunch

Deadlines are good, to a point. They help focus. With a clear goal and a timebox it’s much easier to discard sandbags and maintain motivation. Many personal productivity schemes rely on setting yourself deadlines.

When those deadlines are too restrictive, however, the product will suffer. Teams will work late and produce lower quality work. They will cut corners. If time is fixed then either scope or quality or both need to be cut.

Unplanned or persistent overtime is a critical bug and needs to be prioritized as such. There’s no such thing as completely bug-free, but you should always be aiming for zero.

Mono-cultures and silos

Cross-functional teams make better decisions faster. That’s a lesson I learned the hard way. The consumer of the API should sit in the same room as the producer. Even better, at the same desk, or in the same chair.

It’s not just technology silos that cause problems. If your team is a straight cis able-bodied English-speaking white male silo, or functionally equivalent to one, then it will fail at every interface with someone outside that group. Widening your team doesn’t stop those failures, but if you manage the team properly, the failures are fixed within the team (with the goal of fixing them before the code is written) rather than experienced by consumers.

Diverse teams are also more creative.

Inter-personal conflict

Successful teams don’t shy away from conflict. Open discussion, even a heated one, clears the air rather than letting micro-vexations and micro-aggressions become the norm and harm the team, one papercut at a time.

Successful teams solve disagreements in the open. People first, then process after, to remind everyone of decisions made.

Punishment driven development

Feedback is great. Tracking progress is useful. But please be sure you are tracking the right thing.

I can’t say this any better than Louise Elliot, who talks about all the ways measuring the wrong thing can seriously affect a team. Video is below, but you can also listen to her talking about Punishment Driven Development on the .Net Rocks podcast, if that’s more your style.

Are you unsuccessful?

What dysfunctions have you seen in your teams now or in the past? How have you fixed them, or how will you?