Categories
code cosmosdb data development

Cosmosdb and Heterogeneous data

A selection of different watches. They all tell the time, but some are analogue, some are digital, some are branded, and some are not.
Same, but different

CosmosDb, in common with other NoSQL databases, is schema-free. In other words, it doesn’t validate incoming data by default. This is a feature, not a bug. But it’s a dramatic change in thinking, akin to moving to a dynamically typed language from a statically typed one (and not, as it might first appear, moving from a strongly typed to a weakly typed one).

For those of us coming from a SQL or OO background, it’s tempting to use objects, possibly nested, to represent and validate the data, and hence encourage all the data within a collection to have the same structure (give or take some optional fields). This works, but it doesn’t provide all the benefits of moving away from a structured database. And it inherits from classic ORMs the migration problem when the objects and schema need to change. It can very easily lead to a fragile big-bang deployment.

For those of us who are used to dynamic languages, and are comfortable with Python’s duck typing or the optional-by-default sparse mapping needed to consume continuously versioned JSON-based RESTful services, there’s an obvious alternative: be generous in what you accept.

If I have a smart home, packed with sensors, I could create a subset of core data with time, sensor identifier and a warning flag. So long as the website knows if that identifier is a smoke alarm or a thermostat, it can alert the user appropriately. But on top of that, the smoke alarm can store particle count, battery level, mains power status, a flag for test mode enabled, and the thermostat can have a temperature value, current programme state, boiler status, etc, both tied into the same stream.
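As a rough sketch of that idea in Python: validate only the shared core and carry everything else along untouched. The field names (`time`, `sensor_id`, `warning`, `particle_count`, `temperature_c` and so on) and the `parse_reading` helper are my own illustrative assumptions, not anything CosmosDb requires.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical core fields shared by every sensor document.
CORE_FIELDS = ("time", "sensor_id", "warning")


@dataclass
class Reading:
    time: str
    sensor_id: str
    warning: bool
    # Device-specific values are kept as-is rather than validated.
    extras: dict[str, Any] = field(default_factory=dict)


def parse_reading(doc: dict[str, Any]) -> Reading:
    """Check only the core fields; accept any other fields unchanged."""
    missing = [name for name in CORE_FIELDS if name not in doc]
    if missing:
        raise ValueError(f"missing core fields: {missing}")
    extras = {k: v for k, v in doc.items() if k not in CORE_FIELDS}
    return Reading(doc["time"], doc["sensor_id"], bool(doc["warning"]), extras)


# Two differently shaped documents in the same stream (made-up values).
smoke_alarm = {
    "time": "2019-04-27T10:00:00Z", "sensor_id": "smoke-hall", "warning": True,
    "particle_count": 412, "battery_level": 0.83, "mains_power": True,
    "test_mode": False,
}
thermostat = {
    "time": "2019-04-27T10:00:05Z", "sensor_id": "thermo-lounge", "warning": False,
    "temperature_c": 19.5, "programme_state": "comfort", "boiler_on": True,
}

stream = [parse_reading(smoke_alarm), parse_reading(thermostat)]
alerts = [reading.sensor_id for reading in stream if reading.warning]
```

The alerting code only ever touches the core fields, so a new sensor type can join the stream without any change to it.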

Why would I want to do this?

Versioning

Keep historic and current data from a device or user in one place, recorded exactly as it was delivered (so that you can tweak the algorithm to fix that time-drift bug), rather than reformatting all your historical data when you know only a small subset will ever be read again.
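One way to picture this, as a minimal sketch: leave the stored documents alone and apply the fix when reading. The `schema_version` field, the two document shapes, and the 30-second drift correction are all hypothetical, chosen only to illustrate the approach.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical: first-generation devices reported clocks running 30 seconds fast.
V1_CLOCK_DRIFT = timedelta(seconds=30)


def reading_time(doc):
    """Return the corrected UTC time for a document of any known version."""
    version = doc.get("schema_version", 1)  # old documents carry no version
    if version == 1:
        # v1 stored epoch seconds in "ts"; correct the drift at read time.
        raw = datetime.fromtimestamp(doc["ts"], tz=timezone.utc)
        return raw - V1_CLOCK_DRIFT
    # v2 onwards stores an ISO 8601 string in "time".
    return datetime.fromisoformat(doc["time"])


# Old and new documents side by side, exactly as they were delivered.
stored = [
    {"sensor_id": "thermo-1", "ts": 1556359230, "temp": 19.0},
    {"schema_version": 2, "sensor_id": "thermo-1",
     "time": "2019-04-27T10:05:00+00:00", "temperature_c": 19.5},
]

times = [reading_time(doc) for doc in stored]
```

If the drift estimate turns out to be wrong, only `V1_CLOCK_DRIFT` changes; none of the stored history has to be rewritten.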

Data siblings

Keep similar data together for unified analysis, such as multiple thermostat models with the same base properties but different configurations. This allows you to generate a temperature trend across devices, even as the sensors change, even if they all come from different manufacturers, and across anything with a temperature sensor.
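A duck-typed sketch of that trend calculation follows: anything that carries a temperature takes part, and everything else is skipped. The field names (`temperature_c`, `temperature_f`, `hour`) and the sample documents are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean


def temperature_c(doc):
    """Return a Celsius temperature if the document has one, else None."""
    if "temperature_c" in doc:
        return float(doc["temperature_c"])
    if "temperature_f" in doc:  # e.g. a manufacturer that reports Fahrenheit
        return (float(doc["temperature_f"]) - 32) * 5 / 9
    return None


# Sibling documents from different devices in one collection (made-up data).
docs = [
    {"hour": "10:00", "sensor_id": "thermo-a", "temperature_c": 19.0, "boiler_on": True},
    {"hour": "10:00", "sensor_id": "thermo-b", "temperature_f": 68.0, "programme": "eco"},
    {"hour": "10:00", "sensor_id": "smoke-hall", "particle_count": 12},
    {"hour": "11:00", "sensor_id": "weather-1", "temperature_c": 18.0, "humidity": 0.6},
    {"hour": "11:00", "sensor_id": "thermo-a", "temperature_c": 20.0},
]

by_hour = defaultdict(list)
for doc in docs:
    value = temperature_c(doc)
    if value is not None:  # the smoke alarm has no temperature, so it is ignored
        by_hour[doc["hour"]].append(value)

trend = {hour: round(mean(values), 1) for hour, values in sorted(by_hour.items())}
```

Supporting a new manufacturer means adding one more branch to `temperature_c`, not migrating the collection.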

Co-location

If you’re making good use of CosmosDb partitions, you may want to keep certain data within a partition to optimise queries. For example, a customer, all of their devices, and aggregated summaries of their activity. You can do this by partitioning on the customer id, and collecting the different types of data into one collection.
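To show the shape of the data rather than any particular SDK, here is a small in-memory sketch. The `customer_id` partition key, the `type` discriminator field and the sample documents are assumptions for illustration; a plain dictionary stands in for the partitioned collection.

```python
from collections import defaultdict

# Different document types sharing one collection, keyed by customer_id.
documents = [
    {"id": "cust-42", "customer_id": "cust-42", "type": "customer", "name": "Ada"},
    {"id": "dev-1", "customer_id": "cust-42", "type": "device", "kind": "thermostat"},
    {"id": "dev-2", "customer_id": "cust-42", "type": "device", "kind": "smoke_alarm"},
    {"id": "sum-2019-04", "customer_id": "cust-42", "type": "summary", "alerts": 3},
    {"id": "cust-7", "customer_id": "cust-7", "type": "customer", "name": "Grace"},
]

# Stand-in for the database: one bucket per partition key value.
partitions = defaultdict(list)
for doc in documents:
    partitions[doc["customer_id"]].append(doc)


def customer_view(customer_id):
    """Gather everything about one customer from a single partition."""
    view = defaultdict(list)
    for doc in partitions.get(customer_id, []):
        view[doc["type"]].append(doc)
    return dict(view)


view = customer_view("cust-42")
```

Everything needed for one customer’s page comes from a single bucket, which is the property you are after when you choose the customer id as the partition key.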

Conclusion

NoSQL is not 3NF, so throw out those textbooks and start thinking of data as more dynamic and freeform. You can still enforce structure if you want to, but think about whether you’re causing yourself pain further down the road.

Check out @craignicol’s Tweet: https://twitter.com/craignicol/status/1122224379658633217?s=09

Categories
development

CodeCraftConf 2019 : What is data anyway? (Answers)

Here are my thoughts on data following my guided conversation at CodeCraftConf 2019. The questions I asked are also available on GitHub if you would like to fork and modify them for your own use.

Most developers are data driven: they start with the data structure, not the algorithm. That means either data-driven design or the Merise methodology.

Data, whilst often divided by microservice, is often stored on the same server/cluster, creating a monolith behind the microservices.

Not all data access is secured and audited, although there does appear to be a trend to on-behalf-of flows through the microservice, allowing user-centered access control. Strict data access design is prevalent, although the efficacy was less clear, and strict design applies to all data, including publicly available data.

Keeping sight of data in distributed systems is hard. Jepsen was suggested as one resource to help, but I’m happy to hear of others.

As well as data that can be used to discriminate by collecting gender, name, postcode etc., we also discussed how missing data can be used to discriminate, such as when Glasgow accents aren’t included in voice training data, or when women aren’t used in medical trials.

There’s also the big and growing problem of data collected by people who do not consider the discrimination or privacy implications. For a biologist, DNA is a puzzle that helps them decode cancer, and more examples make the puzzle easier to solve. But for others, DNA is a tool to map insurance risk, to find criminals, and to track down family members whether or not they want to be found. How do we train everyone else to understand?

And the takeaway question : what questions aren’t you asking about your data?

Categories
data development

CodeCraftConf 2019 : What is data anyway? (Questions)

Here are the questions I asked during my guided conversation at CodeCraftConf 2019. They are also available on GitHub if you would like to fork and modify them for your own use. Thank you to everyone who came to the discussion. I will post a follow-up to discuss some of the interesting answers.

What is data anyway?

Navigating SQL, NoSQL, JSON and how to work with data in a post-RDBMS, big-data world

Questions

Data modelling

  1. When designing a system, do you start with the data or the code?
  2. Has the rise of cloud based or non relational data stores changed how we model our data?
  3. Do you need to update your data when the models in the code change? How do you do it?
  4. Does all your data have to have the same shape?
  5. Should the data you expose to the outside world broadly match the data at rest?

Data security

  1. How do you secure your data?
  2. In light of GDPR, how do you ensure you aren’t collecting too much data?
  3. Who has access to your data?
     • Do you know if anyone unauthorised has accessed it?
  4. How do you protect yourself against bad data and trojan data?
     • Bad data = data that is fake, or is used for real world attacks
     • Trojan data = data that can compromise your or your customer’s systems

Ethical data

  1. Can your data be used to discriminate?
     • Can you prove it?
     • Is your data biased?
     • Are you recording hidden correlations? (ZIP code suggests race)
  2. Who owns your data?
  3. What questions aren’t you asking?

Unused questions

  1. What makes data big?
  2. Are you collecting the right data?
  3. Is the data you’re collecting right?
  4. Where is your data?

Technology choices

  1. Do you still have a place for traditional RDBMS?
Categories
development leadership

Unsuccessful Teams

Are you making space?

Game Outcomes

In a previous post, I looked at how to create successful teams, and looked at the Game Outcomes project as a useful formulation.

Some of these points are about avoiding negatives and that’s what I want to focus on here.

The most important indicators for success from the Game Outcomes project are:

  1. Great game development teams have a clear, shared vision of the game design and the development plan and an infectious enthusiasm for that vision.
  2. Great game development teams carefully manage the risks to the design vision and the development plan.
  3. Members of great game development teams buy into the decisions that are made.
  4. Great game development teams avoid crunch (overtime).
  5. Great gamedev teams build an environment where it’s safe to take a risk and stick your neck out to say what needs to be said.
  6. Great gamedev teams do everything they can to minimize turnover and avoid changing the team composition except for growing it when needed. This includes avoiding disruptive re-organizations as much as possible.
  7. Great gamedev teams resolve interpersonal conflicts swiftly and professionally.
  8. Great gamedev teams have a clearly-defined mission statement and/or set of values, which they genuinely buy into and believe in. This matters FAR more than you might think.
  9. Great gamedev teams keep the feedback loop going strong. No one should go too long without receiving feedback on their work.
  10. Great gamedev teams celebrate novel ideas, even if they don’t achieve their intended result. All team members need the freedom to fail, especially creative ones.

Confounding factors

Overtime and crunch

Deadlines are good, to a point. They help focus. With a clear goal and a timebox, it’s much easier to discard sandbags and maintain motivation. Many personal productivity schemes rely on setting yourself deadlines.

When those deadlines are too restrictive, however, the product will suffer. Teams will work late and produce lower quality work. They will cut corners. If time is fixed, then scope or quality, or both, need to be cut.

Unplanned or persistent overtime is a critical bug and needs to be prioritized as such. There’s no such thing as completely bug-free, but you should always be aiming for zero.

Mono-cultures and silos

Cross-functional teams make better decisions faster. That’s a lesson I learned the hard way. The consumer of the API should sit in the same room as the producer. Even better, at the same desk, or in the same chair.

It’s not just technology silos that cause problems. If your team is a straight cis able-bodied English-speaking white male silo, or functionally equivalent to one, then it will fail at every interface with someone outside that group. Widening your team doesn’t stop those failures, but if you manage the team properly, the failures are fixed within the team (with the goal of fixing them before the code is written) rather than experienced by consumers.

Diverse teams are also more creative.

Inter-personal conflict

Successful teams don’t shy away from conflict. Open discussion, even a heated one, clears the air rather than letting micro-vexations and micro-aggressions become the norm and harm the team, one papercut at a time.

Successful teams solve disagreements in the open. People first, then process after, to remind everyone of decisions made.

Punishment driven development

Feedback is great. Tracking progress is useful. But please be sure you are tracking the right thing.

I can’t say this any better than Louise Elliot, who talks about all the ways measuring the wrong thing can seriously affect a team. Video is below, but you can also listen to her talking about Punishment Driven Development on the .Net Rocks podcast, if that’s more your style.

Are you unsuccessful?

What dysfunctions have you seen in your teams now or in the past? How have you fixed them, or how will you?

Categories
development google leadership

Successful teams

Successful teams deliver successful projects. As a lead, how do you build a successful team?

There are many factors to build a successful team, but the foundation of them all is safety. Can problems be discussed openly? Does everyone trust everyone else? And once you have that, the team can build. Build diversity, build towards a common goal, and build something that matters.

Successful Google team

Google defines successful teams according to its research at https://rework.withgoogle.com/blog/five-keys-to-a-successful-google-team/

Psychological safety: Can we take risks on this team without feeling insecure or embarrassed?
Dependability: Can we count on each other to do high quality work on time?
Structure & clarity: Are goals, roles, and execution plans on our team clear?
Meaning of work: Are we working on something that is personally important for each of us?
Impact of work: Do we fundamentally believe that the work we’re doing matters?

I accept, given multiple ongoing accusations against them about defending toxic managers and culture, that Google may not be living these values. However, these are clear statements that are supported by other studies such as The Game Outcomes project.

Penguins at Edinburgh Zoo

So how do we build a team like that?

Number one thing, and the only way I’ve found success, is to empower the team and everyone within it to make changes. Without that ownership, nothing matters.

Once you have that, you as the leader have to own the rest. Delegate where you need to, but own your team’s safety, support, direction, purpose and motivation.

Safety

Are you free to take risks and try something new?

Not everything you do will be a success, so do you celebrate knowledge and learning as a goal? “Yes, that cost us time and money, but we learned not to do that again.”

Are team members supported? When someone mansplains your tech lead, do you correct them, and ensure her voice is heard? When a deaf developer joins the team, do you ask whether they prefer lip reading or sign, and help the team adapt appropriately? Do you recognise colour? Do you use preferred pronouns?

When mistakes are made, do you find someone to blame or do you all accept responsibility to address it? If the production database can be deleted by the graduate on their first day, and there are no backups, that is never their fault.

Creating Psychological Safety in the Workplace https://hbr.org/ideacast/2019/01/creating-psychological-safety-in-the-workplace

High Quality

Do you always have a high standard?

Everyone has their code reviewed, especially the lead. Is every line of code, and every process open to review and improvement? Great that you’re agile, but if you really value people over process, write the process down, and follow it. It doesn’t mean no process, it means that process serves the people, not the other way around. It means you change it when it no longer supports the people or the product.

What are your quality standards for code, for user experience, for security, and most importantly for behaviour? How are they enforced? And are they always enforced on time, every time?

Have policies. Do not have a daily fight over tabs vs spaces.

Direction

Ask everyone on your team what the team is building. If you get more than one answer, that’s a bug.

Ask everyone which part everyone else on the team plays towards that. Does that match how they see their role? Are there any gaps in responsibility?

Ask everyone what their priority is and why. Is anyone blocked? Ask them what their next priority is and if they have everything they need to fulfil it. If not, do they know where to get it?

Purpose

Is everyone bringing their whole self to work? Do office politics make them wary? Are they in a marginalized group, expected to bring representation as well as talent, and so having to do both jobs at once?

At the office, is this the number one thing for them to be doing? Are your developers feeling stuck in support or BA? Are they frustrated that they aren’t allowed to refactor a gnarly piece of code that’s ripe for improvement because “it works, don’t touch it”?

Does everyone on the team feel empowered to speak up and to fix things where they interfere with the goal of the team?

Motivation

Ask everyone why the team is building what they’re building, and why their part is important.

How will this change the user’s day? How will it affect the company? What’s the net improvement?

The Game Outcomes formulation

If you don’t like the Google formulation, try the Game Outcomes one. There’s plenty that applies to non-game projects. There are a few negatives to avoid, and I’ll revisit them in a later post.

The most important indicators for success from the Game Outcomes project are:

  1. Great game development teams have a clear, shared vision of the game design and the development plan and an infectious enthusiasm for that vision.
  2. Great game development teams carefully manage the risks to the design vision and the development plan.
  3. Members of great game development teams buy into the decisions that are made.
  4. Great game development teams avoid crunch (overtime).
  5. Great gamedev teams build an environment where it’s safe to take a risk and stick your neck out to say what needs to be said.
  6. Great gamedev teams do everything they can to minimize turnover and avoid changing the team composition except for growing it when needed. This includes avoiding disruptive re-organizations as much as possible.
  7. Great gamedev teams resolve interpersonal conflicts swiftly and professionally.
  8. Great gamedev teams have a clearly-defined mission statement and/or set of values, which they genuinely buy into and believe in. This matters FAR more than you might think.
  9. Great gamedev teams keep the feedback loop going strong. No one should go too long without receiving feedback on their work.
  10. Great gamedev teams celebrate novel ideas, even if they don’t achieve their intended result. All team members need the freedom to fail, especially creative ones.

How do you keep your team on the right path?

We all want to work on successful projects, and there have been a couple of times in my career when I’ve been lucky enough to work in a team where everyone is delivering 10x. 10x developers don’t work in isolation; they work on teams where all the above needs are met, and they thrive off each other.

It’s great to have that dream team, but start by thinking about how to make your team reliably successful, and you’ll be doing better than most software teams.

Categories
development

Not being qualified as a graduate

I’ve seen some chat recently that cis white men tend to be overconfident in their abilities, whereas everyone else is under-confident, so those men get higher salaries, faster promotions and apply for jobs that others wouldn’t. I agree, but I also wonder how much of the impact is down to a nepotism that favours “culture fit” (such a horrible term), and how much is due to megalomaniac outliers (think any famous Silicon Valley founder) that have so much confidence they pull others into their wake.

I don’t doubt that I’ve benefited from privilege on either count, but as both ideas are equally alien to me, I am still contemplating which foundation needs to crumble to correct the imbalance.

I’ve interviewed some of the overconfident. I didn’t hire them because they couldn’t demonstrate ability to match their ego, and couldn’t fit in any team. I’ve interviewed the under-confident too, and their abilities outshone their ego. I was happy to champion them when others were unsure and my decision was validated by results.

I’ve interviewed enough people to know this is an issue, but as someone who had imposter syndrome well into my first 2 jobs, I thought it might help others to hear about my start, although I accept I will have had far fewer negative encounters as I was learning the trade by virtue of how I look and sound.

The most important person in this story is now my wife. I’d definitely recommend having a champion on your side. Someone who will push you when you have doubts. Whether it’s a friend, a family member, a mentor or a recruitment agent you can trust. Someone who will encourage you to apply for that job when you only meet 30% of the criteria, who will challenge you to be your best, and point out your flaws so you can work on them.

Graduation

Looking for jobs after university was a depressing experience. The y2k surge had passed, the .com boom was bust. I sent off CVs to what seemed like every company hiring tech staff in Scotland (I spoke to IBM and Microsoft but they only had sales jobs). I had a few interviews but every company I interviewed for announced redundancies within 6 months. One guy in my class was offered a job, and flown to the US for training, and the job was taken away whilst he was in the air, before his contract officially started.

It was a rubbish time for jobs, unless I wanted to follow my classmates to London : jobs in banks with a 70-hour week, paying far less per hour than Scotland, and with higher living costs.

I was demoralized, and starting to wonder if I’d chosen the right career. It didn’t help that the “job board” on the AI lab only had one “poster”, for a burger flipper at McDonald’s.

Another option

My wife was studying for a PhD, and found an opportunity for me at Glasgow Uni. It was something I had thought about, but I hadn’t realized there was money available to study for one. It was in audio interfaces, which allowed me to combine my love of programming with a love of music, and I picked up a lot of new skills along the way : MFC; Win32; Matlab; C++; parsing to write my own DSL; data science – feature detection, non-relational temporal data.

A PhD is hard work, so major respect to anyone else who’s achieved one. It’s not for everyone but I thoroughly enjoyed the experience. It was definitely not what I was looking for when I was job hunting, but having someone throwing left-field ideas at me helped me both to understand what I wanted, and to widen my horizon of opportunities to apply for.

Don’t be fooled into thinking “a tech job” is a software developer at a multi-national tech company. Be a tester, or an architect; work in operations, or security, or User Experience; work in academia or government.