This is the second post in this series, which refers back to the first part quite a bit, so I'd highly recommend going back and reading the first part if you haven't already. You can find it on either the tails.com tech blog or on designingoverload.
This is also a repost of the original article on the tails tech blog.
In the last post we talked about Phase 4, where things got blurry and slow. This demonstrated the need for infrastructure and systems - and specifically the need for Data Engineering. The role of a sole data engineer in an organisation is a lonely one, and the workload in this area scales rapidly (there's an excellent article by Maxime Beauchemin about the rise of data engineering that's well worth reading for background).
At first, data engineering is a bonus, a way for analysts to be more productive. But very soon, as you scale and start building layers on layers, it doesn't just become a nice-to-have; it becomes a critical piece of infrastructure.
Phase 5: Things just got serious
Look around: your data organisation probably has at least three people in it. You've probably got tools and systems for dashboards and reporting that the rest of the organisation actually rely on to do their jobs. If you've done your job right, all the evangelising you've been doing about the use of data and how to make decisions has paid off - and people now depend on data in a way that means when it's not there, it's a serious business problem. There's a lot to coordinate here and it's worth thinking about how it should all work together.
Robustness is now a major concern in a way that it hasn't been before. Robustness of infrastructure, robustness of data and robustness of insight. Let's deal with the three of these one by one.
Robustness of Infrastructure
This is the obvious one when people talk about reliability. That stuff you built - it's got to work. If you're using open source tools like Airflow or Superset (as we have done at tails.com), this is where it's worth getting someone with experience of running production infrastructure to look at your deployment and get it set up properly.
Infrastructure that you rely on for your business to function is production infrastructure - even if your customer never sees it or interacts with it.
You might also consider investing in BI tools like Tableau, Power BI or Looker and investing in hosted infrastructure solutions like Snowflake, Stitch or dbt. Remember, all the time that you spend fixing your infrastructure is time that your team isn't driving the organisation forward and time that the rest of the organisation could be running blind. There is a real cost to that downtime, and that cost could warrant solutions that up until now had been prohibitively expensive.
Robustness of Insight
Even with reliable infrastructure you can still get unreliable results if the logic applied to the data is flawed. Formalising analysis and turning it into code (and therefore into data, as I'll discuss in a second) allows you to apply the principles of code review and testing to it, but that will only ever cover a subset of the analysis you do. Beyond that, establishing a process for analytics (which can be as simple as light-touch peer review) and some guidelines for how to do good analysis will apply some governance to the outputs the team are creating. The goal of both is simply to ensure quality as the number of people doing analytics grows. As analysis becomes standardised, however, it all comes down to data.
Robustness of Data
This is less obvious than just straight up downtime, and much more harmful. Just because raw data is there, that doesn't mean it's correct.
No data is better than bad data...
The effect of bad data is that people will begin to believe that things are different from the way they actually are, and rather than remaining uncertain, will become more certain of a potentially wrong path.
There's an excellent follow-on post from Maxime Beauchemin which elaborates:
Whether you think that old-school data warehousing concepts are fading or not, the quest to achieve conformed dimensions and conformed metrics is as relevant as it ever was. Most of us still hear people saying “single source of truth” every other day. The data warehouse needs to reflect the business, and the business should have clarity on how it thinks about analytics. Conflicting nomenclature and inconsistent data across different namespaces, or “data marts” are problematic. If you want to build trust in a way that supports decision-making, you need a minimum of consistency and alignment. In modern large organisations, where hundreds of people are involved in the data generation side of the analytical process, consensus seeking is challenging, when not outright impossible in a timely fashion.
The bottom line is that we need to find ways to make sure that not only is the data available but that it can be relied upon to make decisions. To do this we can borrow a couple of approaches from the field of software engineering to help us.
- Testing. Not in the sense of an AB test, but in the sense of a validation or consistency check to confirm that things work and look as they should. At tails.com we have a handful of tests which are applied to every hourly refresh of our core numbers. Great open source tools for this are dbt and Great Expectations.
- Version Control. The goal of version control is to allow multiple people to edit something without treading on each other's toes. It has the added benefit of keeping track of who made what change, with the ability to roll back to a previous version if required. Tools like Airflow or dbt are built around version control and it would be madness to use them without it, but even some more "enterprise" software tools like Looker use it too. Newer entrants into this space include organisations like Kleene.ai.
- Code review. Increasingly we're finding our analytics logic codified and written down - not in words but in SQL or Python files. These are the definitions that drive your business, so it makes sense to put some care into how they're written. For any PhDs you may find on your data science team, the principles of peer review should feel familiar, and they can often be your best cheerleaders for this practice. The easiest way to do this can be to take a leaf from software engineering and use GitHub pull requests, but ultimately the tool is not important here - it's the people process.
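To make the testing idea above concrete, here's a minimal sketch in plain Python of the kind of consistency checks that run against each data refresh. The `orders` table, its columns, and the specific checks are illustrative assumptions, not taken from tails.com; in practice tools like dbt or Great Expectations let you express the same checks declaratively.

```python
# Sketch of automated data tests: uniqueness, non-null and sanity checks,
# in the spirit of dbt / Great Expectations. Table and column names are
# hypothetical examples, not from the original post.
import sqlite3

def run_core_data_tests(conn):
    """Return a list of failure messages; an empty list means all tests pass."""
    failures = []

    # Test 1: primary key must be unique (no duplicate order ids).
    dupes = conn.execute(
        "SELECT order_id, COUNT(*) FROM orders "
        "GROUP BY order_id HAVING COUNT(*) > 1"
    ).fetchall()
    if dupes:
        failures.append(f"duplicate order_id groups: {len(dupes)}")

    # Test 2: key columns must not be NULL.
    nulls = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE revenue IS NULL"
    ).fetchone()[0]
    if nulls:
        failures.append(f"rows with NULL revenue: {nulls}")

    # Test 3: a sanity bound - revenue should never be negative.
    negative = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE revenue < 0"
    ).fetchone()[0]
    if negative:
        failures.append(f"rows with negative revenue: {negative}")

    return failures

# Tiny demo with an in-memory database: one duplicate id, one NULL revenue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 25.0), (2, 30.0), (2, 30.0), (3, None)],
)
print(run_core_data_tests(conn))
```

The useful property is that the checks run on a schedule against live data, so a broken upstream feed surfaces as a failing test rather than as a wrong number in a dashboard.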
We touched on the subject of money, and how the trade-offs around your analytics infrastructure now mean that you'd rather trade cash for features than features for cash. Your first investment will need to be carefully chosen, if for no other reason than that it's your first major data investment and your finance team might not understand the ROI or see why you need it. After that first one, however (assuming it was a success), we move into the next and potentially most dangerous phase...
Phase 6: Spending Money
It's well worth noting that we've got through phases 1-5 without spending a lot of cash. This is intentional, because up until this point, the output of your analytics function, and the extent to which it is part of day to day decision making hasn't warranted that kind of investment. The huge number of external providers of data tools might have you believe that spending some of your hard earned cash is step 1 - resist that urge.
It's also worth noting that the very beginning of this phase is probably one of the times you'll deploy money most efficiently. When cash is tight and the bar for investment is high, you'll make some of your best decisions. Not only will the need to justify your investments protect you (to some extent) from making poor ones, it will also ensure that you regularly review what you have spent, pull the plug on anything that isn't living up to what you need, and keep your legacy tech at bay.
Broadly speaking there are three categories of things you might need to spend money on:
- Storage & compute. As we covered in phase 4 (things got slow), storage and compute make a big difference to the speed at which you can perform analytics. By this stage you'll have a good idea of what you want to optimise for; keep that clearly in mind. At the more modern end of this spectrum, tools like BigQuery, Snowflake and Redshift come up a lot. Many people will ask which is the best, and the answer is that it depends. If you can't articulate what you're optimising for at this stage, go back and reconsider what your needs are before buying one. If you still can't work it out, buy the cheapest one, knowing that you'll end up migrating off it in the future when you start to find the cracks.
- ETL (Extract, Transform, Load). This kind of tool moves data around. It seems like this should be free, but due to the sheer number of producers and consumers of data there's no formal standard, and integrations for different platforms are something you should buy, not build. Better integrations won't give your business an edge in decision-making, and building them will make your engineers very bored, so don't try - find the cheapest solution that works for you and get going.
- Visualisation. This is perhaps the area with the most options, and some excellent free offerings - but it's also the point that most of your stakeholders will interact with and so perhaps the biggest point of leverage for your team. The platform around those visuals might feel like fluff, but if your colleagues don't use whatever platform you choose then the visuals will have no effect. As with compute and storage, there is no single best offering here, and what works best for you will depend on the problems you're trying to solve at a given point in time. Talk to other businesses you know at your stage of growth, find out what worked for them and why - and use that to guide what you look for.
I mentioned that this was the most dangerous phase, and I strongly believe that is true, because it's very easy to start playing the infrastructure game. The infrastructure game is very fun, and feels like work, but is ultimately a distraction, and ultimately... a game.
The infrastructure game
The reason the game is so fun is that it involves new, shiny (and sometimes expensive) toys, with brightly coloured labels, that fit together in many ways. Each of those toys has specifications and performance to compare and contrast; they have salespeople who can be interrogated (and who might even buy you beer and give you free t-shirts); and they all have a compelling story about why theirs is the best toy to play with. Sometimes you put them together and they work, sometimes they don't; at each step of the way you learn how to put the toys together better. Ultimately, everyone will tell you how their toy (put together in the right way) will solve all your problems.
This has been a known problem in the manufacturing industry for a long time - and is just as alluring...
We shouldn't spend money optimising the current production line, we should spend money and buy a new shiny piece of machinery which will solve all our problems!
The fallacy here is believing that the new kit will be less work, or easier. It's true that the new piece of kit might result in higher output (or better quality or less downtime etc...), but there is almost certainly overhead in your current process, and if you're not willing to invest in that, then you should really consider whether you're willing to invest the time and effort (NB: not money, because that becomes increasingly easy to spend) in making sure you get the most out of any new toy.
You should concentrate on getting in an effective dashboarding tool rather than trying to explain how your sales funnel works to other teams. The self-serve functionality is so good that it will do your job for you...
Sound too good to be true? It is. That's because...
Phase 7: The hard problems are cultural
It might feel like you got here faster than you expected for such a small business - that you've already reached the domain that only large businesses end up in - but if you've played your hand well up to this point, you should get here remarkably fast. In this phase it's not always obvious what the problems are; they're much harder to see than in previous stages and much harder to solve, but the gains once they are solved are huge. It's also worth noting that it's very easy to fall back from this phase into phase 6, which is now better known as throwing money at the problem and hoping it will go away.
Keep calm and carry on, there are some clues to look for to identify those problems:
You find yourself in multiple conversations about whose numbers are correct.
Does your finance team disagree with the marketing team on the truth? Does your operations team disagree with itself?
- Don't use this as the prime argument for buying a shiny new marketing cloud.
- Remember you're one organisation, and that you should have one set of numbers to make decisions.
- Remind yourself that arguing over what the truth is adds no value.
It sounds like your organisation has no trust in a centralised set of definitions and metrics, perhaps starting a conversation about building that trust might be productive?
You find yourself as an organisation doing lots of work but never really moving the needle.
At the beginning of every planning cycle, high targets are set, long road-maps are proposed, and then at the end of the cycle, not everything has been done, and the things that have been done have indeterminate results.
- Don't immediately go out and buy an experimentation platform and assume it will solve all the problems.
- Remind yourself that people can only make good plans, if they have a good idea about how their actions affect the world.
- Remember that we can only have a good idea about how future actions will affect the world if we can learn from our past actions and their results.
It sounds like it might be worth having a combined pipeline of product tests with clear hypotheses, which you then review as a team after each feature deploy to understand what effect you're having on the world and then update your plans.
There are a heap of other scenarios we could cover, but a large number of them come down to the same two things: trust and communication. It might seem odd that a data function would end up concerned with these topics, which have traditionally been the preserve of HR or "people" functions. Reflecting on what most analysts spend their time doing, however, this might not be surprising: analysts ask great questions, they listen, and then they help structure the world - much like a great coach would.
Trust can be built in one-to-one coaching-style interactions, like an analyst working with a stakeholder on a particularly hairy problem. However, that's only part of the communication challenge. At tails.com we found that we weren't talking about some of these challenges enough, and that meant nothing happened to solve them. One of the breakthroughs was finding a framework to help people piece together all of these disparate conversations, allowing everyone to put words to some of the things they're seeing and feeling around data. For us that framework was the Data Maturity Model (developed by Carruthers & Jackson and detailed in their second book Data Driven Business Transformation).
We used this model in sessions across the organisation, starting with a description of what excellent and awful would look like on each of these segments and then having the group vote on their ranking. Each of those votes led to a discussion about what each of us were seeing to come to a consensus view of how we were doing in each area. This could then become the basis for a plan of action.
For us this also prompted a question of strategy. In starting to facilitate discussions within teams, and to upskill them on their own data journey, the question arises of how to frame and communicate what data is actually for.
Phase 8: What are you for?
What is truer than the truth?
We've just come from a phase of talking about culture, and in particular how easy it is to get drawn into arguments about what the objective truth is. Sometimes these conversations are really useful and help us discover layers of meaning that we couldn't see before, but sometimes they're not, and they just become a time sink that hinders progress rather than driving it forward. While data teams often feel like their job is to find the best truth, I think a much more accurate definition is that their job is to find the best meaning.
Diving even deeper into the idea of pursuing objective truth: so often it's "made up" on some level, built on assumption after assumption. We bring in imperfect measures of the world, make statistical calculations, assume gross simplifications, apply beliefs which are only theories rather than truths and eventually, in our best faith, provide guidance. We aim and strive for that guidance to be as close to the truth as possible - but we know the two will not always be the same. If we wanted to measure ourselves against objective truth, we would need to always have the objective truth to measure against, which we don't (and if we did, what would be the point of all that work?), and so our metric of truthfulness looks like it might always be out of reach.
This matters even more given our previous discussion of trust: if people see you as a provider of truth and you make one wrong assumption, you've damaged trust by not living up to their expectations. It also comes up in the discussion of self-serve analytics, which raises the question: if everyone can perfectly self-serve their analytics needs, if they can be their own analyst, then what are the analysts for?
The root answer to this question is nicely put by the Jewish proverb at the start of this section. Ultimately great data professionals are the ones that can extract the story from the data and give it meaning. In situations with contradictory, ambiguous and absent data points, doing that with confidence and clarity is really hard. However, that is the core skill we value in our data teams: to bring clarity, where before there was none.
The way that this is framed to be practical and useful in each organisation will be different and necessarily unique, given everyone's different starting point. I also don't think that it is static within a single organisation, but it will change over time. What I can add is perhaps my view of what I think the analytics organisation at tails.com is for, as we look ahead to the rest of 2020 and beyond.
We drive great decision making.
- How do we measure that? We measure the performance of the teams we're helping. If a team isn't performing then we're not finding a way to help them properly.
- What happens if we do some great analysis and it doesn't get used? Our job isn't to do analysis, our job is to help people make decisions to drive growth - so if it wasn't useful in helping make a decision then we did the wrong analysis.
- What happens if we're wrong? What matters is making the best decision given the information available at the time. That might still lead us down the wrong path, but we made the decision together and we'll learn from it if we get more information later. We shouldn't try to be all-knowing. Helping people understand the potential flaws in our guidance will increase trust, rather than harm it.
Phase 9 and beyond? Choices
At tails.com we've only just started to consider what comes next, so the next blog post (if there is one) may be some time in the future. What I can give you is a rough view of the road ahead of us, of what phase 9 and beyond might look like. The challenge for us has been a bit of a head-scratcher, because for several years the next step has always seemed clear. The best model I've found for articulating why it's now hard is the Kano model, in particular the concept of hygiene factors and delighters.
For the first 5 years of data at tails, we were confronted with obvious hygiene factors. The storage was too slow, we couldn't schedule things, our visualisations were poor, nobody could agree on metrics, nobody knew who could help them, we didn't have a plan (to name a few). People shouted about these things, and the challenge wasn't seeing them, but scheduling them and working out how to execute them efficiently and quickly.
Looking forward, people are shouting less, most of the hygiene factors have been met, and that means we can move onto the delighters. The hard part about these factors, especially in a field like data which is so new, is that people can't articulate what they are - that's why they're so delighted when you achieve it.
The beauty behind an excitement attribute is to spur a potential consumer's imagination; these attributes are used to help the customer discover needs that they've never thought about before.
The added point about delighters is that you don't need all of them; arguably you should focus on a few and do them really well. But that implies choice. We need to help an organisation choose what it wants to be excellent at, what it wants to be known for - and to make that choice without existing knowledge of what it's choosing not to do.
To do this we're going to paint some futures, flesh out what the world could look like if we got really really good at some of the things we could do. What if we majored on testing and experimentation, what if we were known for hyper-personalisation, what if we were a world leader in text analytics and smart customer service? We can't be the best at all of these, but we could pick any of them to focus on.
My closing thought is that what we've discussed in phases 7, 8 & 9, while in the context of data and analytics aren't unique to our fields. They're the same challenges faced by leaders in any part of any organisation, and maybe this is to be expected.
If your organisation has made it this far, well done. You've built a data function, you might have even done it efficiently. But in the words of Winston Churchill:
Now this is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.