The Data Team | Outline
I’ll quickly summarise the points made in this post, but read on for the full guide to building a data team.
You need to put together the following roles in your data team:
- a Head of Data, who defines a great vision and sells the team internally
- a Delivery Manager, to keep everyone on task and communication lines open
- a Solution Lead, to come up with customer-focused solutions
- a Framework Architect, to design great technical solutions that are flexible and supportable
- Software engineers, who want to own their work and who build out the frameworks
- Data engineers, who are going to deliver the data solutions to customers
There are more skill sets that go into a bigger team, but they’ll come later. You need to adapt the team to your circumstances and focus on the solution lead and framework architect to start with.
It goes without saying that defining leadership in this space is important. What I don’t want to talk about here is the kind of data leadership a Chief Data Officer represents. I’m much more interested in the on-the-ground men and women doing the delivery.
The Data Team | Head of Data
First up, we have the head of data. In my experience, this person needs to focus on three things:
- Setting the vision for the data team and the platform
- Selling that vision internally
- Unblocking their team members
Having a great vision for your ambitions is crucial; it’s your tool to both motivate and inspire your team, and it’s how you get buy-in from the rest of your organisation. The business is going to have plenty of project ideas, and if you can use sophisticated technology and smart design to achieve these, all the better.
Take the common example of targeted offers. If you’re asked to run a batch job allocating targeted offers, turn it into a real-time next-best-offer service powered by machine learning. That’s quite an extreme example, and a leap that big isn’t always feasible, but the same thinking works on a smaller scale too.
If someone has asked you to load and transform some data, also provide them with automated alerts in the event of an error and a dashboard to track data quality. It’s not about the head of data coming up with these kinds of ideas themselves; it’s about inspiring the rest of the team to do it.
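As a rough illustration of the load-and-transform example, here is a minimal Python sketch of a load step that raises an alert when too many rows fail and returns a small report a dashboard could read. Everything in it is an assumption made for illustration: the names load_with_alerts, LoadReport and send_alert, and the 95% quality threshold, are hypothetical and don’t come from any particular tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class LoadReport:
    """Hypothetical summary of one load run, suitable for feeding a dashboard."""
    rows_read: int = 0
    rows_loaded: int = 0
    rows_rejected: int = 0
    alerts: List[str] = field(default_factory=list)

    @property
    def quality_score(self) -> float:
        # Share of rows that transformed cleanly (1.0 when nothing was read).
        return self.rows_loaded / self.rows_read if self.rows_read else 1.0


def load_with_alerts(
    rows: Iterable[dict],
    transform: Callable[[dict], dict],
    send_alert: Callable[[str], None],
    min_quality: float = 0.95,  # assumed threshold, tune to your own needs
) -> Tuple[List[dict], LoadReport]:
    """Transform each row, record rejects, and alert if quality drops too low."""
    report = LoadReport()
    loaded: List[dict] = []
    for row in rows:
        report.rows_read += 1
        try:
            loaded.append(transform(row))
            report.rows_loaded += 1
        except (KeyError, ValueError) as exc:
            # Keep going, but remember why this row was rejected.
            report.rows_rejected += 1
            report.alerts.append(f"row {report.rows_read} rejected: {exc!r}")
    if report.quality_score < min_quality:
        send_alert(
            f"data quality {report.quality_score:.0%} is below "
            f"{min_quality:.0%} ({report.rows_rejected} rows rejected)"
        )
    return loaded, report


if __name__ == "__main__":
    sample = [{"amount": "10"}, {"amount": "oops"}, {"amount": "5"}]
    rows, report = load_with_alerts(
        sample,
        transform=lambda r: {"amount": float(r["amount"])},
        send_alert=print,  # stand-in for email, chat or paging
    )
    print(report)
```

In a real platform, send_alert would be whatever channel your customers already watch, and the report would be written somewhere the dashboard can query.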
Selling the team internally is often overlooked. You think because you’ve got the money and people in place you’re done. Think again. There needs to be a constant push from the head of data in all directions about what the team is doing, what they’re going to do, what they’re capable of and so on. Make sure when people think data, they think your team.
Finally, the head of data must be unblocking the developers in the team. So much time gets wasted in arguments over whether they can have an account on a source system or what the requirements for a project are. You need to make sure that the team’s time is used as efficiently as possible, and the best way to do that is to make sure they’re never stuck waiting on what they need to do their work.
The Data Team | Delivery Manager
The delivery manager is a hybrid project manager/scrum master/product owner for the team. It’s their job to make sure everyone is on task and that communications are flowing in and out of the team properly. Their key tasks are as follows:
- Manage the backlog of work for the team and allocate it out
- Manage customer relationships and communicate out to them anything that they need to know
- Lead stand-ups and sprint retrospectives
In contrast to a lot of the other roles on here, the delivery manager has the tough job of making people commit to a plan and getting work out on time. That makes it more of a governance role than a creative one, so recruiting previous project managers or scrum masters should help you pin down the right candidate.
Managing the incoming requests and priorities of items in the backlog is vital, as well as organising the various scrums and sprints – it’s a rhythm that the delivery manager needs to establish early, and not let drop. If they go on holiday for a week, it should be muscle memory for the rest of the team to scrum in their absence.
Beyond the day-to-day management side of things, they need to be a strong leader who can rally people to their cause and bring customers on side. They are going to be one of the most visible people in the team, so they need to be comfortable managing customers. A strong delivery lead can make all the difference between being a well-oiled machine that’s hitting deadlines, and a group of well-intentioned talent that gets nothing done.
The Data Team | Solution Lead
The solution lead’s job is very simple (to define at least!): own the overarching design of the platform, from data ingestion right through to data provision. Their key focus should be to:
- Understand what the customers of the platform need.
- Maintain a coherent approach for everything going on in the platform.
- Be an emissary to customers, and a cheerleader for the platform.
- Make quick decisions on how things will be done.
You may think that this person needs to be very technical, but they don’t. While a deep understanding of data is important, including being able to model and transform it, they also need to understand their customers. They need to be able to take the questions that they are being asked, and understand what is actually required. They should be able to come up with simple and elegant solutions to these questions, and communicate the solution to the rest of the team.
The Solution Lead will also be the go-to person for questions about the platform, and one of the primary representatives out to the community, so they need to have good people skills to build a rapport with everyone.
Finally, they are going to need to be able to make decisions fast so that the pace of delivery is not slowed down. The development team will have a lot of demands on this role, especially at the start, and they all need to be turned around quickly.
The Data Team | Framework Architect
The framework architect is your technical lead. They should have a deep understanding of the tools, technologies and languages you are using (more on that coming up in the series). Their key responsibilities are:
- Design the technical frameworks for ingestion, provision, logging, etc.
- Validate that work done by other engineers conforms to the spec.
- Keep an eye on the future.
This is probably going to be the role you have the most difficulty hiring for – finding a good framework architect is hard. Hiring a great one can be like climbing Everest.
They need to be an experienced software engineer with extensive experience of doing the delivery work: someone who is always looking at what the industry is doing, including which new and emerging technologies are coming along, and who has a keen interest in trying new things.
An eye for detail is also crucial; not only in how they implement their designs, but also in reviewing the rest of the work in the team, as they are one of the primary touch points before code goes into production.
The frameworks form the basis for a lot of the delivery you do and getting them right will mean you have a flexible, easily supportable and quickly deployable base from which to work. We’ll be going into a lot more detail on these in some of the later posts.
The Data Team | Platform Architect
The platform architect is someone you’re going to need to get the project started as they will put together the design of the infrastructure and technology you’re going to deliver on. As you’re likely operating in quite a lean model, they’ll probably pick up a hands-on role implementing and configuring the platform too. Their key responsibilities are:
- Design the high-level architecture for your system.
- Install and configure the platform.
- Plan for the integration of new technologies.
The platform architect needs to be thinking about how the system will look in year 1, year 2, and so on. Do you need to set up encryption from the start? How can you scale the platform over time? What do the development and test environments look like? These are all questions that the platform architect needs to answer.
In a larger team, you’re likely to want to introduce an admin who will take care of a lot of the configuration and upgrade tasks. However, this is often a luxury you can’t afford, so it’s important that your platform architect can pick up both roles to start with (and is comfortable getting hands-on).
Finally, they need to be looking ahead to the future and how the tech you’re using is developing, so as not to back you into a technical cul-de-sac which requires a complete rebuild at some point.
The Data Team | Software Engineer
These are the developers who are writing code, and you’ll scale out the number of them as the team’s work grows. While finding people experienced in the tools and technologies you are working with can be helpful, I’ve worked with some great people who’ve learnt on the job, and ended up being some of the most productive team members.
If you’ve fostered a culture of inclusivity amongst your team and hired the right sort of people, then this is a totally valid approach for some of your hires, but not all of them. Day-to-day, the software engineers are going to be implementing the frameworks designed by your architects: doing the initial builds, and then applying them to many different use cases.
Hire people who feel responsible for the work they do – the people who have an innate sense of ownership about what they are doing, and who are dedicated to making it better than passable. If you can find these people (and they do exist), you’ll always be on top of your delivery.
The Data Team | Data Engineer
Much of the above remains true of the data engineer, certainly when it comes to attitude. Instead of working on the frameworks, they’re going to be spending their time getting hands-on with the data: modelling it, testing it, transforming it, everything you’d expect.
As such, it’s important that they have a strong background in working with data. Similar to the software engineer, you can train people up in SQL and other techniques if you’ve got the right people and culture in your team, but you might want a couple of experts to begin with.
Much like the solution lead, they need to have a strong focus on what the customer wants, and be able to work closely with them to define solutions.
The Data Team | Everyone else
There are quite a few roles I haven’t mentioned: operations, administrators, testers, and business analysts. As your team grows, these are the kinds of people that you are going to want to bring in. Testers, in particular, are a role you should try to prioritise as soon as you can.
In many organisations, the administration is done by other infra teams, so it’s not something you need to worry about. That being said, my preference has always been to own as much as possible of your environment, development, and direction within the team, as it helps remove blockers and keeps you agile.
Rely on external support as little as possible to get things done – you don’t want to get bogged down in someone else’s poor process or planning.
Your solution lead and data engineers should be able to double as business analysts for what you need here. Likewise, you’ll find that projects often come to you for part of their delivery with their own BAs already in place, so, in my opinion, there are more important roles to spend this headcount on.
The Data Team | Adapting to your circumstances
Depending on where you are in your journey, how much budget and headcount you have, and how much work there is, you’re going to have to customise and adapt what’s outlined here to suit your ambitions.
While working with M&S, I started with a four-man team that had grown to thirty strong by the time I left. If you’re a small team, you need to focus on the solution lead and framework architect roles. These two are going to double up as developers: one focused on software and one focused on data. The head of the team can fill in for the delivery lead to start.
As you expand, you need to balance the software engineers and data engineers. In the long term, you’ll likely want more data engineers overall, though this will be driven somewhat by the types of use cases that you’ve decided to tackle.
When work hits a critical mass, you must bring in your delivery lead. By the time you’ve got 4 or 5 development resources, you should be aiming to have this role in place. From there, you just scale with demand.
Contractor or consultancy resources are a good way to deal with short-term demand or tackle a particular technical challenge, but you want to focus on owning as much of the design and knowledge in-house as you can.
This article is the first in a series of posts on Building an Analytics Platform by Cynozure CTO James Lupton; make sure to read up on the other four articles for more insight.