In any data analytics project, the focus is on success. But before all the cool, headline-grabbing stuff can happen, you need to lay the foundations. That starts with the technology you’re going to use, because each part of your data analytics platform needs to work together to create the bigger picture.
Data Analytics Platform | Choosing your technology
There are myriad choices available when choosing your core analytics platform. Each has different features, and picking between them can be bewildering. To start, ask yourself the following:
- What type of use cases will you deliver? This will tell you the type of platform and language required to fulfil your current and future goals. Consider the long-term plan as well. Make sure you don’t invest in a data platform that cannot sustain the future plans of your organisation.
- Who are your users? The skills of your data team need to align with the platform you choose. Otherwise, hiring extra talent is an additional cost to consider. If they only know R, then a platform that only supports SQL will waste your time and resources.
- How much data do you have? If you have terabytes upon terabytes of data, you’ll need a different platform from one designed for smaller volumes. Similarly, if your use cases require a lot of streaming or real-time data, then that will impact your choice too. But make sure that whatever you choose links back to your use cases and that you’re not just picking something because it’s trendy.
- Does it meet your security needs? If you need to store a lot of highly sensitive personal data, then you’ll need a platform with top-notch security.
- Can it be automated? Eventually, some automation will be useful for your organisation. Does your chosen platform already come with some automation capabilities or will you need to invest further? And can the platform integrate with your chosen automation tool?
- Is it suitable for operational support? If you’re looking to make your data analytics platform operational, so that day-to-day business processes depend on it, you need to ensure it can perform reliably.
- Open source or proprietary? This depends on your data team’s skills and your overall resources. Open source means you’re not locked in to a specific vendor, but it can require a lot of resources to build at the start. A hybrid approach is also an option.
- What tech support do you need? Do you need 24/7 support, or can you get by fixing things yourself when something breaks? This depends on the role your analytics platform plays in your organisation. If it’s supporting a real-time recommendation engine hosted on your website, you’ll likely want rapid support if something goes wrong.
- What do existing customers say? The best way to determine if an analytics platform is right for you is to speak to its existing customers. Try to find ones in the same industry or who are using it for a similar use case. Ask them about their experience using the platform and any challenges they have encountered.
Data Analytics Platform | Components
You cannot do any analysis without data. That requires a well-thought-out ingestion process, which forms the building blocks of your analytics platform. As your analytics platform is increasingly called on, a solid ingestion process allows you to churn through the requests quickly and effectively. In other words, you can get your analytics done much faster.
Good ingestion requires the right tool. This can be created in-house, or you can buy an off-the-shelf ETL (extract, transform, load) tool. Whatever you decide, your data ingestion needs to do a couple of things:
- It must support data auditing and error handling.
- It needs to be configured simply, but still be extendable.
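As a rough illustration of those two requirements, here is a minimal Python sketch of an ingestion step. Every name in it (`IngestConfig`, `AuditRecord`, `ingest`, the sample file name and fields) is hypothetical and not taken from any particular ETL product. It assumes CSV input, counts rows for auditing, records errors per line instead of aborting the load, and keeps the configuration small while leaving a hook for extra transforms.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class IngestConfig:
    """Simple configuration: a source name and the fields every row needs."""
    source_name: str
    required_fields: list[str]
    # Extension point: extra row-level transforms can be plugged in later.
    transforms: list[Callable[[dict], dict]] = field(default_factory=list)


@dataclass
class AuditRecord:
    """What happened during one load, kept for auditing."""
    source_name: str
    rows_read: int = 0
    rows_loaded: int = 0
    errors: list[str] = field(default_factory=list)


def ingest(text: str, config: IngestConfig) -> tuple[list[dict], AuditRecord]:
    audit = AuditRecord(source_name=config.source_name)
    loaded = []
    # Data rows start on line 2 because line 1 is the CSV header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        audit.rows_read += 1
        try:
            missing = [f for f in config.required_fields if not row.get(f)]
            if missing:
                raise ValueError(f"missing required fields: {missing}")
            for transform in config.transforms:
                row = transform(row)
            loaded.append(row)
            audit.rows_loaded += 1
        except (ValueError, KeyError) as exc:
            # Record the bad row and carry on rather than failing the load.
            audit.errors.append(f"line {line_no}: {exc}")
    return loaded, audit


# Hypothetical sample data: one good row, one missing id, one bad amount.
sample = "customer_id,amount\n1,9.99\n,5.00\n3,abc\n"
config = IngestConfig(
    source_name="orders.csv",
    required_fields=["customer_id", "amount"],
    transforms=[lambda r: {**r, "amount": float(r["amount"])}],
)
rows, audit = ingest(sample, config)
print(audit)
```

A real tool would write the audit record somewhere durable and route rejected rows for review. The point here is only that auditing and error handling are part of the load itself, not an afterthought.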
Tools for monetising data
Ingesting and storing your data is all well and good, but if people can’t get to it then it’s not worth much. You must invest in a range of tools, from standard reporting to raw JDBC access for custom projects. This allows everyone, whatever their skill level, to use the analytics platform and see its value.
Speaking of which, you also need to govern and administer your platform properly. This is where management tools come in.
Getting data out of your platform
It’s not all about ensuring you ingest your data properly: you have to build methods to get it back out too. Use a consistent approach across all implementations. Usually that means either a batch method or an API.
Failure to do this means that all the interesting stuff that you’ve done on your platform won’t get shared, and it needs to be shared with other systems to be of value. Examples include providing recommendations based on customer purchases on an e-commerce site, or dynamically changing the prices of tickets according to demand. APIs are a popular method, especially if you have to do something in real time.
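To make the two delivery methods concrete, here is a small Python sketch under stated assumptions. The `RECOMMENDATIONS` data and the function names are invented for illustration. The batch method writes everything to a CSV stream, while the API method returns a JSON body for one customer. Both call the same lookup function, which is one way of keeping the approach consistent.

```python
import csv
import io
import json

# Hypothetical platform output: customer id -> recommended products.
RECOMMENDATIONS = {
    "c1": ["tent", "sleeping bag"],
    "c2": ["running shoes"],
}


def get_recommendations(customer_id: str) -> list[str]:
    """Single lookup shared by both delivery methods."""
    return RECOMMENDATIONS.get(customer_id, [])


def batch_export(out) -> int:
    """Batch method: write every customer's recommendations as CSV rows."""
    writer = csv.writer(out)
    writer.writerow(["customer_id", "product"])
    count = 0
    for customer_id in sorted(RECOMMENDATIONS):
        for product in get_recommendations(customer_id):
            writer.writerow([customer_id, product])
            count += 1
    return count


def api_response(customer_id: str) -> str:
    """API method: return one customer's recommendations as a JSON body."""
    return json.dumps({
        "customer_id": customer_id,
        "recommendations": get_recommendations(customer_id),
    })


buffer = io.StringIO()
rows_written = batch_export(buffer)  # e.g. a nightly file for another system
body = api_response("c1")            # e.g. the payload behind a web endpoint
print(rows_written, body)
```

In practice the API function would sit behind a web framework and the batch export would write to a file or queue. Those parts are left out to keep the sketch self-contained.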
The factory versus the lab
When setting up your analytics platform, it’s worth knowing about the factory and the lab environment.
In the lab, you can trial your use cases before moving them over to the factory. It’s an area where people can upload their own files to combine with your existing data or create their own tables.
This area should be self-managed and offer freedom for people to innovate. By having a lab environment, you avoid departmental or team silos, but still offer an area for each individual team to test new ideas, all whilst maintaining the same set of standards and practices.
If a use case is proven in the lab, then it can move to the factory. The lab basically allows you to invent and test things first, without spending a lot of time and resources implementing them in the more complex factory.
The factory is the place in your analytics platform where things go live when they are ready to be rolled out across the company. It comprises the raw, base and analytics layers. Think of it as the central data store for your entire analytics function.
- The raw layer is where untransformed data is kept along with its relevant metadata.
- The base layer is where data is cleaned and put together in a sensible model. It’s getting ready to be used. However, I wouldn’t stray too far from the original source data at this point. Organise it, apply certain rules, put it in models, clean it and exclude certain records.
- The analytics layer is about providing people with data in the most usable and best-performing way. It can take the form of big, wide flattened tables that join together all potentially useful information on a topic. Alternatively, it could also be a specific table structure to answer a more complicated business question.
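The three layers can be sketched with an in-memory SQLite database in Python. This is only an illustration: the table names, columns and sample rows are made up, and the cleaning rule is deliberately crude. The raw table keeps values untransformed alongside load metadata, the base table cleans and types them while staying close to the source, and the analytics table is a flattened summary ready for reporting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Raw layer: untransformed text values plus metadata about the load.
cur.execute(
    "CREATE TABLE raw_orders "
    "(order_id TEXT, customer TEXT, amount TEXT, source_file TEXT, loaded_at TEXT)"
)
cur.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?, ?)",
    [
        ("1", " alice ", "10.50", "orders_a.csv", "2024-01-01"),
        ("2", "BOB", "4.00", "orders_a.csv", "2024-01-01"),
        ("3", "alice", "not_a_number", "orders_b.csv", "2024-01-02"),
        ("4", "Alice", "5.50", "orders_b.csv", "2024-01-02"),
    ],
)

# Base layer: cleaned and typed, but still close to the source data.
# The GLOB filter is a crude illustrative rule that excludes bad amounts.
cur.execute("""
    CREATE TABLE base_orders AS
    SELECT CAST(order_id AS INTEGER) AS order_id,
           LOWER(TRIM(customer))     AS customer,
           CAST(amount AS REAL)      AS amount
    FROM raw_orders
    WHERE amount GLOB '[0-9]*.[0-9]*'
""")

# Analytics layer: a flattened, query-friendly summary per customer.
cur.execute("""
    CREATE TABLE analytics_customer_summary AS
    SELECT customer,
           COUNT(*)    AS order_count,
           SUM(amount) AS total_spend
    FROM base_orders
    GROUP BY customer
""")

summary = cur.execute(
    "SELECT customer, order_count, total_spend "
    "FROM analytics_customer_summary ORDER BY customer"
).fetchall()
print(summary)
```

Keeping the raw table untouched means the base and analytics tables can always be rebuilt if a cleaning rule turns out to be wrong.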
Data Analytics Platform | Make or break
Your data analytics platform is fundamental to getting value from your data. Don’t expect to choose a platform immediately. There are many factors to consider and your choice could make or break your data strategy. Don’t be afraid to seek advice and recommendations, or to test some different options first to see whether they are suitable for your organisation and set-up. You’ll want your analytics platform to work for you for a long time. So take your time when picking a solution.
James Lupton – CTO