The Mistake You are Making with Your Data Team

One of the biggest mistakes that people are making with their data team is a conceptual issue: expecting the right thing from the wrong people.

What I see over and over again inside companies is very smart, very helpful employees spending hours of every day and every week struggling to wrangle messy data.

They are trying to be data driven, so kudos to them there.

But the issue is that these companies are trying to make their business leaders and data scientists function as data engineers, and data engineers function as analysts.

This mismatch of job roles and responsibilities makes it difficult for a good team to function well. There are lots of job titles and roles in the data world, but let’s boil them down to 3 key players:

  1. Data engineer
  2. Data scientist
  3. Analyst

Know Your Data Role

A data engineer is the person who handles the gathering, cleaning, structuring, and delivery of your data. This is the person who will go out and extract data from databases or enterprise software. They care about APIs and webhooks, database architecture, data pipelines, munging, and generally getting the right data from point A to point B.
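
To make that concrete, here is a minimal sketch of the extract, clean, and deliver steps in Python. Everything in it is an assumption for illustration: the sample export, the column names, and the function names are hypothetical, and a real pipeline would pull from a database or API rather than an inline string.

```python
import csv
import io

# Hypothetical raw export: inconsistent casing, stray whitespace, a blank amount.
RAW_EXPORT = """customer,region,amount
 Acme Corp ,north,1200.50
acme corp,North,
Globex,SOUTH,300
"""

def extract(raw_text):
    """Read rows out of a CSV export (a stand-in for a database or API pull)."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def clean(rows):
    """Standardize names and regions, and drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records rather than guess a value
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "region": row["region"].strip().lower(),
            "amount": float(amount),
        })
    return cleaned

def deliver(rows):
    """Total amounts by region: a tidy shape an analyst could start from."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

region_totals = deliver(clean(extract(RAW_EXPORT)))
print(region_totals)
```

The point of the sketch is the division of labor: once the cleaning rules live in one place, the scientist and the analyst can start from the tidy output instead of repeating the cleanup by hand.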

Data scientists are the people building and using advanced analytics tools such as Machine Learning, Artificial Intelligence, or Natural Language Processing. They are less concerned with cleaning the data and more concerned with building advanced tools to analyze data sets that are often very large. They are part mathematician, part software developer, and part analyst. Their goal is to uncover deep insights inside massive, often unstructured data sets, insights that are nearly impossible to find manually.

Finally, the analyst is the person who identifies areas in the organization that need analysis, thinks through the questions they need to answer, and works with analytics and Business Intelligence tools to report on their data sets. Again, these people are less concerned with how the data comes to them, and more concerned with analyzing the data to influence business decisions.

What often happens in companies, large and small, is that these roles get blended with one another, or with other roles in the business. Sometimes this is done out of necessity, because the organization doesn’t have a robust data team or is waiting on an already overloaded IT team to deliver. But in the long run, having the wrong people doing the data work makes the team both less efficient and less effective.

IRL Data Problems

For instance, it is very common for data scientists to spend up to 80% of their time building the data pipelines and cleaning the data they need for their projects. These people have likely received some training on database infrastructure and management as well as web development, but that is not what they are being paid to do at the end of the day, and they are not the most efficient in that realm either.

Likewise, businesses both large and small ask their senior leaders to play the role of both the analyst and the engineer. These people are going to several software platforms on a daily basis to export a CSV or Excel spreadsheet so they can start crunching numbers for a few hours. They work hard at correcting errors and standardizing the data, but at the end of the day they have people to manage and mission-critical decisions to make. They need the data, but their time and effort are more impactful on tasks other than data wrangling. You can take a look at example case studies here.

If you have engineers on your team, they are often given the difficult task of being the one-stop data shop. You want to lead with data, so the obvious answer is to hire a developer and have them do the data stuff for you and your team. The issue here is that they are often not well suited to be the analysts or scientists. They handle the flow of data, but do not necessarily know which parts of your business are the most important and may not be the right people to guide your business decisions. The business need will dictate how the data is structured and managed, but the engineer doesn’t have the view from the top that the analyst should have.

Take a few moments and think through the day-to-day workflow of the people managing your data. Are they playing roles that could be reassigned elsewhere? These are hidden opportunities for increased productivity and effectiveness with your data.

If you or someone on your team needs the help of data engineers, contact RapidBI. We offer fractional data services so your scientists can get back to creating the next world-changing innovations, and your analysts can get back to growing your company.

Originally published on 2021-02-05 by Royce Hall
