The company I work for, Teradata, believes in the power of data. We believe that data, in the right competent hands with the right set of tools, and managed by a proper process, will give tremendous advantages to the organisation which owns and analyses it. That is the reason why Teradata has products, services, methodologies, and partners that revolve around data and analytics.
One of the services / methodologies that Teradata has is RACE: Rapid Analytics Consulting Engagement.
RACE is a short-term analytics engagement whose sole purpose is to help answer questions an organisation can’t answer on its own. The inability may stem from missing or insufficient tools, skills, resources, data, or processes, or from a combination of those.
A RACE typically lasts 5-6 weeks. It is never a long-term, multi-year engagement where the activities are mostly around setting up, configuring, or integrating data processes, tools, or platforms for production-level use. It is not a “hackathon” either; Teradata is also there to help the customer articulate their problems and turn them into use cases to solve.
Technically speaking, a RACE consists of 3 phases that are executed sequentially: Align-Create-Evaluate. In my experience, it always starts with aligning the Teradata team with the business and technical teams from the customer. It ends with the delivery and discussion of the insights and the recommended actions based on them.
Learn more about RACE here: https://www.teradata.com/Press-Releases/2016/Teradata-Debuts-Agile-Analytics-Business-Cons
I personally find it very interesting to participate in a RACE. Not only do I know that what I do will potentially be very valuable to the customer, but I also know that I will experience and learn many new interesting things about the business itself. In the last 5 years, I have been very fortunate to participate in a few RACEs, and also in some engagements that were not exactly a RACE per se, like a two-week analytics engagement or some pre-sales activities.
Once I was part of a small team that discovered a fraud network in a major government bank in Southeast Asia. Another time, my Teradata colleagues in Western Europe helped a pharmaceutical company find potentially fraudulent activities committed by their employees. Recently, I was also part of a really brilliant RACE team here in the Nordics. We helped a multinational retailer answer these questions,
“If a product is not present on our store shelves, how much do we lose from not being able to sell it? And what can we do about it?”
In this article, I’ll be reflecting on past analytics engagements in general. Most of the analytics engagements I participated in were either great or successful, but some of them were just okay. And, no, I was not a data scientist in those engagements. I was always a data engineer, although many times I went and tried to analyze the data as well.
Reflecting on My Experience
The last RACE engagement that I participated in really made me think,
“What do the successful analytics engagements have in common? How are they different from the ‘just okay’ ones?”
So then I sat down, took a sip of my black coffee, and started jotting down notes from my memory. I started by tracing back from the results, and found these commonalities at the end of the successful analytics engagements:
- It uncovers insights that the customer didn’t know before.
- It solves the problem precisely. It hits the mark and sometimes goes beyond.
- It produces actionable and valuable insights; it is possible to show how much, in dollar terms, they gain or lose.
- It makes the customer come back for more. 🙂
Then I continued by asking myself,
“How did we get there? What went right? What was it in the process that led to a great outcome? We need to do more of that. What activities or situations were counterproductive? We need to have less of that.”
When I started thinking about the answer by reflecting on my experience, I saw that many of the things that RACE lays down made sense. They also generalize well enough to be applied to many forms of analytics engagements. The first thing I noticed was the division of roles in a RACE team. The second was how those roles went through the phases in a RACE.
I will discuss the phases in a RACE in detail in another article. For now, let’s take a quick tour of the four roles and how each of them contributes to a RACE’s success.
The Roles in a RACE
A RACE team consists of four different roles: industry consultant, data scientist, data engineer, and project manager. There’s also the account team that makes sure the RACE team gets the support it needs from Teradata and the customer. I don’t really have good visibility into what the account team does, but they deserve recognition when a RACE is successful, too.
The first role I need to point out, when I’m talking about successful RACEs, is the industry consultant. In my experience, having a capable industry consultant who is also a team player greatly increases the probability of successful delivery.
Why? Well, they ensure that the team is facing the right problem, properly. They clearly define the boundary of what the problem is and what it is not. They put the first pieces in the puzzle and also describe the bigger picture. The success of a RACE is first determined by whether the team has the correct big picture and a precise starting point.
For that to be the case, industry consultants need to have wide and deep knowledge of, and experience in, the industry. The knowledge lets them think and discuss in the context and language that resonate best with the customer. It also helps them ask precise and constructive questions that lead to well-defined and well-articulated problems and solutions.
In a RACE, the industry consultants usually get involved in every step of the process. They even engage the customer early, when the RACE hasn’t officially started. We’ll discuss this in my next article.
In successful RACEs, the industry consultants work closely with the data scientists, in a mutually supportive manner. During the Align phase of a RACE, industry consultants and data scientists slice the problem into manageable pieces. They define the context, boundary, and directions. This is done so that the data scientists know where to spend their energy and time effectively during the Create phase. In the Evaluate phase, industry consultants and data scientists present the deliverables and discuss the insights.
Let’s move on to the data scientists. The effective data scientists I’ve worked with are the ones who can reconstruct a problem in a scientific way. From a given use case, they come up with relevant hypotheses and try to test them one by one. For that to be the case, scientific thinking, statistics, and coding are the minimum requirements.
Not only that, data scientists are the center piece of the puzzle, the one that connects to many other pieces. I did mention that data scientists need to work closely with industry consultants, but that is only one of the pieces.
They also need to work closely with data engineers to make sure the team has the data it needs. Together with the data engineers, they work with the customer’s technical team to understand data caveats. Last but not least, data scientists work with the commercial / business teams, together with industry consultants, to be able to translate and represent commercial processes and business rules with data.
For this to be possible, apart from statistics, coding, and some business understanding, it is of great value if a data scientist is able to articulate and present problems and solutions both to highly technical people and to business people with little technical understanding.
I mentioned that data scientists and industry consultants need to work closely together. What if the roles don’t work well together? Well, I’ve been in an engagement in which the industry consultants and data scientists were not really connected. They even clashed sometimes, and didn’t resolve their discussions or make mutually agreed decisions in a timely fashion. The result was suboptimal.
It wasn’t a pleasant experience.
I’ve also been in some other analytics engagements where we didn’t have an industry consultant. In these kinds of situations, it becomes a team effort with the data scientists under the spotlight. In some engagements, especially when the team members didn’t have deep experience with the industry and the time that our customer invested in the engagement was minimal, it felt like we were running in circles. We came up with ideas and spent time pursuing them, but nothing really hit the mark.
In other analytics engagements, we just killed it. I think the deciding factor could have been the team’s familiarity with the industry, or that the data scientists I worked with were just extraordinary. It was as if we were in the dark but the data scientists just knew where to run. They could see the way. I can’t fully explain it, but I could feel it since we worked closely together. Maybe it all came from their experience, or it was just their remarkable intellectual capacity.
In short, I’m pretty impressed by the brilliant data scientists here at Teradata.
So that was it about industry consultants and data scientists. You may now think,
“What about you, Febiyan? What do data engineers contribute to the success of analytics engagements?”
Well, what I need to do is simply to press a button and then everything just works. Then I just need to chill, drink my coffee, smoke some cigarettes, trim my nails, and everybody is happy.
Just kidding. And, no, I don’t smoke. So, what about data engineers? What kind of peculiar creatures are they? How do they work in a RACE?
Note: I may be very biased here. It is also hard for me to articulate how I, a data engineer, contributed to less successful RACEs. That’s my blind spot. I wish someone could point it out to me. However, I’m going to try my best to stay objective, based on the feedback I received.
Data engineers do most of the dirty technical work. In RACE engagements, where the focus is more on coming up with discoveries than on building the perfect platform, they need to be able to work fast and make precise educated guesses. They need to be able to respond well to the chaotic situations that are likely to occur in every engagement.
Early in the engagement, they gather the technical requirements, mostly about the data and what the RACE team is going to do with it. They need to know the size of the data sets, how they are formatted and structured, how they are related, and all that.
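As an illustration only, the questions above (how big, how structured, how related) could be answered for small samples with a few lines of Python. This is a minimal sketch and not the tooling used in any RACE: the helper names `read_rows`, `profile`, and `key_coverage`, the column names, and the sample data are all hypothetical, and everything is read into memory, which would not work for the full data sets.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def profile(rows):
    """Report the row count and column names of a parsed data set."""
    columns = list(rows[0].keys()) if rows else []
    return {"rows": len(rows), "columns": columns}


def key_coverage(child_rows, parent_rows, key):
    """Fraction of child rows whose key value also appears in the parent set."""
    parent_keys = {row[key] for row in parent_rows}
    if not child_rows:
        return 0.0
    matched = sum(1 for row in child_rows if row[key] in parent_keys)
    return matched / len(child_rows)


# Hypothetical samples: a product master and a sales extract.
PRODUCTS = """product_id,name
P1,Milk
P2,Bread
"""

SALES = """store_id,product_id,units
S1,P1,3
S1,P2,1
S2,P9,4
S2,P1,2
"""

products = read_rows(PRODUCTS)
sales = read_rows(SALES)

print(profile(sales))
print(key_coverage(sales, products, "product_id"))
```

A coverage value below 1.0 on a sample like this is the kind of finding worth raising with the customer’s data owner early, before the full load starts.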
In most of my engagements, I deal with data sets ranging from 100 GB to a few terabytes. The size is relatively small compared to the ones I saw at some Teradata Datawarehouse / Hadoop customers, but it may not be small enough to fit on a single machine, especially if we need to combine multiple data sets and run various algorithms. In that case, we at Teradata use our own tool: Teradata Aster. It crunches terabytes of data like a champ.
Note: Teradata Aster capabilities will be merged into Teradata Analytics Platform in the near future.
With the requirements in hand, they estimate the capacity and capability needed to process that data, and select the tools that fit the tasks best and are the easiest to set up. Then they set them up. In my last engagement, I had a Teradata Solution Architect take care of this.
Once the analytics environment is set up, the data engineers proceed with getting the data sets, loading them to the target machine, and making sure the data scientists can consume them right away. For that to be smooth, the data engineers need to work closely with the customer’s data owner.
Well, from my experience, the data loading and preprocessing phases are usually rocky. The data sets may have missing required values or the wrong time range, or they may be slow to extract or unzip. There are cases when the given data sets were incorrect, too. That’s the challenge. The data engineers need to do the data loading right, and fast. Redoing the tasks can be costly, especially when dealing with terabytes of data.
The important thing to note here is that they will need to be able to achieve all of this within a tight timeframe. There will be dark magic here and there. There will be patches and duct tape. Their offices may also have depleted their yearly stock of coffee at that point.
Once the data loading is done, they will share the technical know-how: how to use the platform, how to access the data, what not to do, and all the other things that need to be told. The data engineers also need to make sure all tools are green until the end of the RACE. They also need to proactively support the analysis tasks in the Create phase and promptly respond to requests for technical assistance.
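To make two of those problems concrete (missing required values and a wrong time range), here is a minimal Python sketch of a check that could be run on a sample before committing to a full load. It is an illustration under assumptions, not the process followed in any engagement: the function name `check_rows`, the column names, the `YYYY-MM-DD` date format, and the sample data are all hypothetical.

```python
import csv
import io
from datetime import date, datetime


def check_rows(csv_text, required, date_column, start, end):
    """Count rows with missing required values or dates outside [start, end]."""
    report = {"rows": 0, "missing_required": 0, "out_of_range": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        report["rows"] += 1
        # A required value counts as missing if it is absent or blank.
        if any(not (row.get(col) or "").strip() for col in required):
            report["missing_required"] += 1
        raw = (row.get(date_column) or "").strip()
        if not raw:
            # Blank dates are skipped here; they are counted as missing
            # only if the date column is listed in `required`.
            continue
        try:
            day = datetime.strptime(raw, "%Y-%m-%d").date()
        except ValueError:
            # Assumption: unparseable dates are treated as out of range.
            report["out_of_range"] += 1
            continue
        if not (start <= day <= end):
            report["out_of_range"] += 1
    return report


# Hypothetical sample of a sales extract that should cover January 2018.
SAMPLE = """store_id,product_id,sale_date,units
S1,P1,2018-01-05,3
S1,,2018-01-06,1
S2,P2,2017-12-31,4
S2,P3,2018-02-01,
"""

report = check_rows(
    SAMPLE,
    required=["store_id", "product_id", "sale_date"],
    date_column="sale_date",
    start=date(2018, 1, 1),
    end=date(2018, 1, 31),
)
print(report)
```

Running a check like this on a small extract is cheap compared to reloading terabytes after discovering the same issue later.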
Last but not least, let’s talk about project managers.
Project managers orchestrate the team members and make sure they are progressing towards the goal within the agreed timeframe. They know what every team member is doing. They need to shield the team members from unnecessary problems. They take the damn pain of dealing with people when things don’t go as planned.
In every RACE, no, in every project, there will be at least one surprising problem for the team members to solve. The supplied data may be incomplete, corrupted, or otherwise unusable. Sometimes the loading is very slow. Sometimes it is a miscommunication among the team members. Sometimes the customers try to sneak in a few things that are not on the agenda.
In some cases, when problems are not handled properly and in a timely manner, a project may get chaotic really fast. It can be overwhelming when the situation becomes chaotic.
What do people do when they are overwhelmed? They shorten their thinking timeframe. When you’re confronting a dragon, for example, you probably won’t be thinking about what you will cook tomorrow. When you’re very sick, all your senses focus on coping with the pain in your body in that very second. The distant future suddenly becomes not so visible, because the now is overwhelming.
Also, when that happens, people are prone to getting irritated. And, maybe as a consequence, people make wrong decisions. In project situations, that is less than desirable. A project manager brings order to that chaos.
The project manager in my last engagement did a really great job. She did exactly what I wrote in the first paragraph about project managers. She also made sure that I was progressing well and doing well. I was very comfortable discussing the problems I was facing with her, even when the problems put the project schedule at risk. I knew she would take the uncomfortable pain and help me solve the problem instead of making it worse. I knew she did the same with the other team members.
We were a team. A successful team at that.
In other engagements where I had no project manager, the experience was very draining. In the engagement where the data scientists and industry consultants didn’t really connect, it would’ve been nice if we had had a project manager. I tried to take on that role but I was ineffective. In another engagement, it was not unusual to be interrupted by customer requests. Project managers make sure those risks are managed.
That was it about the roles in analytics engagements. I hope I discussed them clearly enough and that you can learn from it. In the next articles, hopefully in the coming month, I will discuss the processes or phases in a RACE. I will also touch upon the importance of customer involvement.
What I wrote may or may not reflect my employer’s view. I take full responsibility for what I wrote. I would also like to personally thank my colleagues who provided me with valuable input and helped me make this article much better: Nadine Anouti and Henrik Atteryd. You guys are awesome.
As this is a highly subjective and probably biased piece, I also invite you to leave some comments about what you think or what you experienced — what is beneficial or detrimental to the success of analytics engagements?