A Better Approach to Data Science Helps Paytronix Offer Better Guest Engagement

When big restaurant brands want to boost customer loyalty, they rely on a guest engagement platform powered by Coalesce

Company: Paytronix
HQ: Newton, MA
Industry: Restaurant & Retail marketing software
Employees: 300
Stack: Coalesce, Snowflake
Top Results:
1 month for 2 new employees to complete transformation project that entire team had previously worked on for 6 months with little progress
50% reduction in costs by moving last-mile data transformations to Coalesce
2-3 days to build data pipelines rather than weeks

“Snowflake and Coalesce have allowed us to get to real-time modeling. We are now able to execute data science projects on a drastically different scale and our whole business model, in terms of what we are able to produce and the value we are able to deliver, has shifted.”

Jesse Marshall
Director of Data Science, Paytronix

Taking a “Moneyball” approach to data science

Challenges

Team was unable to take meaningful action on siloed data coming from many sources
Patchwork of tools made it difficult to build pipelines quickly and easily
No ability to truly experiment (fail often and fail fast), which is necessary for data science

As the Director of Data Science at Paytronix, Jesse Marshall leads a team of seven, which includes two data scientists and five data engineers. The team works closely with the larger strategy and analytics (S&A) team, which provides clients with insights they need to truly engage with guests. That’s why, explains Marshall, his own team’s No. 1 priority is to make data actionable for other departments: “In this day and age, there’s no shortage of data—but to have data and to have data that is actionable are two very different things.”

Marshall’s team faced the challenge of collecting, organizing, and deriving insight from data that came from a multitude of sources, ran on multiple databases, and arrived in disparate formats. “We see data in so many different forms so we have to be flexible with how we can ingest the data,” he says. “All that data comes into Snowflake, and if data is not actionable, it’s not very valuable.” According to Marshall, getting data into Snowflake was the easy part, since Snowflake is so flexible with regard to data types. But the team still had to work out how to make that data actionable. “When we’re doing an ETL, how do we make it so that it’s not a weeks-long project to get the data into the different departments’ hands?”


Back then, the company was using a mix of handwritten Scala and PySpark jobs for data transformation. This was a great structure for analytics at the time, but it became clear that it could not keep up with the growing demands of the business. Marshall wanted to get ahead of the game and be ready for increased sales and customer demand. The increasing scale was putting pressure on the platform, and a lot of time was dedicated to maintenance and break-fix support. In addition, the technology had such a steep learning curve that only a small number of people knew enough to work on it.

“I was frustrated with how long it took from requesting a certain table or a pipeline to it getting built. I wanted to completely rethink that process,” Marshall says. “Every time you had a change, it was like starting from scratch with the pipeline again. Say you had four big projects a year, and those four projects all had to be good ideas that you fully delivered on. If one of those four projects didn’t work out for whatever reason, it was a really big hit to your overall contribution.”

Marshall knew this type of approach didn’t work well when it came to data science projects, which usually began with just an idea and a rough set of requirements. “When you start a [data science] project, you never start with 100% clear requirements,” he explains. “As the idea becomes clear, as you learn things and train the models, you need to make tweaks to the pipeline to add some features and take some away. You’re trying a lot of things; some will work and some won’t.”

This was a completely different approach from the one the company took to maintaining its core business platform, which needed to work without interruption and left no room for experimentation. But Marshall believed the data science side of the business should operate more like R&D, and he wanted to change the dynamic completely so that his team could come up with a pipeline or an idea with little effort, get it to a proof-of-concept phase right away, and then test it quickly. Or, as he puts it, he wanted to take a “Moneyball” approach: “With the data team, instead of say four ideas (and these numbers are arbitrary), I wanted it to be five times that, so say 20 ideas. And instead of a 25% failure rate, I wanted it to be 50% or 75%. I wanted the team to try a lot of things and fail very quickly.”
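The arithmetic behind that trade-off can be sketched in a few lines of Python. This is only an illustration using the arbitrary numbers from the quote; the function name is hypothetical, and it assumes every idea is equally valuable and that a single flat failure rate applies to all of them.

```python
def expected_successes(ideas: int, failure_rate: float) -> float:
    """Expected number of ideas that work out, given a flat failure rate."""
    return ideas * (1 - failure_rate)


# Scenarios taken from the quote above (the numbers are illustrative only).
scenarios = {
    "4 ideas, 25% fail": expected_successes(4, 0.25),
    "20 ideas, 50% fail": expected_successes(20, 0.50),
    "20 ideas, 75% fail": expected_successes(20, 0.75),
}

for label, wins in scenarios.items():
    print(f"{label}: about {wins:.0f} successful ideas")
```

Under these simplified assumptions, 20 quick attempts with a 75% failure rate still yield more expected successes (5) than four carefully chosen projects with a 25% failure rate (3), and no single failure carries much weight.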

Gaining real-time insight into customers

Solution

All data in one place with Snowflake
Faster and smoother time to insights with HVR/Fivetran for ingestion and Coalesce for data transformation
Executing on data science/AI initiatives with near real-time data and the ability to iterate frequently and quickly with Snowflake Snowpark

Marshall began looking for a solution that would enable his data team to operate in the way he envisioned. An industry colleague pointed him toward Coalesce as a good fit for what he was trying to achieve, and he was quickly sold on the platform’s ease of use and flexibility. “Anyone on our team could use it,” he says. “It was really about democratizing the data and the ETL process—opening it up to everyone and not keeping it for the select few.”

The team had recently begun using Snowflake Snowpark with the goal of eventually replacing the disparate systems the team used for their data science projects. Snowpark enables data scientists to code in languages other than SQL; they don’t have to take data out of Snowflake to run, for example, Python scripts, because they can run them directly where the data lives in Snowflake. “We are bringing that data in near real time over to Snowflake (via Fivetran HVR), and then Apache Airflow is triggering our transformations in Coalesce. Then we use Airflow to trigger the models to run in Snowpark,” says Marshall. Because Coalesce enables faster, automated data transformations and Snowpark runs the models where the data lives, Paytronix can now do real-time predictive modeling at scale.
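The ordering Marshall describes (ingest, then transform, then model) can be sketched as a small dependency graph. The sketch below uses only Python's standard library rather than Airflow itself, and the step names are hypothetical labels, not real Airflow, Coalesce, or Snowpark identifiers; it shows the sequencing idea, not Paytronix's actual orchestration code.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
# Step names are hypothetical labels chosen for this illustration.
PIPELINE = {
    "replicate_to_snowflake": set(),  # ingestion (Fivetran HVR in the article)
    "run_coalesce_transformations": {"replicate_to_snowflake"},
    "run_snowpark_models": {"run_coalesce_transformations"},
}


def run_order(pipeline):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


def run_pipeline(pipeline, runner):
    """Call runner(step) for each step, in dependency order."""
    for step in run_order(pipeline):
        runner(step)


# Stand-in runner: a real orchestrator would trigger the external job here.
run_pipeline(PIPELINE, lambda step: print(f"running {step}"))
```

An orchestrator such as Airflow expresses the same idea with tasks and dependencies, adding scheduling, retries, and monitoring on top.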

One way this benefits the business is that Paytronix can now offer its clients real-time information about their customers’ activity. Without easy access to this data, clients would not be able to offer a seamless customer experience and might lose customer trust. “Say you visited Peet’s this morning at 9 a.m. and then you received an email from them at 9:30 a.m. saying, ‘Hey, we haven’t seen you in a while. Come back in!’ You would think that Peet’s doesn’t know you at all because you were just there 30 minutes ago,” explains Marshall. But having access to real-time data allows someone on the Peet’s marketing team to pull the most up-to-date information about their customers when running a campaign.
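A minimal sketch of why data freshness matters in that scenario follows. The function name, the 30-day lapse threshold, and the dates are all assumptions made for illustration; the article does not describe how Paytronix actually implements campaign eligibility.

```python
from datetime import datetime, timedelta


def should_send_winback(last_visit: datetime, now: datetime,
                        lapse_after: timedelta = timedelta(days=30)) -> bool:
    """Send a "we haven't seen you" email only if the guest has truly lapsed."""
    return now - last_visit >= lapse_after


now = datetime(2024, 1, 15, 9, 30)  # hypothetical campaign send time

# Stale batch data: the last visit on record is weeks old.
stale_last_visit = datetime(2023, 12, 1, 8, 0)
# Fresh data: this morning's 9 a.m. visit has already arrived.
fresh_last_visit = datetime(2024, 1, 15, 9, 0)

print(should_send_winback(stale_last_visit, now))  # True
print(should_send_winback(fresh_last_visit, now))  # False
```

The eligibility rule is identical in both calls; only the freshness of the last-visit timestamp changes the outcome, which is the point of the Peet’s example.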

“Having real-time access to the latest purchase information, we are able to predict the number of visits and menu items this loyalty member is likely to make in the future,” he says. “This allows the marketing team to recognize their visit, and at the same time invite them to a visit challenge that is individually tailored using their menu preferences. This tailored 1:1 messaging creates a strong personal connection with the brand, and we see a much higher overall level of engagement.”

Building a foundation for the future

Results

Two new team members were able to complete a high-profile transformation project in one month, whereas the entire team had previously spent six months on it without much progress
Instead of just a few people being able to understand the code, the entire S&A group can understand what’s going on behind the scenes and have a single source of truth
Team now able to focus on building new features and predictive models for the business

Before adopting Coalesce, Marshall’s team had been working to convert PySpark scripts that were run on EMR to PySpark scripts that could be run in Snowpark, but the project dragged on for six months and they had barely made any headway. “They are complex transformations, and it’s extremely hard to test the old way and to validate data,” he explains. But now, armed with a transformation tool that everyone on his team could get up to speed on quickly, regardless of skill set, Marshall decided to take a new approach. “I started using Coalesce with our two newest team members,” he says. “In the month of December, we took our most high-profile transformations, and I had those two do about 90% of the work. This was a massive project that we got done in one month, largely with two people.”

Marshall was struck by the enormous boost in productivity that this new approach offered his team. “That’s been an outstanding success in my mind—the time saved,” he says. “We’re replacing a process that has been in place since 2014, so that’s a huge win in my opinion. Being able to write your transformations, do your logic, and then run right at the node level is incredible.”

But it wasn’t just Marshall’s smaller team that was able to directly benefit from adopting Coalesce. The analysts on the larger Strategy & Analytics team, frustrated by the time it originally took the overworked engineering team to build pipelines, had gotten in the habit of using the company’s BI tool, which enabled them to create persistent derived tables, as their own ETL tool. But over the years this had led to a lot of confusion around metrics because there were many sources of truth. With Coalesce now in place, says Marshall, “We’ve shortened the length of development time and the length of changes, so you’re not seeing analysts create ETLs, and we’re going back to one source of truth for things.” Because the analysts can understand the code and dive deeply into it to see the data lineage, they have more trust in the data.

For Marshall, all this progress has helped get his team to a place where the important work can now truly begin, and his team can offer real business value to the larger company: “In my mind, the really exciting part is the next phase,” he says. “And that’s building new IP, new features, and new predictive models to help our clients offer the best possible experience to each and every one of their customers.”
