Group 1001 Transforms ‘Old School’ Insurance Industry with Data

A data stack built with Coalesce helps demystify insurance and annuities for customers

Group 1001
Zionsville, IN
Top Results:
Nearly 4 terabytes (TB) of uncompressed data migrated to Snowflake in less than a year with a team of 5
10x increase in data engineering team productivity
2 days to go from “idea to insight” compared to 3 months with previous solution

“We chose Coalesce because of the fact that using the automation, using the templates, using the very streamlined way of development makes the iteration cycles much faster. It’s very easy to troubleshoot, to apply changes and refactor the models as a result of Coalesce’s strong lineage tracking capabilities.”

Gu Xie
Head of Data Engineering & Data Analytics, Group 1001

In an industry notorious for being slow to modernize, Gu Xie is on a mission to innovate, fast. Gu leads a team of four at insurance holding company Group 1001, focusing on the annuities side of the business.

“The insurance industry is old school, on-prem, slow to react,” says Gu. “Because Group 1001 is a technology-first, forward-thinking organization, we’re trying to foster innovation and bring new modernization approaches to the insurance industry to fundamentally transform it.”

That transformation is off to a trailblazing start. By modernizing Group 1001’s data stack with Snowflake, Fivetran, and Coalesce, Gu says his team’s productivity is up 10x: the team can now go from idea to insight in just two days, compared with as long as three months with the company’s legacy solutions.

Limited visibility


Challenging to get fundamental data pipelines up and running reliably
With no data modeling in place, no way to create a unified view of all assets across the business
No insight into the sales funnel for the sale of new annuity policies to customers

When Gu first joined the company a year ago, one of the biggest challenges he encountered was getting fundamental data pipelines up and running quickly. “We had long-running cycle issues, we had cycles being delayed and data not delivered to the business. Or in some cases where the cycles did finish, there were silent failures with the processes, resulting in the business reporting on data that was incomplete or partially complete—which is, in my opinion, the worst outcome.”

The original data stack was neither efficient nor reliable. “We had Airflow processes, we would take data using Python to extract them from the APIs, land them to a Postgres data store, and then build our Power BI reports off of that,” Gu recalls. “The problem was that it often took five hours to load those processes, and there was a fundamental lack of version control across each of those steps. As a result, whenever issues occurred, and they inevitably would, it took seven hours to repair, rerun, and correct.” This essentially ate up the business user’s entire workday.
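To make the “silent failure” problem concrete, here is a minimal Python sketch of a completeness check that stops a pipeline step instead of letting a partial load reach the reports. It is an illustration only, not Group 1001’s implementation: the `LoadResult` fields, the `assert_complete` helper, and the row counts are all hypothetical.

```python
from dataclasses import dataclass


class IncompleteLoadError(RuntimeError):
    """Raised when a load delivers fewer rows than the source reported."""


@dataclass
class LoadResult:
    table: str
    source_rows: int  # rows the source system reported (hypothetical field)
    loaded_rows: int  # rows that actually landed in the data store


def assert_complete(result: LoadResult, tolerance: float = 0.0) -> None:
    """Fail loudly if a load is missing more than `tolerance` of its rows."""
    if result.source_rows < 0 or result.loaded_rows < 0:
        raise ValueError("row counts must be non-negative")
    allowed_missing = int(result.source_rows * tolerance)
    missing = result.source_rows - result.loaded_rows
    if missing > allowed_missing:
        raise IncompleteLoadError(
            f"{result.table}: loaded {result.loaded_rows} of "
            f"{result.source_rows} rows ({missing} missing)"
        )


if __name__ == "__main__":
    # A complete load passes quietly.
    assert_complete(LoadResult("policies", source_rows=100, loaded_rows=100))
    # A partial load raises instead of finishing "successfully".
    try:
        assert_complete(LoadResult("policies", source_rows=100, loaded_rows=60))
    except IncompleteLoadError as err:
        print(f"load rejected: {err}")
```

The design choice is simply to turn an incomplete load into an error, so a cycle that “finished” cannot feed partial data into a report without anyone noticing.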

Gu and his team work closely on the actual policy management system for the annuities business across all Group 1001 brands. His main initial goal was to create a unified, top-down view showing the total assets under management by the annuities business in the Group 1001 organization, broken out by company, by product line, and by customer. But this proved impossible given that there was no data modeling in place. “All reports were being run by ad hoc views that were being vetted for each of these data sets,” he explains. “There was no modeling to create a standard dimensional effect layer to report all of the assets within the organization.”
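The kind of top-down rollup Gu describes can be sketched in a few lines of Python. This is a simplified illustration under assumed names, not the actual data model: the `PolicyBalance` record, the `total_assets` helper, and the sample companies, products, and balances are all hypothetical.

```python
from collections import defaultdict
from decimal import Decimal
from typing import NamedTuple


class PolicyBalance(NamedTuple):
    # One row of a hypothetical "policy balance" table.
    company: str
    product_line: str
    customer_id: str
    balance: Decimal


def total_assets(rows, *dimensions):
    """Sum balances grouped by the named dimensions (none = grand total)."""
    totals = defaultdict(Decimal)  # Decimal() starts each group at 0
    for row in rows:
        key = tuple(getattr(row, d) for d in dimensions)
        totals[key] += row.balance
    return dict(totals)


# Made-up sample rows for illustration only.
SAMPLE = [
    PolicyBalance("Company A", "Fixed annuity", "C1", Decimal("1000.00")),
    PolicyBalance("Company A", "Indexed annuity", "C2", Decimal("2500.00")),
    PolicyBalance("Company B", "Fixed annuity", "C1", Decimal("400.00")),
]

if __name__ == "__main__":
    print(total_assets(SAMPLE))                             # grand total
    print(total_assets(SAMPLE, "company"))                  # by company
    print(total_assets(SAMPLE, "company", "product_line"))  # by company and product
    print(total_assets(SAMPLE, "customer_id"))              # by customer
```

Because every breakdown reads from the same standardized rows, the totals by company, product line, and customer always reconcile, which is the property ad hoc views per data set cannot guarantee.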

Another important aspect of the business is the operations side of provisioning the policies sold to customers, which involves heavily regulated paperwork: collecting documents, getting signatures, conducting the KYC/AML analysis, and more. But according to Gu, the company had no visibility into this process from the start of the sales funnel through to the end. “When we first started a new project to help address this issue, there was an expectation on SLAs for policy issuance,” says Gu. “What we discovered is that it actually took over 3 times longer on average.”

From a business operations standpoint, with no insight into the backlog, the company couldn’t staff properly to accommodate the influx of policies. “You can imagine if I’m a customer finding out that my policy took longer than expected to issue, I’m not going to be very happy,” he says. “Not having the right data to address those concerns was ultimately impacting customer satisfaction and preventing us from scaling to the demand for annuity policies.”
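As a rough illustration of the visibility that was missing, the Python sketch below compares average issuance time against an SLA and counts the open backlog. Everything in it is assumed for the example: the `Application` record, the `issuance_report` helper, the five-day SLA, and the dates are hypothetical and are not Group 1001’s figures.

```python
from datetime import date
from statistics import mean
from typing import NamedTuple, Optional


class Application(NamedTuple):
    policy_id: str
    submitted: date
    issued: Optional[date]  # None while the policy is still in the backlog


def issuance_report(apps, sla_days):
    """Average days to issue, that average relative to the SLA, and open backlog."""
    durations = [(a.issued - a.submitted).days for a in apps if a.issued is not None]
    backlog = sum(1 for a in apps if a.issued is None)
    average = mean(durations) if durations else None
    return {
        "average_days": average,
        "sla_ratio": average / sla_days if average is not None else None,
        "open_backlog": backlog,
    }


# Made-up applications for illustration only.
SAMPLE_APPS = [
    Application("P1", date(2024, 1, 1), date(2024, 1, 11)),  # issued in 10 days
    Application("P2", date(2024, 1, 2), date(2024, 1, 22)),  # issued in 20 days
    Application("P3", date(2024, 1, 3), None),               # not yet issued
]

if __name__ == "__main__":
    print(issuance_report(SAMPLE_APPS, sla_days=5))
```

A report like this, refreshed regularly, is the sort of signal an operations team could use to staff against the backlog rather than discovering delays after the fact.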

A matter of policy


Built a POC data model with Coalesce to give a holistic picture of all the organization’s data
Developed new data stack built around Snowflake, Coalesce, and Fivetran
Shifted to a fully automated solution removing manual processes and guesswork from the sales, operations, and marketing cycles

Gu’s team interfaces primarily with two large financial firms under the Group 1001 umbrella: one sells policies to consumers via financial advisors, and the other uses a direct-to-consumer model in which customers create their own annuity policies directly from the website. “We needed to create a unified view of all policies within the organization, all of the transactions, customers, and balances across the board for the products that we serve within the organization,” he says. “That’s why we kicked off the POC with Coalesce, in order to create a model to address that need.”

In his previous organization, Gu had rolled out a code-first data transformation solution, but onboarding and training the developers to use it had been a huge lift. “I simply didn’t want to do that anymore. That’s why we chose Coalesce, for the fact that I didn’t need a large team of data engineers, data modelers, or data analysts to build these data models from scratch—I just needed one person.” In a few weeks, the team had already built the entire data model, creating a holistic picture across all of the organization’s data. “Now we can proceed with operationalizing this end-to-end, and start transitioning and rebuilding our reports on top of this new unified data model.”


Today Gu’s team has fully redesigned the data stack, which is built around Snowflake as its core data repository: “We’ve completely rewritten all the fundamental processes and the platform from scratch,” he says. “We have Snowflake as our data store, we have Fivetran replicating our pipelines over to Snowflake, and we’re using Coalesce for data transformation.” A number of other tools are in the mix as well, including Soda for data quality, Atlan for the data catalog, and Dagster for orchestration.

“We chose Coalesce because of the fact that using the automation, using the templates, using the very streamlined way of development makes the iteration cycles much faster,” Gu said. “It’s very easy to troubleshoot, to apply changes and refactor the models as a result of Coalesce’s strong lineage tracking capabilities.”

A positive transformation


Migrated all reporting pipelines over to Snowflake in less than a year with a team of five
Automated all deployments, migrated more than 160 reports and 200 DAGs, and replicated 66 databases onto Snowflake
Reduced “idea to insight” iteration cycle from three months to just two days

With this new data architecture in place, Gu’s team has been able to make an enormous amount of progress in a very short time: “In one year we onboarded Snowflake, automated all the deployments, migrated all of our Power BI reports from Postgres to Snowflake, migrated over 160 reports, 200 different DAGs over from Airflow to Dagster, replicating 66 databases from Postgres onto Snowflake,” he says. “Nearly 4,000 table feeds were being loaded every single day. Almost four terabytes of data, uncompressed, has been migrated to Snowflake to date, in less than a year with just a team of five, including myself.”

Gu estimates that had he attempted to take a code-first approach or used legacy transformation solutions, he would have needed a team 5x the size and it would have taken twice as long.

“Overall, I’ve always found Coalesce to be about 10 times more productive,” he says. “It was very easy to add columns, introduce columns, refactor some of the tables, redeploy, rerun. Coalesce makes it a lot simpler and faster for even making one change cycle. Part of it is column-level lineage, and part of it is the fact you have highly templated development from a standards perspective. You can have a more nimble, lightweight team to do the same amount of work because you’re more productive, while at the same time maintaining higher levels of standards.”

All of these changes have introduced a fundamental positive shift in the company. “About a month ago we had an initiative to optimize the productivity organization,” Gu recalls, “so we sat down with the users and identified the key outcomes and business processes, got the data points, created the data models, and deployed the insight. It took about two days from idea to insight—this is unheard of in our organization. This is something that usually takes three months, so to reduce this to just two days is a whole new paradigm.”

Looking ahead, Gu aims to share the platform services his team has built with other groups across the organization. He hopes this will allow them to become more data-centric without having to rebuild the end-to-end platform and integrations from scratch: “Now other groups within the Group 1001 organization can start using what we’ve developed to build their own data models.”
