Reducing a 24-hour ETL process to 43 minutes

From cereal to potato chips, Kellogg’s puts some of the world’s most popular packaged foods on grocery shelves every day. But its supply chain dashboards, powered by Hadoop and SAP Object Data Services, made it impossible for managers to get the fresh data necessary for daily profitability analyses. Hadoop’s batch data ingestion required an ETL process of about 24 hours, and interactive queries were painfully slow.

From batch ETL to real-time insights

Kellogg’s looked to SingleStore to replace Hadoop and improve speed-to-insight and concurrency. Its priority was twofold: real-time ingestion of data from multiple supply chain applications, and fast SQL capabilities to accelerate queries made through Kellogg’s Tableau data visualization platform. Deploying SingleStore on AWS with SingleStore Pipelines, Kellogg’s was able to continuously ingest data from AWS S3 buckets and Apache Kafka for up-to-date analysis and reporting. SingleStore reduced the ETL process from 24 hours to an average of 43 minutes, roughly a 30x improvement. The team was then able to bring three years of archived data into SingleStore without increasing ETL timeframes, and subsequently incorporated external data sources such as Twitter.
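To make the ingestion setup more concrete, here is a minimal sketch of what defining such pipelines could look like. It only builds SQL text in the general shape of SingleStore's CREATE PIPELINE statement and does not connect to a database. Every name in it (pipeline names, broker address, topic, bucket path, region, target tables, credential placeholders) is a hypothetical placeholder, and the CSV field format is an assumption; none of it reflects Kellogg’s actual configuration.

```python
import json


def kafka_pipeline_sql(name, broker, topic, table):
    """Build CREATE PIPELINE text for a Kafka source (all arguments hypothetical)."""
    return (
        f"CREATE PIPELINE {name} AS\n"
        f"LOAD DATA KAFKA '{broker}/{topic}'\n"
        f"INTO TABLE {table}\n"
        f"FIELDS TERMINATED BY ',';"
    )


def s3_pipeline_sql(name, bucket_path, region, credentials, table):
    """Build CREATE PIPELINE text for an S3 source (all arguments hypothetical)."""
    config = json.dumps({"region": region})
    creds = json.dumps(credentials)
    return (
        f"CREATE PIPELINE {name} AS\n"
        f"LOAD DATA S3 '{bucket_path}'\n"
        f"CONFIG '{config}'\n"
        f"CREDENTIALS '{creds}'\n"
        f"INTO TABLE {table}\n"
        f"FIELDS TERMINATED BY ',';"
    )


def build_statements():
    """Return the DDL plus START PIPELINE statements for both example sources."""
    # Placeholder credentials: substitute real values through a secrets manager.
    creds = {"aws_access_key_id": "<key>", "aws_secret_access_key": "<secret>"}
    return [
        kafka_pipeline_sql("orders_stream", "kafka.example.com:9092",
                           "supply_chain_orders", "orders"),
        "START PIPELINE orders_stream;",
        s3_pipeline_sql("shipments_files", "example-bucket/shipments/",
                        "us-east-1", creds, "shipments"),
        "START PIPELINE shipments_files;",
    ]


if __name__ == "__main__":
    for statement in build_statements():
        print(statement)
        print()
```

Once started, a pipeline of this kind keeps loading new messages or files as they arrive, which is what removes the need for a separate batch ETL window.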

Direct integration with Tableau

Finally, Kellogg’s was able to analyze customer logistics and profitability data daily, running Tableau visualizations directly on top of SingleStore rather than on a data extract. With SingleStore directly integrated with Tableau, Kellogg’s has maintained an 80x improvement in analytics performance while supporting hundreds of concurrent business users.

Source: SingleStore 


Also in need of a data acceleration solution? Want to find out how we can help you optimize your data architecture for speed and agility?