REAL-WORLD SUCCESS

Optimizing Data Pipelines: A Case Study on Consolidating Data Sources and Formats

Introduction

The client is a long-established IoT company that stores a vast amount of unstructured historical data from vehicle sensors. Their team had little experience with modern data practices and came to StandardData to explore the potential buried in that data.

Problem/Goal

The client faced significant financial and opportunity costs from their aging data pipelines and storage infrastructure. They asked StandardData to investigate ways to reduce storage costs and query times while preparing the data for a predictive ML model.

Solution

StandardData's investigation revealed that the way the time-series data (~3 TB) was stored made most queries expensive. By leveraging big data solutions hosted with their cloud provider, StandardData was able to rebuild the pipelines to run in minutes instead of hours.
 
StandardData developed the deliverable from scratch, to specification, in two weeks.

Results

50% reduction in storage size
99% reduction in query time
70% decrease in technology costs
Identified 4 safety faults through VINs, initiating a proactive recall