Petabyte Data Services is a data consulting firm that provides scalable data engineering and analytics services for our clients.

Petabyte Data Services is led by a team of ex-FAANG engineers, data scientists, and business intelligence analysts with more than 30 years of combined experience.

Past Projects

  • Designed a real-time data warehouse on AWS using Redshift, Lambda, Kinesis, and SQS

  • Architected ETL pipeline workflows that processed 500 TB of events per day for reporting purposes

  • Led a data deletion initiative to comply with GDPR and CCPA regulations

  • Architected a real-time, ML-driven pricing system that supports price movements with an average latency under 500 ms

  • Implemented a real-time, in-app analytics framework for tracking user page views and clicks

  • Introduced Kinesis Data Streams to synchronize data in real time across multiple data stores: Redshift, DynamoDB, and S3

  • Built automated data quality checks using AWS Glue to detect ETL workflow outages

  • Built engagement reporting dashboards in Tableau that aggregate metrics such as DAU, MAU, and churn rate

  • Researched and deployed a random forest classifier that ranks leads by probability of conversion and identifies the features that contribute to higher conversion, increasing the conversion rate by 2%

Testimonials

We highly recommend working with Steve. He helped us automate processes that save us hours of time per day and scale our business. He was engaged and thoughtful about scoping the project. The execution was diligent and thorough. And he worked with us after the engagement to ensure everything was going smoothly.