Chord Commerce
Data Engineer
Led Fivetran Migration Saving 60%+ on Data Costs
Built in-house data extraction layer replacing Fivetran for a multi-tenant e-commerce platform. Developed connectors for Shopify (GraphQL), Facebook Ads, Google Ads, BigQuery, and Klaviyo serving 20+ tenants. Implemented proactive rate limiting using Shopify's throttle budget API.
I like making the pipes that keep the city humming without praise.
The Silent Error Hunt
Data was missing from Snowflake, but there were no errors in the logs. The jobs completed "successfully." After investigation, discovered that our validation hook was catching JSON parsing errors and silently swallowing them. Shopify was returning HTML error pages when rate limited, and our code was treating those parse failures as "no data."
Silent failures are worse than loud failures.
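A minimal sketch of the loud-failure alternative, not the original hook. The names parse_response_body and UnexpectedResponseError are hypothetical; the point is only that a non-JSON body raises with context instead of being treated as an empty result.

```python
import json


class UnexpectedResponseError(RuntimeError):
    """Raised when a response body is not the JSON object we expected."""


def parse_response_body(body: str, status_code: int) -> dict:
    """Parse an API response body, failing loudly on non-JSON payloads."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError as exc:
        # An HTML error page lands here. Re-raise with context rather than
        # returning an empty result that looks like "no data".
        snippet = body[:80].replace("\n", " ")
        raise UnexpectedResponseError(
            f"non-JSON response (status {status_code}): {snippet!r}"
        ) from exc
    if not isinstance(parsed, dict):
        raise UnexpectedResponseError(
            f"expected a JSON object, got {type(parsed).__name__}"
        )
    return parsed
```

With this shape, a rate-limit page fails the job run and shows up in the logs, so the gap in the warehouse has a visible cause.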
Throttling Evolution: Reactive to Dynamic
Four iterations over 34 days. Started by catching 429 errors, then added exponential backoff, then fixed waits, and finally a dynamic calculation based on Shopify's restore rate: points_needed / restore_rate * buffer, clamped to 1-30 seconds.
The final version uses information that was always available; I just didn't think to use it until I understood the problem deeply.
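The dynamic calculation can be sketched as a small pure function. This is an illustration of the formula above, not the production code: the function name, the 1.1 default buffer, and the zero-restore-rate fallback are assumptions. Shopify's GraphQL responses report a throttleStatus with currentlyAvailable and restoreRate, so points_needed is assumed here to be the query cost minus the points currently available.

```python
def throttle_wait_seconds(
    points_needed: float,
    restore_rate: float,
    buffer: float = 1.1,   # assumed safety margin, not a value from the source
    min_wait: float = 1.0,
    max_wait: float = 30.0,
) -> float:
    """Seconds to sleep before retrying: points_needed / restore_rate * buffer,
    clamped to [min_wait, max_wait]."""
    if restore_rate <= 0:
        # Assumption: with no usable restore rate, wait the maximum.
        return max_wait
    raw = points_needed / restore_rate * buffer
    return max(min_wait, min(max_wait, raw))
```

For example, needing 500 points at a restore rate of 50 points per second gives 10 seconds, or about 11 seconds with a 1.1 buffer. Very small deficits are raised to the 1 second floor and very large ones are capped at 30 seconds.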
Multi-Store Architecture
Built support for tenants with multiple Shopify storefronts. get_all_shopify_assets() dynamically generates Dagster assets per store with isolated credentials from AWS Secrets Manager.
Multi-tenant data isolation is harder than it looks: every piece of logic needs to be templatized.
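The per-store generation pattern can be sketched without any framework. This is not the body of get_all_shopify_assets(), which is not shown here. StoreConfig, build_store_extractors, the key format, and the load_secret callable are all hypothetical. In a Dagster project each generated callable would presumably be registered as an asset, and load_secret would wrap a Secrets Manager lookup.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StoreConfig:
    tenant: str
    store: str
    secret_id: str  # hypothetical secret key scoped to this one store


def build_store_extractors(
    stores: list[StoreConfig],
    load_secret: Callable[[str], str],
) -> dict[str, Callable[[], dict]]:
    """Generate one extractor per store, keyed by a tenant- and store-specific name."""
    extractors: dict[str, Callable[[], dict]] = {}
    for cfg in stores:
        # Bind cfg as a default argument so each closure keeps its own store
        # instead of all closures sharing the loop's last value.
        def extract(cfg: StoreConfig = cfg) -> dict:
            # Credentials are resolved at run time, for this store only.
            token = load_secret(cfg.secret_id)
            return {
                "tenant": cfg.tenant,
                "store": cfg.store,
                "token_loaded": bool(token),
            }

        extractors[f"{cfg.tenant}__{cfg.store}__orders"] = extract
    return extractors
```

Because each extractor only ever sees its own secret_id, running one store's extraction never touches another store's credentials.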
Eventual Consistency: When Missing Data Isn't Missing
Data quality checks flagged orders missing line items. Initial assumption: ingestion bug. After digging through Shopify docs, discovered their eventual consistency model: related objects might not be queryable immediately after the parent is created.
Made the quality checks less sensitive so they tolerate transient inconsistencies instead of alerting on them.
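One way to make such a check less sensitive is a grace window: only flag orders that are still missing line items after enough time has passed for related objects to appear. This is a sketch of that idea and an assumption about the approach, not the actual check. The function name, the record shape, and the 15-minute default are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def orders_missing_line_items(
    orders: list[dict],
    now: datetime,
    grace: timedelta = timedelta(minutes=15),  # assumed window, tune per source
) -> list:
    """Return IDs of orders with no line items that are older than the grace window."""
    cutoff = now - grace
    return [
        order["id"]
        for order in orders
        # Recent orders are skipped: their line items may simply not be
        # queryable yet, so an empty list is not evidence of a bug.
        if not order["line_items"] and order["created_at"] <= cutoff
    ]
```

Orders inside the window are picked up by a later run, so a genuine gap still raises an alert, just later.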