Kevin Kabeya

Information

Kevin Kabeya is a Data Engineer with 5+ years across data analytics and data engineering, progressing from analytics-focused roles into full pipeline ownership. His work at Chord Commerce included building extraction layers that replaced Fivetran (60%+ cost savings), implementing proactive rate limiting for Shopify GraphQL APIs, and debugging production data quality issues across 20+ tenants.

This portfolio shows process over outcome—the debugging sessions, the failed attempts, the iterations that led to working solutions.

Technologies

Python, Data Pipelines, Dagster, dlt, ETL/ELT, API Integration, Shopify API, Multi-tenant Architecture, PostgreSQL, Airflow

Experience

5+ years

Information

Kevin Kabeya is a Software Engineer who builds full-stack applications. At Campeus, he built a complete SaaS platform from scratch—71+ models, 161 migrations, multi-tenant architecture, AI-powered features, and complex data migration systems. His work spans Laravel/Livewire web apps, queue-based data processing, and real-time collaboration features.

This portfolio shows process over outcome—the debugging sessions, the failed attempts, the iterations that led to working solutions.

Technologies

Laravel, Livewire, Tailwind CSS, Laravel Queues, Multi-tenant Architecture, PostgreSQL, FastAPI, DuckDB, MotherDuck, Flutter

Experience

5+ years

Python, Data Pipelines, Dagster, dlt, ETL/ELT, API Integration, Shopify API, Multi-tenant Architecture, PostgreSQL, Airflow, Data Modeling, Snowflake, Analytics Engineering, Terraform, AWS
Laravel, Livewire, Tailwind CSS, Laravel Queues, Multi-tenant Architecture, PostgreSQL, FastAPI, DuckDB, MotherDuck, Flutter, Supabase, LLM/AI Integration, Google APIs, Docker, CI/CD

Fit Assessment

Paste your job description and get an honest assessment of whether Kevin is a good fit for your role. This tool shows gaps, not just matches.

Evaluating as: Data Engineer

Evaluating as: Full Stack Software Engineer

Skills Matrix

An honest assessment of what I know, what I'm learning, and where I have gaps. Transparency builds trust.

Strong

Production experience, can hit the ground running

Python
Primary language for data engineering. 746 lines of production Shopify connector code. Experience with dlt, Dagster, FastAPI, requests, and OpenCV.
Data Pipelines
230 commits building extraction layer at Chord. End-to-end from API ingestion to Snowflake warehouse. Replaced Fivetran, saved 60%+ on data costs.
Dagster
230 commits at Chord Commerce. Built multi-tenant orchestration with separate code locations per tenant on ECS. Integrated with dlt for extraction and dbt for transformation.
dlt
Built Shopify, Facebook Ads, Google Ads, BigQuery, and Klaviyo connectors at Chord Commerce. Set nesting level to 0 for control over transformations.
ETL/ELT
Practical experience with both patterns in production. Built extraction layer, worked alongside dbt for transformation.
API Integration
Built production connectors for Shopify GraphQL (20+ endpoints), Facebook Ads, Google Ads, BigQuery, and Klaviyo APIs. Deep experience with rate limiting, pagination, and eventual consistency issues.
Shopify API
Built 746-line production connector with 20+ GraphQL templates. Handled query complexity limits with parent-child patterns, built proactive throttle management using cost headers.
Multi-tenant Architecture
Core to Chord Commerce work: 20+ tenants, isolated ECS code locations, templated logic pulling credentials from Secrets Manager per tenant. Also built multi-tenant SaaS at Campeus.
PostgreSQL
Production database management, query optimization, multi-tenant design. Used at both Chord Commerce and Campeus.
Data Modeling
Designed Databricks data models at BMO as the source of truth for campaign effectiveness. Built attribution models for marketing channels at Campeus to power reporting and decision-making.
Analytics Engineering
Built analytics-ready models and metrics layers at Campeus, including attribution logic for marketing channels and business-facing reporting.
Tableau
Built client-facing Tableau dashboards at Advanced Symbolics for the Ask Polly platform so non-technical stakeholders could self-serve insights.
Power BI
Delivered Power BI dashboards for Wiley Edge clients; stakeholder-facing reporting and iteration based on feedback.
Dashboards & Reporting
Built custom analytics dashboards for Campeus clients and delivered stakeholder-facing BI reporting in Tableau and Power BI.
Analytics Data Architecture
Designed analytics architecture at Campeus: semantic layer + DuckDB/Parquet analytics stack with multi-tenant data isolation and trusted metrics.

Moderate

Has used, still developing expertise

Airflow
Built Airflow pipelines at Advanced Symbolics for AI market research platform, reducing validation time from 4 days to 1 hour.
Snowflake
Destination warehouse for Chord Commerce's Fivetran replacement. Can write SQL and understand the architecture. Not a Snowflake admin, but comfortable using it as a pipeline destination.
Terraform
Wrote Terraform for RDS Proxy at Chord: Secrets Manager integration, IAM roles, VPC security groups, KMS policies. Can read and modify infrastructure-as-code.
AWS
Production experience with ECS, Secrets Manager, RDS, IAM, VPC, KMS at Chord Commerce. Can navigate the console, write Terraform, debug production issues. Not a certified architect.
AWS ECS
Debugged production ECS issues at Chord Commerce. Multi-tenant architecture used separate ECS code locations per tenant. Understand container orchestration, can troubleshoot.
GraphQL
Learned on the job at Chord Commerce building Shopify ingestion. Built 20+ query templates, handled pagination and nested queries.
dbt
Used at Chord Commerce for transformation layer. Worked alongside dbt, understood how it fit in the pipeline, but primary focus was extraction not transformation.
Docker
Use for local development. Can write Dockerfiles, basic compose.
OpenCV
Self-taught through album detection project. Feature detection and matching with ORB algorithm.
CI/CD
Can set up GitHub Actions. Understand the concepts, not a DevOps specialist.
System Design
Can design and implement systems. Still developing pattern recognition.

Gaps

Limited or no experience - transparency builds trust

Spark
No production experience. Have read about it, never used it.
Kafka
No experience with event streaming. I find it hard to think of situations outside of IoT devices and high-frequency trading where true real-time is necessary.
Real-time Streaming
My experience is batch-oriented. True real-time is unfamiliar.
Azure
No production Azure experience. Most cloud work has been in AWS.
Kubernetes
Minimal exposure. Would need significant ramp-up time.
Machine Learning
Basic understanding. No production ML experience.

Strong

Production experience, can hit the ground running

Laravel
Built Campeus V2 from scratch—71+ models, 161 migrations, multi-tenant architecture. Learned Laravel conventions under production pressure while migrating a paying customer.
Livewire
Built all interactive components for Campeus using Livewire 4 and Volt. Real-time features with Laravel Reverb, complex form handling, bulk import progress tracking, calendar integration.
Tailwind CSS
Built entire Campeus UI with Tailwind v4. Dark mode support, responsive design, component patterns. Production experience.
Laravel Queues
Built complex bulk import system with chunked queue workers at Campeus. Memory-optimized CSV processing, per-row error tracking, batch job management. Debugged queue serialization issues in production.
Multi-tenant Architecture
Core to Chord Commerce work: 20+ tenants, isolated ECS code locations, templated logic pulling credentials from Secrets Manager per tenant. Also built multi-tenant SaaS at Campeus.
PostgreSQL
Production database management, query optimization, multi-tenant design. Used at both Chord Commerce and Campeus.

Moderate

Has used, still developing expertise

FastAPI
Built Campeus Data API with Pydantic AI semantic layer, Marimo notebook serving, multi-tenant data access. Production experience.
DuckDB
Used for analytics layer in Campeus—DuckDB-WASM in browser with Parquet files for fast client-side queries. Integrated with Marimo notebooks.
MotherDuck
Used for Campeus analytics. Comfortable with DuckDB ecosystem.
Flutter
Built Campeus Pathways mobile app with magic link authentication (Supabase), QR code scanning, event discovery. Broad strokes implementation, not yet complete.
Supabase
Used for Campeus Pathways mobile app authentication. Magic link flow, deep linking, PKCE security. Basic experience.
LLM/AI Integration
Built AI travel planner at Campeus with LLM-powered validation for edge cases. Pydantic AI semantic layer for data chat. Practical experience integrating AI into production workflows.
Google APIs
Used Google Maps API for timezone detection from addresses, Google Routes API for travel route optimization at Campeus.
Docker
Use for local development. Can write Dockerfiles, basic compose.
CI/CD
Can set up GitHub Actions. Understand the concepts, not a DevOps specialist.
System Design
Can design and implement systems. Still developing pattern recognition.

Gaps

Limited or no experience - transparency builds trust

Frontend Frameworks (React/Vue)
No production experience with React or Vue. Used Livewire for frontend interactivity instead.
TypeScript
Limited exposure. Would need to learn for frontend-heavy roles.
Infrastructure/DevOps
Leveraged managed services (Laravel Cloud, Sevalla, PlanetScale), so the business logic scales, but I have no experience managing a VPS from scratch, configuring a CDN, or setting up DDoS protection.
Kubernetes
Minimal exposure. Would need significant ramp-up time.
Machine Learning
Basic understanding. No production ML experience.

Engineering Journal

Real problems, real solutions, real process.

July 2025 - January 2026 Production

Chord Commerce

Data Engineer

Led Fivetran Migration Saving 60%+ on Data Costs

Built in-house data extraction layer replacing Fivetran for a multi-tenant e-commerce platform. Developed connectors for Shopify (GraphQL), Facebook Ads, Google Ads, BigQuery, and Klaviyo serving 20+ tenants. Implemented proactive rate limiting using Shopify's throttle budget API.

I like making the pipes that keep the city humming without praise.

The Silent Error Hunt

Data was missing from Snowflake, but no errors in the logs. The jobs completed "successfully." After investigation, discovered our validation hook was catching JSON parsing errors and swallowing them silently. Shopify was returning HTML error pages on rate limits, and our code was treating parse failures as "no data."

Silent failures are worse than loud failures.
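
The failure mode above can be sketched as follows. This is a minimal illustration, not the Chord code: `parse_shopify_body` and `NonJsonResponseError` are hypothetical names, and it assumes the response body arrives as a string. The point is that a parse failure raises instead of being read as "no data".

```python
import json


class NonJsonResponseError(RuntimeError):
    """Raised when the API returns a body that is not a JSON object."""


def parse_shopify_body(body: str, status_code: int) -> dict:
    """Parse a response body, failing loudly instead of returning 'no data'."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        # An HTML error page lands here; surface it rather than swallow it.
        snippet = body[:80].replace("\n", " ")
        raise NonJsonResponseError(
            f"status={status_code}, non-JSON body starts with: {snippet!r}"
        ) from exc
    if not isinstance(payload, dict):
        raise NonJsonResponseError(
            f"status={status_code}, expected a JSON object"
        )
    return payload
```

With this shape, an HTML rate-limit page fails the job run with the status code and the start of the body in the message, so the gap shows up in the logs rather than in Snowflake.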

Throttling Evolution: Reactive to Dynamic

Four iterations over 34 days. Started by catching 429 errors, then added exponential backoff, then fixed waits, and finally a dynamic calculation using Shopify's restore rate: points_needed / restore_rate * buffer, clamped to 1-30 seconds.

The final version uses information that was always available—I just didn't think to use it until I understood the problem deeply.
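
A rough sketch of the final calculation, under stated assumptions. The function name, parameter names, and the 1.2 buffer are illustrative. `points_needed` is taken here to mean the query cost minus the points currently available. Returning zero when the budget already covers the query is also an assumption; the journal entry only states the formula and the clamp.

```python
def throttle_wait_seconds(
    requested_cost: float,
    currently_available: float,
    restore_rate: float,
    buffer: float = 1.2,
    min_wait: float = 1.0,
    max_wait: float = 30.0,
) -> float:
    """Return how long to sleep before sending a query of `requested_cost` points."""
    points_needed = requested_cost - currently_available
    if points_needed <= 0:
        return 0.0  # enough budget already, no wait (assumed behavior)
    if restore_rate <= 0:
        return max_wait  # cannot estimate; fall back to the upper bound
    # points_needed / restore_rate * buffer, clamped to [min_wait, max_wait]
    wait = points_needed / restore_rate * buffer
    return max(min_wait, min(wait, max_wait))
```

For example, a 500-point query with 100 points available and a restore rate of 50 points per second needs 400 more points, which works out to 400 / 50 * 1.2 = 9.6 seconds.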

Multi-Store Architecture

Built support for tenants with multiple Shopify storefronts. get_all_shopify_assets() dynamically generates Dagster assets per store with isolated credentials from AWS Secrets Manager.

Multi-tenant data isolation is harder than it looks—every piece of logic needs to be templatized.
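
The per-store generation pattern can be sketched without Dagster. Everything below is a stand-in: the real `get_all_shopify_assets()` produces Dagster assets and reads from AWS Secrets Manager, while this version returns plain named functions and reads from a dictionary. The signature, secret naming scheme, and tenant and store names are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Stand-in for AWS Secrets Manager; secret names and values are made up.
FAKE_SECRETS: Dict[str, str] = {
    "acme/shopify/us-store": "token-us",
    "acme/shopify/eu-store": "token-eu",
}


def fetch_secret(secret_id: str) -> str:
    """Hypothetical credential lookup, scoped to one tenant and store."""
    return FAKE_SECRETS[secret_id]


def make_store_asset(tenant: str, store: str) -> Callable[[], dict]:
    """Build one extraction function bound to a single store's credentials."""
    secret_id = f"{tenant}/shopify/{store}"

    def extract() -> dict:
        token = fetch_secret(secret_id)  # resolved at run time, per store
        return {"tenant": tenant, "store": store, "token": token}

    extract.__name__ = f"{tenant}_{store.replace('-', '_')}_orders"
    return extract


def get_all_shopify_assets(
    tenant: str, stores: List[str]
) -> Dict[str, Callable[[], dict]]:
    """Generate one named asset function per storefront."""
    assets = [make_store_asset(tenant, store) for store in stores]
    return {asset.__name__: asset for asset in assets}
```

Building each function inside `make_store_asset` rather than directly in a loop keeps every closure bound to its own `secret_id`, which avoids the late-binding bug where all generated functions end up using the last store's credentials.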

Eventual Consistency: When Missing Data Isn't Missing

Data quality checks flagged orders missing line items. Initial assumption: ingestion bug. After digging through Shopify docs, discovered their eventual consistency model—related objects might not be queryable immediately after the parent.

Adjusted quality checks to be less sensitive rather than alerting on transient inconsistencies.
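
One way to make such a check less sensitive is a grace window: only flag an order with no line items once it is old enough that the related objects should have become queryable. The sketch below is an assumption about how that could look, not the check that was shipped. The field names and the 30-minute default are illustrative.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def orders_missing_line_items(
    orders: List[Dict],
    now: datetime,
    grace: timedelta = timedelta(minutes=30),
) -> List[str]:
    """Return IDs of orders with no line items that are older than `grace`."""
    flagged = []
    for order in orders:
        if order["line_items"]:
            continue  # has line items, nothing to flag
        age = now - order["created_at"]
        if age >= grace:
            # Old enough that a transient inconsistency is unlikely.
            flagged.append(order["id"])
    return flagged
```

Recent orders that are still empty are skipped on this run and picked up by a later one if they stay empty.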

Python, dlt, Dagster, Snowflake, dbt, GraphQL, Shopify API, AWS ECS, Terraform, Multi-tenant Architecture, Data Engineering, ETL/ELT, Facebook Ads API, Google Ads API, BigQuery, Klaviyo

October 2024 - July 2025 Production

Bank of Montreal (Airmiles)

Data Developer - Martech Focus

Databricks Models for Campaign Effectiveness

Designed Databricks data models as the source of truth for campaign effectiveness across regulated financial-services clients.

Clean analytical models reduce reporting time and ambiguity for stakeholders.

Marketing Ingestion Standardization

Built campaign ingestion in PySpark and Apache NiFi, standardizing marketing platform schemas and cutting quarterly report prep from 3 days to 1.

Standardization pays off when every report depends on the same definitions.

Databricks, PySpark, Apache NiFi, SQL, Data Modeling, Analytics Engineering

October 2022 - May 2024 Production

Advanced Symbolics Inc.

Data Engineer

Airflow Pipelines for AI Market Research

Built Airflow pipelines feeding ML models for an AI market research platform, reducing validation time from 4 days to 1 hour.

Automation shifts validation work from manual review to repeatable, testable steps.

Tableau Dashboards for Ask Polly

Built client-facing Tableau dashboards so non-technical stakeholders could self-serve insights from the Ask Polly AI platform.

Self-serve BI only works when definitions are consistent and the UI stays simple.

Airflow, Python, Data Pipelines, ETL/ELT, Tableau, BI Dashboards

January 2022 - April 2022 Production

Wiley Edge

Data Analyst

Implemented ETL for a Power BI dashboard, automating ingestion and transformation, and improving processing efficiency by 50-75%.

Power BI ETL Automation

Implemented ETL to automate data ingestion and transformation for a Power BI dashboard.

Small automation wins compound quickly when dashboards are used daily.

Model Improvements for Insights

Improved data processing efficiency by 50-75% and enhanced data models to produce more actionable insights.

Better modeling upstream makes stakeholder reporting faster and more reliable.

Power BI, ETL/ELT, Data Modeling, Dashboards & Reporting, SQL

February 2023 - Present Personal Project

Album Detection

Learning Image Recognition by Building

"Shazam for album covers"

The Algorithm Journey

Started with Google Vision API for text extraction. Failed on stylized covers, partial photos, poor lighting. Discovered OpenCV feature detection. Explored SIFT (patent-encumbered until 2020), SURF (still patented), settled on ORB (free and fast enough). The ratio threshold in Lowe's ratio test? Tried 0.6, 0.7, 0.8, 0.9. Settled on 0.75, which balanced false positives and false negatives for album covers.

Hours watching feature match visualizations, understanding why matches failed, iterating parameters
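
The ratio test itself is small enough to show without OpenCV. The sketch below assumes each query descriptor already has the distances to its two nearest neighbours, as a k=2 nearest-neighbour match would produce. The function name and input shape are illustrative.

```python
from typing import List, Tuple


def ratio_test(
    knn_distances: List[Tuple[float, float]], ratio: float = 0.75
) -> List[int]:
    """Return indices of matches whose best distance beats ratio * second-best."""
    return [
        i
        for i, (best, second) in enumerate(knn_distances)
        if best < ratio * second
    ]
```

A lower ratio keeps only matches that are clearly better than the runner-up, which means fewer false positives but more missed matches. A higher ratio does the opposite, which is the trade-off the parameter sweep was exploring.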

Three Architectures

v1: Python script (good for learning, useless in a record store). v2: Flutter + FastAPI on GCP (powerful but heavyweight for a personal project). v3: Streamlit (one line for camera capture, 10x faster iteration on the core algorithm).

Most recent commit: "Change client tech stack from Flutter to Streamlit"

Why It Belongs Here

This isn't my "best" work. Chord pipelines are more sophisticated. Campeus is more complete. But this project shows genuine curiosity—I built it because I wanted to understand image recognition, not because someone asked.

Learning by building something I actually wanted to use

OpenCV, Python, Streamlit, FastAPI, Computer Vision, ORB Features

Mid-2025 - Present Founding Engineer

Campeus

Co-founder & Founding Architect

Built a complete SaaS platform from scratch for higher education enrollment teams.

Complete Platform Rebuild (V2)

71+ database models, multi-tenant architecture, AI-powered travel planning, complex bulk data import with timezone normalization, real-time analytics with DuckDB/Parquet.

The gap between "I built a side project" and "I shipped production software a customer depends on" is enormous.

Bulk Import System with Data Engineering

Queue-based CSV processing serving a complex migration from legacy system. Configurable column mapping, automatic timezone detection from addresses, UTC normalization, and per-row error tracking. 6+ iterations to get it right.

Production forces you to solve problems you'd never encounter otherwise—data migration, timezone handling, queue worker memory management.
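
The UTC normalization and per-row error tracking can be sketched in a few lines. The production system is Laravel; this Python version only illustrates the idea. It assumes the timezone has already been resolved to an IANA name per row. The column names and date format are made up.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple
from zoneinfo import ZoneInfo


def normalize_rows(rows: List[Dict[str, str]]) -> Tuple[List[Dict], List[Dict]]:
    """Convert each row's local start time to UTC, collecting per-row errors."""
    imported: List[Dict] = []
    errors: List[Dict] = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        try:
            local = datetime.strptime(row["starts_at"], "%Y-%m-%d %H:%M")
            tz = ZoneInfo(row["timezone"])
            utc = local.replace(tzinfo=tz).astimezone(timezone.utc)
            imported.append(
                {"name": row["name"], "starts_at_utc": utc.isoformat()}
            )
        except (KeyError, ValueError) as exc:
            # A bad row is recorded and skipped; the batch keeps going.
            errors.append(
                {"line": line_no, "error": f"{type(exc).__name__}: {exc}"}
            )
    return imported, errors
```

Recording the CSV line number with each error lets the person running the import fix the affected rows and re-upload only those.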

AI Travel Planner

Multi-stage pipeline using Laravel Actions for recruitment travel optimization. Constraint-based location selection, Google Routes API integration, LLM-powered validation for edge cases. Uses historical data (median attendance, typical duration) for intelligent suggestions.

Pipeline architecture with composable actions makes complex AI workflows maintainable and testable.
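
A composable pipeline of this kind can be sketched as a list of small functions that each take and return a context. The production version uses Laravel Actions; the Python below is only an illustration of the shape. The stage names, context keys, and sample data are hypothetical, and the Google Routes and LLM validation stages are left out.

```python
from statistics import median
from typing import Callable, Dict, List

Context = Dict[str, object]
Stage = Callable[[Context], Context]


def add_median_attendance(ctx: Context) -> Context:
    """Summarize each location's past attendance with a median."""
    history = ctx["history"]
    return {**ctx, "medians": {loc: median(v) for loc, v in history.items()}}


def apply_min_attendance(ctx: Context) -> Context:
    """Constraint stage: keep locations at or above the minimum median."""
    keep = [
        loc for loc, m in ctx["medians"].items() if m >= ctx["min_attendance"]
    ]
    return {**ctx, "candidates": keep}


def rank_candidates(ctx: Context) -> Context:
    """Order remaining locations by median attendance, highest first."""
    ranked = sorted(
        ctx["candidates"], key=lambda loc: ctx["medians"][loc], reverse=True
    )
    return {**ctx, "suggestions": ranked}


def run_pipeline(ctx: Context, stages: List[Stage]) -> Context:
    """Run stages in order; each one receives the previous stage's output."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx
```

Because each stage is a plain function over the context, it can be tested on its own, and a new stage can be added by appending it to the list.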

The "Locations Not Working" Saga

Renamed Location model to OrgLocation for multi-tenant clarity. Spent 6+ hours debugging—Eloquent's convention-based foreign key guessing didn't match my column names. Commit messages: "fixed for real", "fixed for real this time", "fixed for real this time I promise".

Classic first-production-app learning: refactoring requires catching ALL references.

Data Analytics Layer (Separate Service)

Built FastAPI service with Pydantic AI semantic layer, Marimo notebooks for interactive exploration, DuckDB + Parquet for fast analytical queries. Multi-tenant data isolation with pre-signed URLs.

Separating concerns into purpose-built services keeps each system focused and maintainable.

Client Analytics Dashboards

Built custom analytics dashboards for enrollment teams, including attribution-style views of marketing channel performance and outcomes.

Dashboards only matter when they directly change decisions.

The Workshop Revelation

Worked with one contact to build V2, then ran a workshop showing it to the full UOttawa team. Bugs and UX issues I was blind to became immediately obvious.

You gain more insight in 30 minutes of real user testing than in weeks of building in isolation.

Single Customer, Can't Fail

UOttawa was our only paying customer. If V2 didn't work, if the migration failed, we didn't have a business. V2 went live January 21, 2026—only 2 user-generated errors in the first week and a half.

Pressure clarifies priorities. When failure isn't an option, you find a way.

Laravel 12, Livewire 4, PHP 8.4, PostgreSQL, PlanetScale, Laravel Cloud, FastAPI, Pydantic AI, DuckDB, Parquet, Dagster, Flutter, Supabase, Multi-tenant SaaS, Queue Workers, AI/LLM Integration, Analytics Dashboards, Attribution Modeling, EdTech

"The process matters more than the outcome. And you can't fake the process the way you can fake the outcome in the age of AI."

Want to learn about Kevin?

Ask the AI assistant anything about my experience, skills, or work style. It's trained to give honest answers — including about things I don't know.

Ask AI About Me

Powered by Claude - Honest, not salesy

Ask me anything about Kevin's experience, skills, or work style. I'll give you honest answers - including about things he doesn't know.
