// data & product leader
Mahdi Karabiben
Data and Product leader building products in the data space, with a decade of experience designing and building petabyte-scale data products & platforms. Passionate about open-source projects and actively contributing to the data space through articles, public speaking, online courses, the Data Espresso newsletter, and open-source code. Currently a Senior Product Manager at Neo4j, focusing on graph analytics.
// recent experience
- Dec 2025 — Present
Senior Product Manager — Graph Analytics
@ Neo4j
- Mar 2024 — Dec 2025
Head of Product & Data (Promoted)
@ Sifflet
- Nov 2021 — Apr 2024
Staff Data Engineer (Promoted)
@ Zendesk
- Apr 2020 — Nov 2021
Data Engineer
@ FactSet
// publications & courses
Long-form work
Data Modeling for Data Products
A comprehensive masterclass on data modeling for high-impact and high-value data products. The session details how to develop flexible, domain-driven data models and implement robust governance using essential tools like metric trees and semantic layers, ensuring that business value remains top of mind throughout the process.
course
End-to-End Batch Data Pipeline with Spark
A series of four projects authored for Manning's liveProjects platform, walking through every step of building an end-to-end Big Data pipeline with Apache Spark, Delta Lake, and Apache Superset.
course
// articles & writing
Selected articles
- 2026
The Data Team's Survival Guide for the Next Era of Data
Six pillars for data teams to declutter their stack, escape the service trap, and build foundations for the new primary consumer: the AI agent.
Towards Data Science
- 2025
Data Modeling for Data Products: A Practical Guide
Principles and frameworks for applying data modeling best practices when shipping data products.
Data Engineer Things
- 2025
Building a Semantic Layer for the AI Era: Beyond SQL Generation
Capturing the What, Why, and Who for agent functionality — a guide to the semantic layer in the AI era.
Data Engineer Things
- 2024
Navigating Your Career Transition in Tech: A Practical Roadmap
A practical guide to a successful career pivot in tech: from making the decision to thriving in your new role.
Data Espresso
- 2024
Data Modeling Techniques for the Post-Modern Data Stack
Generic techniques and principles to design a robust, cost-efficient, and scalable data model for your post-modern data stack.
Towards Data Science
- 2024
Navigating Your Data Platform's Growing Pains: Data Mess to Data Mesh
Strategies and guiding principles to scale your data platform while maximizing business impact efficiently.
Towards Data Science
- 2023
Writing Design Docs for Data Pipelines
The what, why, and how of design docs for data components — and why they matter.
Towards Data Science
- 2023
A Simple (Yet Effective) Approach to Unit Tests for dbt Models
An innovative unit testing approach for dbt models — relying on standards and dbt best practices.
Towards Data Science
- 2021
Building an End-to-End Open-Source Modern Data Platform
An exhaustive design (with the necessary IaC) for a modern data platform built solely on open-source projects and cloud-provider primitives.
Towards Data Science
- 2020
Creating Notebook-based Dynamic Dashboards
A design (and POC) using notebooks to generate dynamic dashboards, supporting a Google-like metadata search engine.
Towards Data Science
// speaking & podcasts
Stage & mic
From Haystack to Insights: Three Ways AI is Transforming Product Analytics
Product-Led-Growth Disrupt Summit
The Data Engineer's Guide to Data Quality Testing: Fun, Easy, and Scalable
Data Innovation Summit
The Third Wave of Data Technologies
The Modern Data Show — S01E02
A Practical Case Study for Data Engineers: Performing Data Quality at Scale
Big Data Expo