About

Hello, I'm Miguel.

I’m a data and AI engineer based in Berlin. I’ve spent my career building data, ML, and AI systems: from an MSc thesis on word2vec for German document retrieval at TU Munich, to NLP pipelines over hotel reviews, to demand forecasting for fast fashion, to product analytics at a publicly traded e‑commerce platform, and now to LLM-driven sales intelligence at a B2B SaaS startup.

I’m currently Head of AI at Plato, a Berlin-based B2B SaaS company building sales intelligence for distributors and wholesalers. Our customers are wholesale companies with thousands of sales reps making customer visits every week; our job is to turn their messy enterprise data into the three or four things a sales rep actually needs before walking into a meeting. I joined when the AI/Data function was zero people, and I’ve grown it into a small team that owns Sales Insights, Document Processing, and Search. Day to day I’m part architect, part hiring manager, part hands-on builder, part company-internal storyteller. Most weeks I’m doing all four.

Before Plato I was a Senior Data Scientist at Shopify working on product analytics and predictive models on the merchant platform; before that Senior/Staff Data Scientist at New Yorker, where I led a small multidisciplinary team on demand forecasting and inventory optimization for a European fast-fashion retailer; and before that Senior Engineer / Tech Lead at TrustYou in Munich, designing NLP and information-retrieval pipelines that extracted structured signals from millions of hotel reviews. My first production ML role was at Gini, doing document classification with word vectors and LibSVM (back when that was the leading edge).

Education: BEng Systems & Informatics from Universidad Nacional de Colombia (Medellín), MSc Informatics from TU Munich, plus the CDTM Honours Master in Technology Management (joint TUM/LMU).

What I write about

This site is mostly a notebook. I write when I’ve learned something concrete and I want to keep it findable later. The topics tend to cluster:

  • Data engineering. Pipelines, validation, the gap between “the notebook works” and “the pipeline runs at 3 a.m. on a different tenant’s data.”
  • Machine learning in production. What changes when models leave the lab; what doesn’t; what to instrument so you’ll know which it is next time.
  • Python tooling and software practice. Type hints, testing, the small things that make code outlast the person who wrote it.
  • Agentic systems and Claude Code. What works, what’s hype, what’s actually shippable inside a real production codebase.

How I work

Things I’ve come to believe after a decade and a half of doing this:

  • Deployment beats brilliance. The best model is the one that’s actually serving users next week, not the one that beat SOTA on a benchmark you’ll never re-run.
  • Boring infrastructure is competitive advantage. Every minute spent on a reliable build, a sane data model, or a clean deploy is a minute that compounds.
  • Write things down. For your future self, for your team, and now for the AI agents that work alongside you. The shape of your CLAUDE.md, your skills, your runbooks: that’s becoming as important as the shape of your code.
  • Generalists who can lead beat specialists who can’t. Eight years deep on one stack is great until the platform shifts under you.

Now

(Updated May 2026)

  • Building AI tooling for sales reps at Plato. Recently: a multi-tenant onboarding pipeline that took us from days of manual config per customer down to about 30 minutes. I gave a talk about it at the Databricks Berlin User Group in April 2026.
  • Leading the AI/Data team at Plato.
  • Writing weekly on agentic systems, knowledge scaffolding, and production AI for B2B.
  • Reading: anything on context engineering, recsys with LLMs, and the practical edges of where agents stop being demos and start being infrastructure.
  • Out of the office: Brazilian Jiu-Jitsu, cycling around Berlin on a cargo bike with my five-year-old, surviving the early months of life with a newborn.

Community

I’ve spent a fair amount of time in the European Python and data community. I founded Munich DataGeeks (2013 to 2015), co-organized the PyData Berlin Conference (2017) and the PyBerlin Meetup (2019 to 2020), and have spoken at PyData Berlin, PyCon DE, EuroPython, DSPT Day Porto, and the Databricks Berlin User Group. The Talks page has slides for most of them. I’ve made small contributions to Luigi, Gensim, CatBoost, Hooqu, and scikit-learn over the years.

Talk to me

I’m always happy to swap notes on data engineering, production ML, agentic systems, or what it’s like to build an AI team from zero inside a B2B SaaS. The easiest places to reach me:

DM me on any of those and I’ll happily share my email if we want to take the conversation off-platform.