Mexico Jobs Analytics Pipeline

data-engineering, automation, dashboard

Problem

Raw job listing snapshots are useful only if they become clean, queryable, repeatable analytics assets.

Solution

Built an analytics pipeline that transforms raw job snapshots into curated datasets, bilingual reports, and a public documentation site with automated publishing.

Deliverables

  • Curated DuckDB/Parquet datasets
  • Weekly/monthly bilingual reports
  • Public MkDocs site
  • GitHub Actions workflow
  • Local/cloud execution paths
  • Reproducible pipeline CLI

Why it matters

  • Delivers a public analytics site that updates automatically, not a one-time report
  • DuckDB + Parquet keeps the dataset queryable and shareable without a database server
  • Bilingual outputs serve both English and Spanish stakeholders from the same pipeline
  • The GitHub Actions workflow publishes updates with no manual intervention after initial setup
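
As a rough illustration of how one pipeline can serve both languages, the sketch below renders a single summary record into English and Spanish text. The template wording, the `render_reports` helper, and the sample values are all hypothetical; the project's actual report format is not shown here.

```python
from string import Template

# Hypothetical templates: one per output language, sharing the same fields.
TEMPLATES = {
    "en": Template("Week $week: $count job listings, top city $city."),
    "es": Template("Semana $week: $count ofertas de empleo, ciudad principal $city."),
}


def render_reports(summary: dict) -> dict:
    """Render one summary record into every configured language."""
    return {lang: tpl.substitute(summary) for lang, tpl in TEMPLATES.items()}


# Made-up sample values, for illustration only.
reports = render_reports({"week": "2024-W01", "count": 120, "city": "Monterrey"})
for lang, text in reports.items():
    print(f"[{lang}] {text}")
```

Because both outputs come from the same record, the figures cannot drift apart between languages; adding a third language would only mean adding a template.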

Tech Stack

Python, DuckDB, Parquet, MkDocs, GitHub Actions, Cloud Run-ready workflow

Services

Data Engineering, ETL Pipelines, Business Analytics, Dashboard/Reporting Automation, Cloud Application Development