Projects

A full view of public work across automation, data engineering, AI tooling, and scientific software.

9 projects

Python scraping pipeline that collects job listings, normalizes records, persists run history in SQLite, and exports datasets through CLI workflows.

Python, SQLite, CLI tooling, TOML config, GitHub Actions, +2 more
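As a rough illustration of the "normalize records, persist run history in SQLite" step, here is a minimal sketch using only the Python standard library. The table names (`runs`, `listings`), the fields (`title`, `company`), and the helpers `normalize` and `save_run` are hypothetical and chosen for illustration; the project's actual schema is not shown on this page.

```python
import sqlite3
from datetime import datetime, timezone


def normalize(record: dict) -> dict:
    """Lowercase and trim keys, and trim string values (hypothetical rules)."""
    return {
        key.strip().lower(): value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def save_run(conn: sqlite3.Connection, listings: list[dict]) -> int:
    """Store one scrape run and its listings; return the new run id."""
    # Hypothetical schema: one row per run, many listing rows per run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, started_at TEXT NOT NULL, "
        "listing_count INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings ("
        "run_id INTEGER NOT NULL REFERENCES runs(id), title TEXT, company TEXT)"
    )
    rows = [normalize(r) for r in listings]
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO runs (started_at, listing_count) VALUES (?, ?)",
            (datetime.now(timezone.utc).isoformat(), len(rows)),
        )
        run_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO listings (run_id, title, company) VALUES (?, ?, ?)",
            [(run_id, r.get("title"), r.get("company")) for r in rows],
        )
    return run_id
```

Keeping a separate `runs` table is one way to make each export traceable to the scrape that produced it.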

Analytics pipeline that transforms raw job snapshots into curated DuckDB/Parquet datasets, bilingual reports, and a public MkDocs documentation site.

Python, DuckDB, Parquet, MkDocs, GitHub Actions, +1 more

Repository analysis tool that scans codebases, detects language and framework signals, summarizes dependencies, and outputs structured Markdown or JSON audit reports.

Python, CLI tooling, GitHub Actions, JSON/Markdown reporting, Optional OpenAI/Anthropic enrichment
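One simple way to detect language signals and emit a structured JSON report is to count files by extension. The sketch below assumes that approach; the mapping `EXTENSION_LANGUAGES` and the functions `detect_languages` and `to_json_report` are hypothetical names, and the real tool may rely on richer signals such as manifest files.

```python
import json
from collections import Counter
from pathlib import Path

# Hypothetical, deliberately small extension-to-language mapping.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".rs": "Rust",
    ".ts": "TypeScript",
    ".go": "Go",
}


def detect_languages(root: str) -> dict[str, int]:
    """Count files per language under root, based on file extension only."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            language = EXTENSION_LANGUAGES.get(path.suffix.lower())
            if language is not None:
                counts[language] += 1
    return dict(counts)


def to_json_report(root: str) -> str:
    """Serialize the language counts as a small JSON audit report."""
    return json.dumps({"languages": detect_languages(root)}, indent=2, sort_keys=True)
```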

Token-aware RAG ingestion pipeline where Rust handles performance-critical chunking via PyO3, Python orchestrates embeddings, and Qdrant stores searchable vectors.

Rust, PyO3, Python, Qdrant, tiktoken-rs, +3 more

Local media automation pipeline that downloads VODs, transcribes Spanish audio, ranks candidate moments by chat activity, uses an LLM to select highlights, and cuts 9:16 MP4 clips.

Python, FFmpeg, faster-whisper, chat-downloader, twitch-dl, +3 more
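Ranking candidate moments by chat activity can be sketched as bucketing chat message timestamps into fixed windows and sorting the windows by message count. This is an assumed approach for illustration only: the function `rank_moments`, the 30-second window, and the plain list of timestamps in seconds are hypothetical, not the project's actual interface.

```python
from collections import Counter


def rank_moments(
    timestamps: list[float], window_seconds: int = 30, top_n: int = 3
) -> list[tuple[int, int]]:
    """Return the top_n busiest windows as (window_start_seconds, message_count)."""
    # Assign each chat message to the fixed-size window it falls into.
    counts = Counter(int(t // window_seconds) for t in timestamps)
    # Busiest windows first; ties are broken by the earlier window.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(bucket * window_seconds, count) for bucket, count in ranked[:top_n]]


# Eight messages: three in 0-30 s, one in 30-60 s, four in 60-90 s.
print(rank_moments([1, 5, 29, 31, 65, 66, 67, 68], window_seconds=30, top_n=2))
# → [(60, 4), (0, 3)]
```

The resulting window start times could then be handed to a later stage (for example, highlight selection and clip cutting) as candidate offsets.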