LinkedIn Job Scraping & Data Export Pipeline
web-scraping, automation, data-engineering
Problem
Teams often need recurring job-market or competitive-market datasets, but manual collection is slow, inconsistent, and hard to reuse.
Solution
Built a Python scraping pipeline that collects job listings, normalizes records, persists run history, and exports reusable datasets through CLI workflows.
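The collect, normalize, persist, export flow described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the record fields (`title`, `company`, `location`, `url`), the `runs` and `jobs` tables, and every function name are assumptions, and the scraping step is left out so that the functions take already-collected dictionaries.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone
from urllib.parse import urlsplit

FIELDS = ["title", "company", "location", "url"]  # hypothetical record shape


def normalize(raw: dict) -> dict:
    """Collapse whitespace and drop URL query/fragment so duplicates match."""
    url = urlsplit(raw.get("url", "").strip())
    return {
        "title": " ".join(raw.get("title", "").split()),
        "company": " ".join(raw.get("company", "").split()),
        "location": " ".join(raw.get("location", "").split()),
        "url": url._replace(query="", fragment="").geturl(),
    }


def save_run(conn: sqlite3.Connection, records: list[dict]) -> int:
    """Store one scrape run and its records; return the new run id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, started_at TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "run_id INTEGER NOT NULL REFERENCES runs(id), "
        "title TEXT, company TEXT, location TEXT, url TEXT, "
        "UNIQUE (run_id, url))"
    )
    cur = conn.execute(
        "INSERT INTO runs (started_at) VALUES (?)",
        (datetime.now(timezone.utc).isoformat(),),
    )
    run_id = cur.lastrowid
    # OR IGNORE skips records whose URL was already stored for this run.
    conn.executemany(
        "INSERT OR IGNORE INTO jobs (run_id, title, company, location, url) "
        "VALUES (?, ?, ?, ?, ?)",
        [(run_id, *(r[f] for f in FIELDS)) for r in records],
    )
    conn.commit()
    return run_id


def export_csv(conn: sqlite3.Connection, run_id: int) -> str:
    """Return one run's records as CSV text, ordered by URL."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(
        conn.execute(
            "SELECT title, company, location, url FROM jobs "
            "WHERE run_id = ? ORDER BY url",
            (run_id,),
        )
    )
    return out.getvalue()
```

Keeping each run as its own row in a `runs` table is one way to make exports reproducible: any past dataset can be regenerated from its run id without scraping again.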
Deliverables
- Scraping library
- CLI commands
- SQLite-backed persistence
- Export workflows
- Managed artifacts
- Documentation
- CI/docs/release automation
- Optional OpenAI enrichment
Why it matters
- Turns manual job-market research into an automated, repeatable data product
- CLI-driven, so anyone on the team can run it, not just the person who built it
- SQLite history lets you track market changes over time without re-scraping from scratch
- Optional OpenAI enrichment adds structured tagging to raw scraped records
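To make the point about run history concrete, the sketch below compares two stored runs to find listings that appeared or disappeared between them. The schema (a `runs` table and a `jobs` table keyed by `run_id` and `url`) and the function name `diff_runs` are assumptions made for this example, not the project's actual layout.

```python
import sqlite3

# Hypothetical schema: one row per run, one row per listing seen in that run.
SCHEMA = """
CREATE TABLE runs (id INTEGER PRIMARY KEY, started_at TEXT NOT NULL);
CREATE TABLE jobs (
    run_id INTEGER NOT NULL REFERENCES runs(id),
    title  TEXT,
    url    TEXT NOT NULL
);
"""


def diff_runs(conn: sqlite3.Connection, base_run: int, other_run: int) -> list[str]:
    """Return URLs present in other_run but absent from base_run, sorted."""
    rows = conn.execute(
        """
        SELECT url FROM jobs WHERE run_id = ?
        EXCEPT
        SELECT url FROM jobs WHERE run_id = ?
        ORDER BY url
        """,
        (other_run, base_run),
    )
    return [row[0] for row in rows]
```

Calling `diff_runs(conn, old, new)` gives the newly posted listings, and swapping the arguments gives the ones that were taken down, all from data already on disk.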
Tech Stack
Python, SQLite, CLI tooling, TOML config, GitHub Actions, PyPI, OpenAI API
Services
Web Scraping, Data Extraction, Python Automation, ETL Pipelines, Reporting Automation