Botasaurus Scraping Service with Web UI
Overview
A scraping service built on the open-source Botasaurus framework, paired with a backend, a frontend and a task runner, so that scrape jobs can be defined and executed, and their results collected, through a UI rather than ad-hoc scripts.
Why It Exists
Botasaurus packages anti-detection browser-based and request-based scraping behind a clean task model. This project used it as a starting point and extended it into a small service, demonstrating how to operationalise a scraping framework with a UI and persistent results instead of one-off scripts.
What We Built
Starting from the official Botasaurus starter template, we built out a backend/ (with scrapers.py and JS task inputs), a src/ scraping task (scrape_heading_task.py), a frontend/, a main.py/run.py runner, and a SQLite database for persistence, all wrapped in a Dockerfile and docker-compose.yaml. Results land in a task_results/ directory. The structure mirrors a task-definition → execution → results pipeline.
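The task-definition → execution → results flow can be sketched in a few lines. This is a minimal, standard-library-only illustration, not the project's code and not the Botasaurus API: the names HeadingParser, scrape_heading and run_task are hypothetical, the HTML is passed in as a string rather than fetched, and the one-JSON-file-per-task layout under task_results/ is an assumption.

```python
import json
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class HeadingParser(HTMLParser):
    """Collects the text of the first <h1> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <h1> is of interest; later ones are ignored.
        if tag == "h1" and not self._done:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._done = True

    def handle_data(self, data):
        if self._in_h1:
            self._parts.append(data)

    @property
    def heading(self):
        text = "".join(self._parts).strip()
        return text or None


def scrape_heading(html):
    """Task definition: turn a page's HTML into a result dict."""
    parser = HeadingParser()
    parser.feed(html)
    parser.close()
    return {"heading": parser.heading}


def run_task(task_id, html, results_dir):
    """Execution + results: run the task and write its output as JSON.

    Assumption: one file per task, named <task_id>.json.
    """
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    out_path = results_dir / f"{task_id}.json"
    out_path.write_text(json.dumps(scrape_heading(html)), encoding="utf-8")
    return out_path


SAMPLE_HTML = "<html><body><h1>Example Domain</h1><p>Text</p></body></html>"

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = run_task(1, SAMPLE_HTML, Path(tmp) / "task_results")
        print(path.name, path.read_text(encoding="utf-8"))
```

In the real service the fetching and anti-detection work would be handled by Botasaurus; the sketch only shows how a task function, a runner and a results directory relate to one another.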
Technologies & Approach
Botasaurus + botasaurus-server (Python) for the scraping engine and task server; a Django-style backend and a frontend for control and visualisation; Docker Compose to run the whole stack reproducibly; SQLite for lightweight result storage. Choosing the framework provides stealth and a structured task model out of the box.
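To make the "SQLite for lightweight result storage" idea concrete, the following is a hedged sketch using Python's built-in sqlite3 module. The table name, columns and status values are assumptions chosen for illustration; they are not the schema that botasaurus-server or this project actually uses, and the helper functions are hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the result store. The schema is an assumed example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS task_results (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            scraper TEXT NOT NULL,
            status TEXT NOT NULL,
            result_json TEXT
        )
        """
    )
    return conn


def create_task(conn, scraper):
    """Record a new task in the 'pending' state and return its id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO task_results (scraper, status) VALUES (?, 'pending')",
            (scraper,),
        )
    return cur.lastrowid


def complete_task(conn, task_id, result):
    """Store the task's result as JSON and mark it completed."""
    with conn:
        conn.execute(
            "UPDATE task_results SET status = 'completed', result_json = ? "
            "WHERE id = ?",
            (json.dumps(result), task_id),
        )


def get_task(conn, task_id):
    """Return the task as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT scraper, status, result_json FROM task_results WHERE id = ?",
        (task_id,),
    ).fetchone()
    if row is None:
        return None
    scraper, status, result_json = row
    return {
        "scraper": scraper,
        "status": status,
        "result": json.loads(result_json) if result_json is not None else None,
    }


if __name__ == "__main__":
    conn = open_store()
    task_id = create_task(conn, "scrape_heading_task")
    complete_task(conn, task_id, {"heading": "Example Domain"})
    print(get_task(conn, task_id))
    conn.close()
```

A single-file database like this suits a small, single-host service; passing a file path instead of ":memory:" lets results persist across runs, for example on a Docker volume.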
Outcome / Impact
A working, containerised scraping service demonstrating how an OSS scraping framework can be turned into an operable, UI-driven tool with persistent task results.
Capabilities Demonstrated
- Building scraping services on a task-oriented framework (Botasaurus)
- Wrapping scrapers in a backend + frontend + runner with persistence
- Dockerised, reproducible scraper deployment