Browser-Automation & Crawling Infrastructure Services (Python + Node)
A leading electronic-music marketplace
Overview
The supporting infrastructure layer for the marketplace automation effort: a mix of Python and Node services providing crawling, browser automation, CAPTCHA handling and API/transaction utilities that back the higher-level AI agent.
Why It Exists
LLM agents are only as capable as the browser/crawl plumbing beneath them. This repository supplies the reusable infrastructure (page fetching, anti-bot handling, crawling and API calls) so the agent can focus on decision-making.
What We Built
A multi-language service collection:
- A standalone Playwright automation service in Python, playwright-service-python, exposing an api.py and a marketplace-specific driver module, with a cache_tower cache, crawl storage, image assets for CAPTCHA targets, and build scripts.
- Python crawling via crawlee.py and a recaptcha.js reCAPTCHA-handling routine for anti-bot scenarios.
- Node utilities (index.js, get-transactions.js, stats.js) using Axios and jsonwebtoken for authenticated API access and transaction/stats retrieval.
- A venv and requirements/package.json reflecting the dual Python+Node runtime.
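To illustrate how a cache can sit in front of page fetching in a service like this, here is a minimal sketch. It is not the repository's code and does not use the cache_tower API: the CachedFetcher class, its ttl_seconds parameter and the injected fetch callable are all hypothetical names chosen for illustration. In a real service the fetch callable would presumably wrap a Playwright page load.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class CachedFetcher:
    """Wraps a page-fetching callable with a small in-memory TTL cache.

    Hypothetical helper for illustration only; not part of the repository.
    """

    fetch: Callable[[str], str]                 # assumed: returns page HTML for a URL
    ttl_seconds: float = 300.0                  # assumed freshness window
    clock: Callable[[], float] = time.monotonic  # injectable so tests can control time
    _entries: Dict[str, Tuple[float, str]] = field(default_factory=dict, init=False)

    def get(self, url: str) -> str:
        now = self.clock()
        cached = self._entries.get(url)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]                    # still fresh: skip the browser round trip
        html = self.fetch(url)                  # miss or stale entry: fetch again
        self._entries[url] = (now, html)        # remember when it was fetched
        return html
```

Injecting both the fetch function and the clock keeps the cache logic testable without launching a browser, which is one way the automation primitives can stay reusable across callers.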
Technologies & Approach
On the Python side, Crawlee manages the crawl (request queueing, retries and storage) and Playwright drives headless browsers; on the Node side, small services handle JWT-authenticated API calls and reporting. Splitting the infrastructure from the agent keeps automation primitives reusable and independently deployable.
Outcome / Impact
Provided the crawling and browser-automation backbone that powers the marketplace AI agent, including CAPTCHA/anti-bot handling and authenticated data retrieval.
Capabilities Demonstrated
- Building browser-automation and crawling infrastructure with Playwright and Crawlee
- Handling anti-bot/CAPTCHA challenges programmatically
- Polyglot service design (Python + Node) for automation platforms
- JWT-authenticated API integration and data extraction