leonidlupko

Leonid L

@leonidlupko

Data Engineer, Web Scraping, AWS, GCP ETL

Ukraine
English, Ukrainian
About me
Data Engineer specializing in web scraping, data pipelines, and cloud platforms. I build scalable systems for extracting, processing, and analyzing large datasets with a focus on performance and cost-efficiency.

Expertise:
- Web scraping (Cloudflare, Akamai, DataDome bypass)
- AWS & GCP serverless pipelines
- BigQuery, Athena data warehouses
- API integrations & automation

You get clean data, scalable solutions, and fast, reliable delivery. Let’s build your data solution.


Average response time: 1 hour

View my services

Data engineering
I will automate API ingestion into BigQuery with Python


Work experience

Self-Employed

High-Load Web Scraping Platform (AWS)

Self-Employed • Freelance

Jan 2025 - Present (1 yr 4 mos)

Designed and implemented a scalable web scraping platform on serverless AWS infrastructure. It currently runs in production for price monitoring across 120,000 SKUs on each of 6 websites (0.72M SKUs in total), with reliable change tracking and stable daily execution. The system uses curl_cffi for high-performance requests and integrates with Bright Data to bypass anti-bot protections (Cloudflare, Akamai, DataDome).

Architecture:
- Distributed workers (AWS Lambda / ECS) with SQS queues
- S3-based data lake (raw → normalized → curated)
- Parquet + partitioning
- SQL analytics via Amazon Athena

Scalability:
- 🚀 Designed to scale up to 5M+ pages/day
- Horizontal scaling via queue-based architecture
- Ready for TB-scale datasets

Results:
- ⚡ 200–800 ms average request latency
- 💰 60–85% cost reduction vs browser-based scraping
- 📦 Efficient data pipeline with optimized storage
- 🔍 Athena queries in 2–10 seconds
- 📉 $0.01–$0.20 per query
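
As a rough illustration of the data lake layout and change tracking described above, here is a minimal Python sketch using only the standard library. It is not the production code. The names `PriceRecord`, `partition_key`, and `detect_changes`, and the `site=` / `dt=` partition scheme, are assumptions made for this example. The idea is that Hive-style key prefixes let a query engine such as Athena skip partitions it does not need, and that comparing each run against the previous snapshot yields only new or changed prices.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

# Layers named in the profile: raw -> normalized -> curated.
LAYERS = ("raw", "normalized", "curated")


@dataclass(frozen=True)
class PriceRecord:
    """One scraped price observation (hypothetical schema)."""
    site: str
    sku: str
    price: float
    currency: str


def partition_key(layer: str, site: str, day: date, filename: str) -> str:
    """Build a Hive-style object key partitioned by site and date.

    The site=/dt= layout is an assumed convention for illustration.
    """
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return f"{layer}/site={site}/dt={day.isoformat()}/{filename}"


def detect_changes(
    previous: Dict[str, PriceRecord], current: List[PriceRecord]
) -> List[dict]:
    """Compare the current run with the previous snapshot.

    Returns one entry per SKU that is new or whose price changed;
    unchanged SKUs are skipped.
    """
    changes = []
    for rec in current:
        key = f"{rec.site}:{rec.sku}"
        old = previous.get(key)
        if old is None:
            changes.append({"key": key, "kind": "new", "old": None, "new": rec.price})
        elif old.price != rec.price:
            changes.append(
                {"key": key, "kind": "changed", "old": old.price, "new": rec.price}
            )
    return changes
```

In a setup like this, each worker would write its Parquet output under the key returned by `partition_key`, and only the entries returned by `detect_changes` would need to be appended to a change log, which keeps daily writes small when most prices are stable.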