I will build an AWS data lake and ETL pipeline using PySpark

Pakistan

I speak English

Cloud Data Engineer building scalable ETL pipelines

Hi, I'm an independent Data Engineer specializing in building scalable ETL pipelines and robust cloud data architectures. I help businesses transform messy, unstructured logs into clean, query-ready d...
About this service

As a Data Engineer, I design robust cloud-native architectures and scalable ETL pipelines. Whether processing high-volume logs or building Medallion Data Lakes, I deliver clean, optimized solutions.

What I Offer:

  • End-to-End ETL Pipelines: Automated data extraction, transformation, and loading using Python and PySpark.
  • Cloud Data Lakes: Architecting serverless Medallion Data Lakes (Bronze, Silver, Gold) on AWS (S3, Glue, Athena).
  • Database Architecture: Designing relational databases (3NF) and optimizing complex SQL queries (CTEs, Window Functions) in PostgreSQL.
  • Performance Optimization: Reducing data processing times and cutting storage costs using formats like Apache Parquet.
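
To make the Bronze, Silver, Gold layering above concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code delivered with this service: the sample log lines, field names, and function names are all hypothetical, and in-memory lists stand in for S3 prefixes. A real pipeline would typically run the same three steps in PySpark and write Parquet.

```python
import json
from collections import defaultdict

# Hypothetical raw log lines, standing in for files landed in a Bronze S3 prefix.
RAW_LOGS = [
    '{"ts": "2024-01-01T10:00:00", "user": "a", "status": "200", "ms": "120"}',
    '{"ts": "2024-01-01T10:00:05", "user": "b", "status": "500", "ms": "340"}',
    'not valid json',
    '{"ts": "2024-01-01T10:01:00", "user": "a", "status": "200", "ms": "80"}',
]


def to_bronze(lines):
    """Bronze: keep every raw line untouched, tagged with its position."""
    return [{"line_no": i, "raw": line} for i, line in enumerate(lines)]


def to_silver(bronze):
    """Silver: parse and type-cast records, dropping any that fail validation."""
    silver = []
    for row in bronze:
        try:
            rec = json.loads(row["raw"])
            silver.append({
                "ts": rec["ts"],
                "user": rec["user"],
                "status": int(rec["status"]),
                "ms": int(rec["ms"]),
            })
        except (ValueError, KeyError, TypeError):
            # Malformed JSON, missing fields, or non-numeric values are skipped.
            continue
    return silver


def to_gold(silver):
    """Gold: aggregate per user into a small, query-ready summary."""
    totals = defaultdict(lambda: {"requests": 0, "errors": 0, "total_ms": 0})
    for rec in silver:
        t = totals[rec["user"]]
        t["requests"] += 1
        t["errors"] += int(rec["status"] >= 500)
        t["total_ms"] += rec["ms"]
    return {
        user: {
            "requests": t["requests"],
            "errors": t["errors"],
            "avg_ms": t["total_ms"] / t["requests"],
        }
        for user, t in totals.items()
    }


bronze = to_bronze(RAW_LOGS)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping Bronze untouched means a bug in the Silver or Gold logic can be fixed and the later layers rebuilt without re-ingesting the source data.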

Tech Stack: AWS (S3, Glue, Athena) | PySpark | Python | PostgreSQL | Advanced SQL | Git/GitHub

Why choose me? I write production-ready code, ensure scalable designs, and strictly follow data engineering best practices.

Please message me before ordering to discuss your exact project!

Language:

English

Urdu

Technical expertise:

dbt (Data Build Tool)

Apache Airflow

Expertise:

Data pipelines

ETL development

Data integration

Industry:

Data analytics

My portfolio