
Sripada Arun
Azure Data Engineer
Skills

Portfolio
Work experience
Azure Data Engineer
Nike • Full-time
Dec 2024 - Present • 1 yr 5 mos
• Accelerated retail data delivery by architecting end-to-end ETL/ELT pipelines on Azure Data Factory and Microsoft Fabric, integrating 5+ source systems (Azure SQL, ADLS Gen2, Blob Storage, Snowflake, AWS S3) and reducing onboarding time for new data sources by ~40%.
• Improved hybrid ETL reliability by building and optimizing SSIS packages and integrating them with Azure Data Factory, eliminating manual handoffs between on-premises and cloud workflows and significantly cutting pipeline failure rates.
• Enabled unified analytics across cloud platforms by implementing a Microsoft Fabric Lakehouse architecture and Snowflake as central data stores, consolidating previously siloed retail data assets into a single query-ready layer for business teams.
• Reduced data latency for business reporting by implementing full and incremental load strategies across Azure Synapse Analytics, ADF, and Microsoft Fabric, supporting multi-layered Bronze/Silver/Gold architectures and cutting refresh cycles by up to 60%.
• Increased pipeline throughput 3× by applying PySpark and Spark SQL optimizations in Azure Databricks, including partition pruning and broadcast joins, while managing Delta Lake tables in ADLS Gen2 for scalable, ACID-compliant storage, so that large-scale retail transaction data is processed within SLA windows.
• Delivered self-service reporting for retail stakeholders by building Power BI dashboards using DAX, Power Query, and star schema modeling, reducing ad hoc data requests to the engineering team by an estimated 30%.
• Strengthened data governance and security posture by enforcing RBAC, Azure Key Vault, and Azure Active Directory (AAD) across ADF and Synapse workflows, improving secure access management and reducing credential exposure risk in production environments.
Azure Data Engineer
RBC • Full-time
Jun 2023 - Dec 2024 • 1 yr 6 mos
• Streamlined banking data ingestion for 10+ source systems by engineering ETL pipelines on Azure Synapse Analytics and ADF, applying PySpark, Spark SQL, and Python transformation logic and reducing data processing time for large transactional datasets by ~35%.
• Eliminated legacy ETL bottlenecks by migrating SSIS packages from on-premises SQL Server to Azure Data Factory, cutting manual pipeline maintenance overhead by an estimated 50% and enabling fully automated cloud-based orchestration.
• Built a scalable Lakehouse architecture for banking analytics by designing Synapse SQL pools (dedicated and serverless), external tables, and views, enabling high-performance ad hoc queries on ADLS Gen2 datasets without data duplication.
• Improved pipeline performance for high-volume banking datasets by applying advanced Spark optimization techniques (partitioning strategies and broadcast joins) in both Synapse and Databricks, resulting in significantly higher throughput on datasets exceeding 100M rows.
• Reduced data quality incidents by 40% by authoring Python scripts for automated data acquisition, transformation validation, and profiling, leveraging SQL and Kusto queries across Synapse Notebooks to catch anomalies before downstream consumption.
• Accelerated risk reporting turnaround by building Power BI reports and interactive Tableau dashboards for banking analysts, consolidating transactional and risk data into a single source of truth and reducing manual report generation time by an estimated 4 hours per week.
• Ensured data accuracy across financial pipelines by creating and managing data mapping routines in Azure SQL, with proactive performance monitoring and tuning that maintained 99%+ pipeline uptime across integrated banking systems.