
The project delivered a fully operational, scalable AWS-based data platform with both a data lake (S3) and a data warehouse (Redshift). It handles large-scale data migrations, runs robust ETL pipelines, and is built for future AI/ML readiness.
End-to-end ETL pipeline on AWS: Redshift, PySpark, Glue, EMR, Hudi, Airflow
Stack: AWS S3, AWS Redshift, AWS Glue, EMR, Apache Hudi, MWAA (managed Apache Airflow), Athena, Lake Formation