Scalable Data Lakes and Data Warehouses on AWS

The project delivered a fully operational, scalable data platform on AWS that combines a data lake (S3) with a data warehouse (Redshift). It handles large-scale data migrations, runs robust ETL pipelines, and is built to support future AI/ML workloads.

End-to-end ETL pipeline on AWS: Redshift, PySpark, Glue, EMR, Hudi, Airflow

Stack: Amazon S3, Amazon Redshift, AWS Glue, Amazon EMR, Apache Hudi, Apache Airflow (Amazon MWAA), Amazon Athena, AWS Lake Formation
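
To give a feel for how the Hudi part of such a stack is usually configured, here is a minimal sketch of the write options a PySpark job on Glue or EMR might pass when upserting into a Hudi table on S3. The helper name `build_hudi_upsert_options`, the table name `orders`, and the column names are hypothetical and were chosen only for illustration; the project's actual tables and settings are not described here. The `hoodie.*` keys are standard Hudi datasource options.

```python
def build_hudi_upsert_options(table_name, record_key, precombine_field, partition_field):
    """Return Hudi datasource options for an upsert into a copy-on-write table.

    All argument values are supplied by the caller; nothing here is
    specific to the project described above.
    """
    return {
        "hoodie.table.name": table_name,
        # Column that uniquely identifies a record
        "hoodie.datasource.write.recordkey.field": record_key,
        # Column used to pick the latest version when keys collide
        "hoodie.datasource.write.precombine.field": precombine_field,
        # Column used to lay out partitions under the table path
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
    }


# Hypothetical table and column names, for illustration only
options = build_hudi_upsert_options("orders", "order_id", "updated_at", "order_date")

# In a Glue or EMR PySpark job the options would typically be applied like this
# (df and the S3 path are placeholders, not values from the project):
# df.write.format("hudi").options(**options).mode("append").save("s3://<bucket>/<prefix>/orders/")
```

Keeping the options in a plain dictionary makes them easy to share between jobs and to check in a unit test without starting a Spark session.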
