Project objective

The objective of this project is to build a data pipeline on AWS that processes compressed files. The pipeline runs a series of ETL (Extract, Transform, Load) steps: it extracts the raw data, transforms it into tables, and loads those tables into a data warehouse for analysis and querying.
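As an illustration of the extract step, the sketch below unpacks a compressed file in memory using only the Python standard library. It assumes the files are ZIP archives, which the project description does not specify; the function name `extract_zip_members` is hypothetical and not taken from the project's code.

```python
import io
import zipfile
from typing import Dict


def extract_zip_members(archive_bytes: bytes) -> Dict[str, bytes]:
    """Return {member name: content} for every regular file in a ZIP archive.

    Assumption: the input is a ZIP archive held fully in memory.
    """
    members: Dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, keep only files
            members[info.filename] = archive.read(info)
    return members
```

In a pipeline like this one, the input bytes would typically come from an object in the raw-data bucket, and each extracted member would be written back to storage for the transform step.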

The project covered the following tasks:

  • Setting up Amazon S3 buckets to store the data.
  • Implementing a Lambda function for file processing, and creating IAM roles and access policies for the S3 buckets.
  • Using AWS Glue (crawler, database) for the ETL processes.
  • Orchestrating services through AWS Step Functions.
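The orchestration described above can be sketched as a two-step Step Functions state machine: invoke the file-processing Lambda function, then start the Glue crawler. This is a minimal sketch, not the project's actual definition; the state names, the builder function, and the placeholder ARN and crawler name are all assumptions made for illustration.

```python
import json


def build_state_machine_definition(lambda_arn: str, crawler_name: str) -> str:
    """Build an Amazon States Language definition as a JSON string.

    Hypothetical flow: run the Lambda function, then start the Glue crawler.
    """
    definition = {
        "Comment": "Hypothetical ETL orchestration sketch",
        "StartAt": "ProcessCompressedFile",
        "States": {
            "ProcessCompressedFile": {
                "Type": "Task",
                # Optimized Lambda integration: invokes the function and waits for its result.
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": lambda_arn,
                    "Payload.$": "$",  # pass the execution input through to the function
                },
                "Next": "StartGlueCrawler",
            },
            "StartGlueCrawler": {
                "Type": "Task",
                # AWS SDK integration: calls the Glue StartCrawler API.
                "Resource": "arn:aws:states:::aws-sdk:glue:startCrawler",
                "Parameters": {"Name": crawler_name},
                "End": True,
            },
        },
    }
    return json.dumps(definition, indent=2)


if __name__ == "__main__":
    print(build_state_machine_definition(
        "arn:aws:lambda:us-east-1:123456789012:function:process-file",  # placeholder ARN
        "raw-data-crawler",  # placeholder crawler name
    ))
```

Note that the StartCrawler call returns as soon as the crawler has been started, so a fuller definition would probably poll the crawler state before running any downstream step.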

Project information

  • Category: Big Data & Cloud computing
  • Project date: 11/2021
  • Project presentation: PPTX Presentation
