Project objective
The objective of this project is to build a data pipeline on AWS for processing compressed files. The pipeline runs a series of ETL (Extract, Transform, Load) steps: it extracts raw data from the compressed files, transforms it into tables, and loads it into a data warehouse for analysis and querying.
The project encompassed a range of tasks, including:
- Setting up storage (Buckets) in AWS S3 to store data.
- Implementing an AWS Lambda function for file processing, together with the IAM roles and access policies it needs to read from and write to the S3 buckets.
- Leveraging AWS Glue for the ETL steps (crawler and Data Catalog database).
- Orchestrating services through AWS Step Functions.
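The core of the Lambda processing step is decompressing an object fetched from S3 and turning it into tabular rows. As a minimal sketch of that transform (stdlib only; the function name, the gzip'd-CSV format, and the column names are assumptions for illustration, and in the deployed function boto3 would supply the bytes from the triggering S3 event):

```python
import csv
import gzip
import io

def extract_records(payload: bytes) -> list[dict]:
    """Decompress a gzip'd CSV object and return its rows as dicts.

    In the real pipeline, `payload` would be the body of the S3 object
    downloaded by the Lambda function; here it is passed in directly.
    """
    with gzip.open(io.BytesIO(payload), mode="rt", newline="") as fh:
        return list(csv.DictReader(fh))

# Demo with an in-memory object standing in for an S3 download:
raw = gzip.compress(b"id,amount\n1,10\n2,25\n")
rows = extract_records(raw)
print(rows)  # [{'id': '1', 'amount': '10'}, {'id': '2', 'amount': '25'}]
```

Keeping the transform as a pure bytes-in/rows-out function makes it easy to unit-test outside AWS before wiring it to the S3 trigger and Step Functions workflow.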
Project information
- Category: Big Data & Cloud computing
- Project date: 11/2021
- Project presentation: PPTX presentation