Data Engineer
British Petroleum
Jun 2021 - Feb 2022 (9 months)
• Developed a serverless data application from scratch that ingests cross-country oil financial data from multiple regions, applies region-specific business logic transformations, and loads the results to AWS S3 in Parquet format.
• Extracted data from multiple source systems and created tables and schemas in the Glue Data Catalog using Glue Crawlers.
• Automated the workflow using AWS Step Functions, CloudFormation, Lambda, and CI/CD with Azure DevOps, reducing manual effort by at least 50%.
• Built PySpark scripts and transformations using DataFrames and Spark SQL for data aggregation, querying, and writing results back to S3.
• Utilized AWS serverless services such as Step Functions, Lambda, Glue, Redshift Spectrum, Athena, and CloudWatch.
• Clos