Venkata Gupta Penugonda

Jobs
  • Bank of America
    Senior Data Engineer
    Oct 2023 - Current (2 years 9 months)
    • Enterprise Cloud Modernization for Banking Data Lake – Migrated legacy batch workflows to a cloud-native data lakehouse architecture on Azure, centralizing enterprise financial and transactional data.
    • Led the end-to-end development of scalable Azure Data Factory (ADF) pipelines, orchestrating ingestion of structured and semi-structured data from over 15 banking systems into Azure Data Lake Storage (ADLS) and Azure SQL Database.
    • Designed and implemented a modular Delta Lake architecture using Azure Databricks, enabling ACID transactions, schema enforcement, and optimized time-travel query support for historical financial data.
    • Improved batch ETL performance by 45% by refactoring legacy SSIS packages into ADF pipelines with dynami
  • Root Insurance
    AWS Data Engineer
    Nov 2022 - Sep 2023 (11 months)
    • Real-Time Telematics and Claims Data Platform – Engineered an end-to-end AWS-based solution to process vehicle telemetry and claims data for fraud detection, underwriting, and personalized policy pricing.
    • Designed and implemented scalable ETL pipelines using AWS Glue, ingesting and transforming JSON, CSV, and Parquet data from S3, API Gateway, and third-party telematics platforms.
    • Built near-real-time ingestion and processing flows using Kinesis Data Streams, Lambda, and AWS Glue Streaming, reducing claim fraud detection latency from hours to under 10 minutes.
    • Optimized Athena and Redshift Spectrum queries for analytical workloads, improving performance by 35% for data scientists and actuaries consuming vehicle and behavioral risk mod
  • Novo Nordisk
    Data Engineer
    Jun 2021 - Jul 2022 (1 year 2 months)
    • Clinical Trial Analytics and Pharma Data Integration – Developed a hybrid data engineering solution supporting global trial data processing, patient adherence tracking, and regulatory submissions.
    • Built automated batch pipelines using Apache Airflow and Python, integrating EHR and clinical trial data from flat files, APIs, and relational databases into a centralized data warehouse.
    • Leveraged Google Cloud Storage (GCS), Cloud Composer, and BigQuery for secure storage and querying of anonymized patient datasets to support pharmacovigilance analytics.
    • Developed ETL jobs using Informatica and custom PySpark scripts for transforming trial protocol, adverse event, and medication data into standardized formats (CDISC, SDTM).
    • Enabled
  • Bayer
    Junior Data Engineer
    Mar 2020 - May 2021 (1 year 3 months)
    • Crop Analytics & Pharmaceutical Supply Chain Data Platform – Supported the development and maintenance of batch ETL processes to unify data from research labs, field trials, and production systems.
    • Assisted in building ETL pipelines using Talend and Python, enabling the extraction and integration of crop genetics, chemical trial data, and logistics datasets from multiple silos into enterprise warehouses.
    • Supported ingestion of clinical product data from SAP and LIMS systems into SQL Server and PostgreSQL databases using data quality and mapping rules for consistency.
    • Developed and maintained scheduled jobs via Apache NiFi and Talend JobServer to automate file transfers and validation processes across research locations.
    • Creat
Education
  • University of South Florida
    Masters in