Nethaji Eddula

Westborough, Massachusetts, United States

Jobs
  • Vertical Relevance
    AWS Cloud Consultant
    Vertical Relevance
    Oct 2025 - Current (10 months)
    • Developing infrastructure for a company planning to migrate its data to AWS.
    • Integrated data pipelines with SageMaker, Bedrock, or Vertex AI, enabling ML training, inference, and LLMOps workflows.
    • Designed secure database migration architectures using VPC peering, Transit Gateway, VPN/Direct Connect, Security Groups, KMS encryption, and IAM least privilege.
    • Used Terraform modules, variables, and reusable code to create S3, EC2, and RDS resources.
    • Developed AWS Console using Terraform.
    • Designed and optimized complex SQL queries across Snowflake and Oracle, supporting high-volume analytical and operational workloads.
    • Led end-to-end database migrations from on-prem SQL Server / Oracle / MySQL to AWS RDS and Aurora, using AWS DMS for full-
  • The Home Depot
    GCP Data Engineer
    The Home Depot
    Jan 2023 - Aug 2025 (2 years 8 months)
    • Designed and implemented hybrid pipelines integrating on-prem Kafka with GCP Pub/Sub + Dataflow, reducing ingestion latency by 40% and enabling real-time analytics.
    • Designed Airflow plugins and hooks for custom connectors to systems like Snowflake, Informatica IICS, and REST-based APIs.
    • Collaborated with cloud teams to automate Kubernetes infrastructure provisioning with Terraform, eksctl, and AWS IAM roles, ensuring repeatable cluster setup.
    • Developed and managed scalable ELT pipelines using Matillion to ingest data from multiple source systems into Snowflake.
    • Optimized windowing, triggers, and side inputs in Beam pipelines to support low-latency analytics.
    • Migrated legacy batch ETL pipelines into Kafka event-driven pipelines,
  • Vanguard
    Data Engineer specialist
    Vanguard
    Jun 2022 - Dec 2022 (7 months)
    • Worked with Hadoop ecosystem tools (Hive, HBase, Oozie, Pig, Zookeeper, Spark Streaming, MapR Control System) on MapR distribution to process large-scale datasets.
    • Designed and deployed serverless ETL jobs using AWS Glue + Spark, improving automation and scalability for batch and streaming pipelines.
    • Designed ETL workflow execution on Kubernetes with init containers, sidecars, and shared volumes, improving pipeline reliability and fault tolerance.
    • Developed scalable ETL pipelines using AWS Glue, EMR, PySpark, and Spark SQL, processing TB-scale datasets across structured and unstructured sources.
    • Deployed multi-node Spark jobs on EMR with Python PySpark, scaling to process billions of records efficiently.
    • Processed streaming IoT
  • Otsuka Healthcare
    Data Engineer II
    Otsuka Healthcare
    Mar 2021 - May 2022 (1 year 3 months)
    • Built and deployed modular pipelines with Apache Airflow DAGs, AWS Glue jobs, and Glue Crawlers through CI/CD, reducing manual deployments by 40%.
    • Extracted and transformed data from MongoDB, DynamoDB, Snowflake, Redshift, Postgres, and RDS, consolidating multi-source data into governed warehouses.
    • Designed and managed governed data lakes using AWS S3, Glue, Athena, and Lake Formation, enabling metadata-driven lineage and fine-grained access.
    • Orchestrated ETL workflows with Apache Airflow + Control-M, improving monitoring and reducing job failures.
    • Deployed containerized workloads on AWS ECS, ECR, and Fargate, enabling automated scaling and reducing infrastructure overhead.
    • Designed and deployed serverless ETL pipelines using A
  • TekInvaderZ LLC(Centene Corporation)
    Data Engineer
    TekInvaderZ LLC(Centene Corporation)
    May 2019 - Dec 2020 (1 year 8 months)
    • Developed data pipelines using Google Cloud Platform (GCP) services such as IoT Registry, Pub/Sub, Dataflow, BigQuery, DataPrep, Data Studio, and AI Platform.
    • Led projects enforcing GCP best practices including encryption in transit and at rest, fine-grained IAM, and audit log monitoring.
    • Monitored DAG performance using Airflow's metadata database and Grafana dashboards, tuning task parallelism and resource utilization.
    • Mapped security controls for new GCP deployments against CSA Cloud Controls Matrix (CCM) and NIST 800-53 frameworks.
    • Implemented Airflow REST API integrations for external trigger-based executions and metadata tracking.
    • Ensured security design and controls aligned with frameworks such as NIST, CSA CCM, ISO 2700
Education
  • Google Cloud
    Google Cloud Architect
    Google Cloud
  • Snowflake
    Snowflake Certified Data Engineer/Architect/Practitioner
    Snowflake
    validating expertise in cloud-based data warehousing and modern data architectures.
  • Panimalar Engineering College
    Bachelor of Technology (B.Tech) in Computer Science & Engineering
    Panimalar Engineering College
    Chennai, India
  • Rivier University
    Master of Science (M.S.) in Data Engineering / Computer Science
    Rivier University
    Nashua, NH
  • Google Cloud
    Google Cloud Professional Data Engineer
    Google Cloud
Projects (professional or personal)
  • Vanguard
    Built hybrid data lakehouse solutions by integrating Iceberg tables with AWS Glue Catalog, S3, and EMR
    Vanguard
    enabling schema evolution, ACID compliance, and cross-engine query support for analytics and ML workloads.
  • Otsuka Healthcare
    Implemented Python-based CI/CD scripts
    Otsuka Healthcare
    for deployment of Lambda functions and Glue jobs using AWS CodeCommit, CodeBuild, and CodePipeline
  • TekInvaderZ LLC(Centene Corporation)
    Designed and implemented scalable ETL/ELT pipelines on AWS
    TekInvaderZ LLC(Centene Corporation)
    using Glue, Lambda, and Step Functions, processing structured and unstructured data from multiple sources (S3, RDS, DynamoDB).
  • TekInvaderZ LLC(Centene Corporation)
    Developed scalable and fault-tolerant data processing pipelines
    TekInvaderZ LLC(Centene Corporation)
    using Apache Beam (Java SDK) to handle both batch and real-time workloads on Google Cloud Dataflow, ensuring sub-second latency for critical event streams.
  • Vanguard
    Built and optimized Snowflake pipelines supporting ETL/ELT workflows
    Vanguard
    applying clustering keys, partitioning, and denormalization to cut query costs by 20%.
  • Otsuka Healthcare
    Orchestrated end-to-end ETL pipelines
    Otsuka Healthcare
    using Airflow DAGs in Python, managing dependencies, retries, and SLA monitoring.
  • TekInvaderZ LLC(Centene Corporation)
    Built error-handling and logging frameworks in Informatica
    TekInvaderZ LLC(Centene Corporation)
    to standardize exception management and improve operational monitoring.
  • TekInvaderZ LLC(Centene Corporation)
    Created and deployed applications on GCP
    TekInvaderZ LLC(Centene Corporation)
    using Dataproc, Dataflow, Composer, BigQuery, Bigtable, Cloud Storage (GCS), and various DAG operators.
  • Vanguard
    Built multi-namespace Kubernetes architecture
    Vanguard
    to separate dev, QA, and prod ETL environments, ensuring compliance and isolation across pipelines.
  • TekInvaderZ LLC(Centene Corporation)
    Developed pipelines in Python, SQL, and PySpark
    TekInvaderZ LLC(Centene Corporation)
    automating validations and transformations for analytics readiness.
  • Otsuka Healthcare
    Implemented partitioning, clustering, and materialized views in RDBMS systems
    Otsuka Healthcare
    to optimize large-scale analytics queries.
  • TekInvaderZ LLC(Centene Corporation)
    Architected secure GCP environments
    TekInvaderZ LLC(Centene Corporation)
    by aligning service deployments with industry standards like NIST, CIS Controls, and CSA Cloud Controls Matrix (CCM).
  • TekInvaderZ LLC(Centene Corporation)
    Managed multi-region Kubernetes clusters (EKS/GKE/AKS)
    TekInvaderZ LLC(Centene Corporation)
    with federation to ensure high availability and disaster recovery.
  • Otsuka Healthcare
    Migrated legacy ETL workloads from Informatica to PySpark on AWS EMR
    Otsuka Healthcare
    cutting costs and improving scalability.
  • Otsuka Healthcare
    Enabled hybrid data lakehouse architecture
    Otsuka Healthcare
    combining Iceberg + Glue + EMR/Snowflake, improving governance and flexibility in handling batch + streaming workloads.
  • Vanguard
    Enabled real-time customer journey analytics by integrating CDP data with MarTech tools
    Vanguard
    for personalized campaigns, improving campaign conversion by 25%.
  • TekInvaderZ LLC(Centene Corporation)
    Designed and managed Iceberg table formats
    TekInvaderZ LLC(Centene Corporation)
    integrated with AWS Glue Data Catalog for schema evolution, ACID transactions, and cross-engine query support.
  • TekInvaderZ LLC(Centene Corporation)
    Designed and implemented real-time data pipelines on Google Cloud Platform (GCP)
    TekInvaderZ LLC(Centene Corporation)
    using Dataflow, Pub/Sub, and Cloud Composer, enabling seamless ingestion, processing, and orchestration of large-scale streaming and batch data.
  • Otsuka Healthcare
    Integrated Snowflake with AWS Lambda, Glue, and Step Functions
    Otsuka Healthcare
    to deliver cloud-native data pipelines.
  • TekInvaderZ LLC(Centene Corporation)
    Containerized PostgreSQL-based microservices
    TekInvaderZ LLC(Centene Corporation)
    using Docker + Kubernetes (GKE), enabling scalable and resilient deployments in cloud-native environments.
  • Otsuka Healthcare
    Built serverless ETL pipelines
    Otsuka Healthcare
    using Lambda, Step Functions, Athena, and Glue Jobs, enabling low-cost, automated data processing for daily and streaming workloads.
  • Vanguard
    Engineered large-scale ETL transformations in PySpark and Spark SQL
    Vanguard
    for high-volume datasets (billions of rows).
  • TekInvaderZ LLC(Centene Corporation)
    Defined security requirements for GCP services
    TekInvaderZ LLC(Centene Corporation)
    including data protection, identity management, and workload security, across multi-cloud environments.
  • Monitored pipelines
    using AWS CloudWatch, CloudTrail, and SNS, improving reliability and reducing MTTR.
  • Deployed Airflow DAGs
    using GitLab CI/CD pipelines with environment-based configurations (Dev, QA, Prod) and automated version control.
  • Designed partitioning and bucketing strategies in Redshift + Snowflake
    to improve query performance by 25%.
  • Automated data validation, transformation, and schema enforcement
    using Python Pandas, PySpark, and SQLAlchemy.
  • Collaborated with BI teams
    to build pipelines feeding Tableau and QuickSight dashboards, enabling faster insights.
  • Adopted TDD, BDD, and CI/CD
    to improve code quality and accelerate release cycles.
  • Designed real-time ingestion prototypes with Kinesis + Firehose
    laying the foundation for streaming pipelines.
  • Enforced data governance and compliance (GDPR/CCPA)
    in CDP workflows using Dataplex IAM, DLP, and policy tags, ensuring secure customer data usage.
  • Automated schema evolution and data validation frameworks
    for PostgreSQL sources using Python + PySpark, reducing ingestion failures by 35%.
  • Established IaC frameworks
    with Terraform + CloudFormation, provisioning repeatable environments across Dev/QA/Prod.
  • Partnered with business stakeholders
    to define data engineering standards and best practices, ensuring reusability and governance.
  • Integrated Informatica with metadata and governance tools
    (EDC, Axon, GCP Dataplex) to enhance lineage tracking, auditing, and compliance reporting.
  • Reduced cloud costs by 20%
    through optimization of ECS clusters, Glue jobs, and storage tiers in S3.
  • Integrated SAS datasets
    with Google Cloud Storage and BigQuery for scalable analytical workloads.
  • Spearheaded security hardening efforts
    across GCP services including Cloud SQL, GKE, Cloud Functions, BigQuery, and VPC Service Controls.
  • TekInvaderZ LLC(Centene Corporation)
    Implemented workload isolation in Snowflake
    TekInvaderZ LLC(Centene Corporation)
    by configuring virtual warehouses for ETL, BI, and ad-hoc analytics, reducing query contention and improving SLA adherence.
  • Improved team onboarding and delivery
    by documenting data architecture patterns, CI/CD workflows, and coding standards.
  • Designed Terraform + CloudFormation templates to provision EKS clusters
    integrated with CI/CD pipelines, enabling infrastructure-as-code.
  • Enabled customer segmentation pipelines
    in Databricks and Snowflake, pushing enriched segments back into MarTech platforms for targeted campaigns.
  • Implemented data quality checks
    using Glue + Lambda, ensuring compliance with business validation rules.
  • Vanguard
    Developed workflow recovery and restartability strategies in Informatica
    Vanguard
    ensuring data consistency and fault tolerance for long-running ETL jobs.
  • Containerized PostgreSQL-based microservices
    using Docker + Kubernetes (EKS), enabling scalable and resilient deployments in cloud-native environments.
  • Otsuka Healthcare
    Built and orchestrated cloud-native data workflows
    Otsuka Healthcare
    using Apache Hop/NiFi for batch and streaming ingestion, integrating seamlessly with AWS (S3, Redshift, Glue, Lambda).
  • Automated data quality checks
    with AWS Lambda, improving trust in analytics and reducing data errors.
  • Built real-time streaming pipelines with Kafka + Pub/Sub + AWS Kinesis
    to feed event data into Segment/Twilio CDP, enabling near real-time personalization.
  • Configured NiFi processors and templates
    for dynamic routing, schema validation, and error handling in ETL workflows.
  • Automated Kafka cluster provisioning and scaling
    using Terraform + Ansible on AWS/GCP, enabling elastic, cost-efficient operations.