Divya Gottipati

About

Detail

Richardson, Texas, United States

Contact Divya regarding: Full-time jobs

Timeline


Job
Education

Résumé


Jobs (0% verified)
  • OpenAI
    AI Data Engineer
    Jan 2024 - Current (2 years 6 months)
    • Designed and managed AI-ready, large-scale data pipelines using Apache Airflow and Apache Spark, ingesting and processing 5TB+ of structured and unstructured data daily and reducing ingestion latency by 40%.
    • Developed and deployed end-to-end CI/CD pipelines for machine learning (ML) systems using GitHub Actions, ArgoCD, and Docker, automating testing and deployment of 50+ data and ML microservices and cutting release cycles from days to hours.
    • Built Retrieval-Augmented Generation (RAG) pipelines leveraging FAISS vector search, OpenAI embeddings, and Python, processing 1M+ documents/day, reducing LLM hallucination rates by 25% and improving answer relevance scores by 18%.
    • Built feature engineering pipelines using Python, NumPy, Scikit-lear
  • JPMorgan Chase
    Data Engineer
    Jun 2020 - Jul 2022 (2 years 2 months)
    • Engineered and implemented a modern data stack using dbt, Snowflake, and modular SQL workflows, transforming raw financial data into analytics-ready models, improving data consistency by 35% and reducing ETL pipeline maintenance by 50%.
    • Architected a secure AWS-based data lake solution leveraging AWS Glue, Redshift, EMR, and Lake Formation, implementing fine-grained access controls, automated metadata cataloging, and data governance policies, reducing data onboarding time by 40%.
    • Developed real-time financial data processing pipelines using Apache Spark Structured Streaming and Apache Kafka on Hadoop YARN, achieving 60% faster end-of-day risk calculations while reducing cluster costs by 30% via dynamic resource allocation.
    • Built
  • Cognizant
    Data Engineer
    Jan 2019 - May 2020 (1 year 5 months)
    • Designed and implemented scalable ETL data integration workflows using Azure Data Factory, Azure Databricks, and SQL Server Integration Services (SSIS), reducing data pipeline latency by 30% across Azure-based data platforms.
    • Automated CI/CD pipelines for data solutions using Azure DevOps, GitHub Actions, and Terraform (Infrastructure-as-Code), enabling zero-downtime deployments for 20+ ETL jobs and cutting production incidents by 40%.
    • Optimized complex SQL queries across Oracle and PostgreSQL databases, leveraging advanced indexing, query plan tuning, and partitioning strategies, reducing report generation time by 55% for enterprise financial applications serving 1,000+ daily users.
    • Designed and developed Power BI dashboards us
Education (0% verified)
  • The University of Texas
    Master's in Information Technology
    Aug 2022 - Aug 2024 (2 years 1 month)
  • Microsoft Certified: Power BI Data Analyst
  • AWS Certified: Solutions Architect
This is a community-created genome.