
Irfan Shah

United States

Jobs
  • Rockset
    Senior Data Architect
    Oct 2020 - Current (5 years 9 months)
    • Designed and implemented a scalable enterprise data warehouse using Snowflake and Databricks, enabling seamless analytics across multi-terabyte datasets.
    • Led the migration of on-prem databases to AWS, leveraging Glue, Redshift, and Kinesis, reducing infrastructure costs by 30% and improving data accessibility.
    • Architected real-time and batch data pipelines using Apache Airflow, dbt, and Kafka, ensuring efficient data processing for business-critical applications.
    • Established data governance frameworks, implementing RBAC, encryption, and data lineage tracking to ensure compliance with GDPR and SOC 2.
    • Mentored a team of data engineers, fostering best practices in data modeling, ETL optimization, and cloud data management.
  • RudderStack
    Lead Data Engineer
    Mar 2019 - Sep 2020 (1 year 7 months)
    • Developed high-performance ETL pipelines using Apache Spark and Hadoop, processing billions of records daily for advanced analytics and reporting.
    • Spearheaded the adoption of CI/CD for data pipelines with GitHub Actions and Terraform, accelerating deployment cycles by 40%.
    • Designed a hybrid data architecture integrating Azure Synapse, PostgreSQL, and Cassandra, supporting both OLAP and OLTP workloads.
    • Optimized SQL queries and indexing strategies, improving query performance by 60% and reducing data processing time from hours to minutes.
    • Led cross-functional collaborations with ML Engineers and Business Analysts to deliver data-driven insights and predictive analytics solutions.
    • Additionally, utilized MagicDraw for designi
  • DataForge Labs
    Big Data Engineer
    Jun 2016 - Feb 2019 (2 years 9 months)
    • Engineered real-time data streaming solutions using Apache Kafka and Flink, enabling near-instantaneous data processing for fraud detection.
    • Implemented Change Data Capture (CDC) techniques with Debezium and Kinesis, ensuring seamless data synchronization across distributed systems.
    • Designed and deployed containerized data workflows using Docker and Kubernetes, improving scalability and resource utilization.
    • Automated data security and compliance workflows, including encryption, masking, and IAM policies, ensuring adherence to HIPAA and CCPA regulations.
    • Conducted strategic tool evaluations and vendor selections, optimizing technology investments and enhancing overall data infrastructure resilience.
    • Utilized SQL for wr
Education
  • COMSATS University
    Bachelor's Degree in
Projects (professional or personal)
  • RudderStack
    Data Governance / Compliance Projects
    Mar 2019 - Aug 2020 (1 year 6 months)