Shane K

About

Detail

New York, United States

Timeline


Job

Résumé


Jobs
  • Pythian
    Lead Data Engineer
    Pythian
    Jan 2022 - Current (4 years 7 months)
    Led design and implementation of cloud-native ETL/ELT pipelines (Airflow, dbt, Fivetran, PySpark) processing 5TB+ daily across AWS, Azure, and GCP multi-cloud environments. Architected real-time streaming solutions with Kafka, Flink, and Delta Lake, enabling AI analytics and predictive dashboards with <2s latency. Directed AI/ML data pipelines with RAG workflows, vector databases (Pinecone, FAISS), and MLOps (MLflow) for continuous model updates and inference. Oversaw multi-cloud infrastructure deployment using Terraform, Kubernetes (EKS/GKE), and GitHub Actions, ensuring CI/CD automation, IaC governance, and cost optimization. Championed observability and monitoring using Prometheus, Grafana, Monte Carlo, and OpenLineage, reducing downtime
  • Belcan
    Senior Data & Cloud Engineer
    Belcan
    Sep 2019 - Dec 2021 (2 years 4 months)
    Built multi-cloud ETL/ELT pipelines (Airflow, dbt, Fivetran, PySpark) integrating patient and operational data across AWS, Azure, and GCP for AI-powered healthcare analytics. Developed real-time streaming & CDC pipelines with Kafka, Flink, and Debezium, ensuring timely delivery of critical patient alerts and clinical insights. Designed data models & warehouse architectures (Star, Snowflake, Data Vault) aligned with healthcare standards (FHIR, HL7), enabling compliant analytics and reporting. Implemented data quality & validation frameworks (dbt tests, Great Expectations, Python scripts) to ensure HIPAA compliance, data integrity, and operational reliability. Collaborated with ML teams to productionize predictive healthcare models, leveragin
  • TRANSCURE
    Data Engineer
    TRANSCURE
    Jul 2016 - Aug 2019 (3 years 2 months)
    Built and maintained high-volume ETL pipelines using Airflow, Spark, Python, and SQL, processing financial market and trading datasets with sub-second latency. Automated infrastructure provisioning and deployment with Terraform, Ansible, PowerShell, Docker, and Kubernetes (EKS/AKS), improving reliability and reducing manual overhead. Monitored and optimized pipeline performance and system health using CloudWatch, Azure Monitor, Datadog, and Prometheus, ensuring 99.9% uptime for critical trading data. Developed data validation, quality, and lineage frameworks with Great Expectations, dbt tests, and Python scripts, ensuring accurate financial reporting and compliance with SOC2 and PCI DSS standards. Built REST and GraphQL APIs exposing tradin
Education
  • N
    B.A. in Information Systems
    New Jersey Institute of Technology, Newark, NJ