Mark Bergeron

Monson, Maine, United States

Contact Mark regarding: full-time jobs


Résumé


Jobs
  • Cohere
    Senior Infrastructure Engineer - AI/ML
    May 2021 - Current (5 years 1 month)
    • Built GPU inference clusters using Terraform, Amazon EKS, and Amazon Linux 2023 AMIs with preinstalled NVIDIA drivers on g5 GPU instances, enabling deployment of 14 production models with zero manual provisioning.
    • Developed 18 CUDA/TensorRT-based Docker images via CI pipelines, reducing container build time by 40% and supporting 50+ model rollouts through Amazon ECR and EKS.
    • Packaged model runtime as Helm charts for private Kubernetes deployments, enabling 17 enterprise customers to self-host with documented onboarding workflows.
    • Implemented custom Kubernetes autoscaler integrated with AWS Auto Scaling Groups, reducing peak inference latency from 320ms to 256ms (20%) during 3x traffic spikes.
    • Deployed observability stack wit
  • Capgemini
    Senior DevOps Engineer
    Aug 2018 - Apr 2021 (2 years 9 months)
    • Provisioned 12 EKS clusters using Terraform with isolated dev/test/prod environments and secure IAM roles, supporting 80+ microservices for HMRC.
    • Engineered 24 GitLab CI pipelines integrating Amazon ECR and EKS, reducing ML deployment time from 45 minutes to under 10 minutes.
    • Delivered 15 reusable Terraform and Ansible modules standardized across 4 client projects, minimizing configuration drift and enabling reproducibility.
    • Facilitated 6 Terraform/Kubernetes workshops and pair-programmed with 30+ consultants, decreasing support requests by 70%.
    • Led FinOps initiative by designing QuickSight dashboards and tagging audits, identifying $220,000/year in unused EC2/GPU resources.
  • NVIDIA
    AI Infrastructure Engineer
    Sep 2015 - Jul 2018 (2 years 11 months)
    • Automated GPU container delivery with Jenkins and GitLab CI runners, cutting image release time from 2 hours to 30 minutes and enabling nightly validation.
    • Provisioned 50+ EC2 GPU instances using Terraform and Ansible to support isolated benchmarking environments for research teams.
    • Operated Kubernetes GPU clusters using NVIDIA device plugin, raising GPU utilization from 55% to 87% across shared infrastructure.
    • Maintained Helm charts and custom controllers for Triton Inference Server adopted by 8 internal teams, reducing deployment time by 80%.
    • Contributed to NVIDIA GPU Operator and showcased internal tooling at GTC, influencing adoption in 3 product divisions.
Education
  • University of New Haven
    Bachelor of Science
    Jan 2011 - Jan 2015 (4 years 1 month)
This is a community-created genome.