Site Reliability Engineer
DISQO
Apr 2021 - Nov 2022 (1 year 8 months)
• Maintaining infrastructure as code and setting up a Kubernetes cluster and services on AWS.
• Designing and implementing CI/CD pipelines for a range of microservices, using GitLab CI and ArgoCD to ship at speed.
• Driving reliability efforts, including adding monitoring and alerting and improving deployment practices and resiliency. This effort significantly increased the percentage of critical infrastructure teams that had effective monitoring and alerting.
• Led post-mortem incident reviews to identify root causes and ensure remediation, and created a process for writing up incident learnings so they could be communicated broadly throughout DISQO.
• Improved our "Security posture" effort from scratch to 75% improvement an