Site Reliability Engineer (SRE) at TEKsystems Global Services (TGS)

Site Reliability Engineer (SRE)

You will engineer cloud-native reliability and scalability for a large-scale Azure Kubernetes transformation.
Full-time

Legal agreement: Employment

Compensation is to be agreed upon.
Remote (for Japan residents)
Shared by
Emma of Torre.ai
2 days ago

Requirements and responsibilities


We are hiring an experienced bilingual SRE to support a large-scale cloud transformation program. You will play a key role in ensuring reliability, scalability, and production stability for a modern cloud-native platform.

Key responsibilities

- Design and operate Azure + Kubernetes (AKS) platforms
- Build scalable, highly available systems and define SLOs/SLIs
- Lead monitoring and observability (Prometheus, Grafana, Datadog, ELK)
- Drive incident management, RCA, and continuous improvement
- Automate infrastructure using Terraform, CI/CD, and GitOps

Requirements

- Strong SRE / DevOps experience
- Hands-on Azure and Kubernetes (AKS) expertise
- Experience in large-scale production environments
- Solid monitoring, networking, and incident management skills