System Reliability Engineer (SRE) at MPS Group | Torre

System Reliability Engineer (SRE)

You'll build AWS infrastructure with IaC and monitoring to ensure reliability, scalability, and security.
Full-time

Legal agreement: Depends on the location of the candidate

Currency exchange and taxes to be paid by: Depends on the location of the candidate

Provide your expected compensation while applying
Hybrid (Atlanta International Airport (ATL), North Terminal Parkway, Atlanta, Georgia, USA)
Posted 9 months ago

Requirements and responsibilities


Responsibilities:
1. Design, build, and maintain cloud-native infrastructure in AWS, ensuring high availability, scalability, and security.
2. Develop and manage infrastructure as code using Terraform, AWS CDK, Ansible, or related tools.
3. Implement comprehensive monitoring, logging, and alerting with Dynatrace, Prometheus, Grafana, CloudWatch, and the ELK Stack.
4. Support real-time data streaming and resilience using tools like Kinesis, Kafka, or similar technologies.
5. Collaborate with development teams to define and enforce SLAs, SLOs, and error budgets.
6. Automate CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, and AWS CodePipeline.
7. Investigate and resolve production incidents with a focus on root cause analysis and performance optimization.
8. Maintain configuration management and containerized environments using Docker, Kubernetes (EKS), and related platforms.
9. Participate in architecture planning, cost optimization, and continuous improvement initiatives.

Minimum qualifications:
1. 5 to 6+ years of experience in SRE, DevOps, or Cloud Infrastructure roles.
2. Hands-on expertise in AWS services including EC2, ECS/EKS, Lambda, VPC, Route53, IAM, and CloudFormation.
3. Proficiency in scripting languages such as Bash or Python.
4. Practical experience with Infrastructure as Code (IaC) tools: Terraform, AWS CDK, Ansible.
5. Strong background in monitoring/observability tools such as Dynatrace, CloudWatch, Prometheus, or Grafana.
6. Experience with containerization and orchestration using Docker and Kubernetes.
7. Familiarity with Git and CI/CD tools and best practices.
8. Excellent problem-solving and incident response skills.

Desirable skills:
1. Experience in multi-cloud environments (AWS, GCP, Azure).
2. Exposure to data-intensive, high-throughput systems or streaming pipelines.
3. Advanced knowledge of Dynatrace, including dashboards, anomaly detection, and AIOps.
4. Familiarity with database systems like PostgreSQL, Cassandra, or DynamoDB.
5. AWS certifications (e.g., Solutions Architect, DevOps Engineer).
6. Experience mentoring junior engineers and working in Agile/Scrum teams.
7. Understanding of security best practices and cloud networking fundamentals.
8. Contributions to open-source infrastructure or DevOps tools.