Job Summary:
- We are seeking a highly skilled and versatile professional to join our team as a Cloud Operations Engineer.
- This role will be responsible for managing, monitoring, securing, and optimizing our cloud infrastructure and operational environment.
- The ideal candidate will have a strong background in cloud operations, systems administration, infrastructure automation, security, monitoring, and observability.
- The Cloud Operations Engineer will play a critical role in ensuring the reliability, performance, scalability, security, and operational health of FreightPOP's cloud-based platform.
- This includes ownership of monitoring and observability platforms, proactive incident detection and response, infrastructure lifecycle management, vulnerability remediation, and maintaining stable production and QA environments that support business continuity and growth.
Key Responsibilities:
- Manage and optimize cloud infrastructure primarily within AWS.
- Design, implement, and maintain scalable, secure, and highly available cloud solutions.
- Provision, configure, and maintain cloud servers, services, container platforms, and network infrastructure.
- Ensure the security, availability, and integrity of company systems, applications, and data.
- Apply updates and security patches, and carry out vulnerability remediation across cloud environments.
- Implement and maintain monitoring, logging, alerting, and observability solutions to ensure platform health and operational visibility.
- Monitor cloud resource utilization, perform capacity planning, and optimize infrastructure costs.
- Troubleshoot infrastructure, networking, performance, and availability issues across cloud environments, and manage incidents.
- Manage identity, access control, and security policies across cloud platforms.
- Implement and maintain infrastructure automation and Infrastructure as Code practices (e.g., Terraform).
- Collaborate with Engineering, Product, and Support teams to improve platform reliability, performance, and operational efficiency.
- Maintain operational documentation and support incident response and problem resolution activities.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
- 3+ years of experience in Cloud Operations, Systems Administration, Infrastructure Engineering, DevOps, or a related role.
- 1+ years of experience managing cloud infrastructure in AWS.
- 1+ years of experience managing Windows Servers.
- Experience with infrastructure automation and Infrastructure as Code tools such as Terraform.
- Experience with container technologies such as Docker and Kubernetes.
- Strong troubleshooting, problem-solving, and communication skills.
- Ability to manage multiple priorities in a fast-paced SaaS environment.