Experienced DevOps Engineer (8+ years) at Lead Love | Torre

You'll engineer robust, scalable platforms, ensuring seamless delivery and predictable growth for complex integrations.
Freelance
Recurrent
Compensation
USD90 - 115/hour
Negotiable
Remote (for United States residents)
Visa sponsorship: No
Posted 4 months ago

Requirements and responsibilities


Contract (long-term), US (PST, MST). Flexible rate depending on experience.

Role Overview

Own the reliability and release operations for our integration work. You’ll give developers a smooth path from code to production, keep environments healthy and secure, and make system health visible so issues are found and fixed fast. Over time, you’ll tune cost/performance, harden security, and evolve our standards so we can ship integrations predictably as we scale.

What you’ll do

* Own the platform lifecycle: maintain and improve our cloud setup (DigitalOcean preferred), databases (Postgres), caches/queues (Redis), and the way environments are created (Terraform/IaC).
* Operate releases: keep CI/CD fast and safe (GitHub Actions), enforce health checks and rollbacks, and make deploys predictable across multiple integration workstreams.
* Make reliability visible: centralize logs/metrics/traces, keep practical alerts in place, and publish clear runbooks so first responders know what to do.
* Strengthen security & compliance basics: secrets handling, least-privilege access, image scanning, patches, and simple evidence for audits when needed.
* Manage capacity, cost, and performance: right-size resources, set autoscaling policies, and keep cloud spend within plan.
* Enable the team: answer “how do we…?” questions, write concise docs, and collaborate closely with the Fractional CTO to unblock delivery.

Success Metrics

* Deployment success rate: % of deploys that complete without rollback. Target: ≥95%.
* Time to restore: median time to recover from a production incident. Target: ≤30 minutes.
* Operational visibility: core alerts verified monthly; runbooks exercised in a safe test. Target: 100% pass.
* Cost & capacity: stay within agreed monthly cloud budget while meeting performance targets.

Experience Required

* DevOps/SRE: 8+ years running cloud-hosted applications end-to-end.
* IaC & containers: Terraform, Docker; reproducible environments and change control.
* CI/CD: GitHub Actions (or similar) with build/test/scan/sign, blue/green or rolling deploys, and proven rollback.
* Data & queues: operating managed Postgres and Redis at production scale.
* Observability & ops: logging/metrics/alerts (OpenTelemetry or equivalent), incident triage, basic on-call hygiene.
* Security controls: secrets management, certs, firewalls, IAM; incident response.
* Multi-tenant integration patterns (per-tenant config, fairness/rate-limit).
* Apigee knowledge; OpenTelemetry; experience with multi-tenant SaaS and token-bucket rate limiting.

Mindset

Pragmatic and service-oriented • automates toil • documents as they go • calm in ambiguity • explains choices in plain English • raises the bar without heavy process.

If you’re interested in applying, please fill out this form: https://1uqiq.share.hsforms.com/2YaQDDyr4QZKaMfyJ44QxKw