Sr./Staff - Infrastructure/Site Reliability Engineer (SRE) at Oscilar | Torre

Sr./Staff - Infrastructure/Site Reliability Engineer (SRE)

You'll architect resilient cloud infrastructure, shaping the future of trust in AI-powered risk decisioning.
Full-time

Legal agreement: Employment

Provide your expected compensation while applying
Remote (for Canada residents)
Shared by
Emma of Torre.ai
3 days ago

Requirements and responsibilities


Shape the future of trust in the age of AI

At Oscilar, we're building the most advanced AI Risk Decisioning™ Platform. Banks, fintechs, and digitally native organizations rely on us to manage their fraud, credit, and compliance risk with the power of AI. If you're passionate about solving complex problems and making the internet safer for everyone, this is your place.

Why join us:
- Mission-driven teams: Work alongside industry veterans from Meta, Uber, Citi, and Confluent, all united by a shared goal to make the digital world safer.
- Ownership and impact: We believe in extreme ownership. You'll be empowered to take responsibility, move fast, and make decisions that drive our mission forward.
- Innovate at the cutting edge: Your work will shape how modern finance detects fraud and manages risk.

About the Role

Oscilar is growing fast, and so is the complexity of our systems. We're looking for an experienced SRE to take ownership of reliability across our multi-region, cloud-native platform. You'll have the mandate and autonomy to design, implement, and evolve systems that stay performant and resilient through traffic spikes, dependency failures, and global deployments.

You'll be shaping how we scale, how we build observability, and how we run infrastructure that supports billions of events and large-scale data pipelines.

What You'll Own
- Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes).
- Lead initiatives to improve availability, latency, and performance at scale.
- Design and evolve our CI/CD pipelines to optimize for speed, safety, and repeatability.
- Define the metrics, alerts, and runbooks that form our observability backbone.
- Run chaos experiments and failure simulations to harden the platform.
- Mentor engineers and set best practices for SRE across the company.

What You Bring
- Proven track record as a senior SRE or Infrastructure Engineer in high-scale environments.
- Expert-level skills in AWS and Infrastructure as Code (Pulumi, Terraform).
- Strong programming ability in Go or Python. We use Go.
- Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture.
- Mastery of container orchestration (Kubernetes) and production debugging.
- Strong sense of ownership, and the judgment to balance velocity with reliability.

Benefits
- Compensation: Competitive salary and equity packages, including a 401k plan
- Flexibility: Remote-first culture; work from anywhere
- Health: 100% employer-covered comprehensive health, dental, and vision insurance with a top-tier plan for you and your dependents (US)
- Balance: Unlimited PTO policy
- Technical: AI-first company; both co-founders are engineers at heart, and over 50% of the company is Engineering and Product
- Culture: Family-friendly environment; regular team events and offsites
- Development: Unparalleled learning and professional development opportunities
- Impact: Making the internet safer by protecting online transactions