About LiveKit

LiveKit is building the infrastructure layer for the agentic era of computing. Our platform gives developers everything they need to build, test, deploy, scale, and observe AI agents in production. Founded in 2021, LiveKit powers voice and agentic AI applications for OpenAI, Salesforce, Spotify, Meta, and tens of thousands of other developers, collectively facilitating billions of calls each year.

About This Role

We're hiring a Senior Infrastructure Engineer to join the small team that owns the cloud foundation all of LiveKit's real-time products run on: reliability, networking, and shared platform primitives like CockroachDB, NATS, and Nebula.

LiveKit is at 3B+ calls per year, and the reliability surface area is growing faster than team capacity. There are three buckets of work on this team:

(1) Product SRE: jumping directly into the product codebase and implementing reliability goals within it (load balancing, load shedding, instrumentation, scalability, efficiency).
(2) Self-service platform tooling: building frameworks and libraries so product teams can self-serve reliability without Infra as a bottleneck.
(3) Reactive work: on-call, urgent feature requests, reliability debt.

The goal is to minimize (3) through (1) and (2).

There is still meaningful reliability debt to work through when you join. We're being honest about that up front.
The upside: small team, high ownership, and real influence over how a global real-time platform scales.

You'll Thrive Here If You:

- obsess over crafting code that is fast, reliable, and practical for the problem
- are known as the go-to person for tackling tough technical problems
- work hard and can build and ship fast
- can clearly explain complex technical concepts to other engineers
- are a fast learner, frequently picking up new languages and tools

The best way to impress us is with thoughtful Issues and/or PRs on our GitHub repos.

What You'll Do

- Ramp on LiveKit's global architecture (CockroachDB, NATS, Nebula, Kubernetes) and map where reliability debt lives
- Ship product SRE work directly in the product codebase: load balancing, load shedding, instrumentation, scalability, efficiency
- Build and extend common tooling so product teams can self-serve reliability without Infra as a bottleneck
- Participate in the on-call rotation and help resolve recurring reliability patterns
- Bring informed systems opinions that improve how the team makes architectural decisions

Who You Are

- Strong Go fluency: you write production Go, not just read it
- Kubernetes at depth (internals + networking), not just ops
- Comfortable working directly in product codebases, not only around them
- Debug by looking behind the curtain, not just at dashboards
- Track record of building or operating production distributed systems at scale

Nice to Have

- Experience with CockroachDB, NATS, or similar distributed systems
- Google SRE or equivalent high-scale background
- Nebula, WireGuard, or similar overlay networking experience
- Open source contributions, especially on infra or developer tools
- Experience in real-time, audio/video, or low-latency systems

Our Commitment to You

- An opportunity to build something truly impactful for the world
- Contribute to open source alongside world-class engineers
- Competitive salary and equity package
- Health, dental, and vision benefits
- Flexible vacation policy