Principal Engineer - Platform Architecture & Core Reliability at Quizlet | Torre

Principal Engineer - Platform Architecture & Core Reliability

You'll engineer AI-powered learning tools to unlock global potential.
Full-time

Legal agreement: Employment

Compensation
USD 260k - 320k/year
San Francisco, CA, USA
Posted 6 months ago

Requirements and responsibilities


About Quizlet:

At Quizlet, our mission is to help every learner achieve their outcomes in the most effective and delightful way. Our $1B+ learning platform serves tens of millions of students every month, including two-thirds of U.S. high schoolers and half of U.S. college students, powering over 2 billion learning interactions monthly.

We blend cognitive science with machine learning to personalize and enhance the learning experience for students, professionals, and lifelong learners alike. We’re energized by the potential to power more learners through multiple approaches and various tools.

Let’s Build the Future of Learning

Join us to design and deliver AI-powered learning tools that scale across the world and unlock human potential.

About the Role:

We're hiring a Principal Engineer to lead critical architectural decisions that establish industry-leading standards for reliability and operational excellence. This is a high-leverage, hands-on role focused on optimizing performance, driving engineering velocity, and leading systemic architectural change. The role reports to the Senior Director of Technical Infrastructure.

We’re happy to share that this is an onsite position in our San Francisco office. To help foster team collaboration, we require that employees be in the office a minimum of three days per week: Monday, Wednesday, and Thursday, and as needed by your manager or the company. We believe that this working environment facilitates increased work efficiency, team partnership, and supports growth as an employee and organization.

In this role, you will:

Platform Reliability (SLO Focus): Lead the strategy and implementation necessary to achieve and maintain our 99.95% availability target. This includes evolving our multi-region deployment strategy and optimizing for resilience under sustained, high-volume traffic.

Data Backbone Scaling: Define the architectural approach for scaling our core data systems, optimizing performance across Cloud Spanner, PlanetScale MySQL, and BigQuery.

Compute & Service Mesh: Drive performance and efficiency improvements across our managed compute environment, specifically optimizing Kubernetes (GKE) clusters and managing the performance and operational complexity of Istio.

Developer Velocity & CI/CD: Architect high-leverage internal platforms, designing the pipelines across tools like GitHub Actions, CircleCI, and ArgoCD, and driving organizational influence to standardize high-velocity, safe deployment practices.

Incident & Learning Culture: Drive reliability change across the engineering organization by leveraging deep-dive analysis of incidents (Jeli) and proactive monitoring (Datadog), improving operational practices, and influencing architectural design decisions.

Performance & Cost Engineering: Act as a technical owner for the cost-per-request metric, identifying and implementing architectural efficiencies (caching, connection pooling, resource utilization) that scale down infrastructure spend while maintaining service speed.