Overview

TRM Labs provides blockchain analytics and AI solutions to help law enforcement and national security agencies, financial institutions, and cryptocurrency businesses detect, investigate, and disrupt crypto-related fraud and financial crime. TRM’s blockchain intelligence and AI platforms include solutions to trace the source and destination of funds, identify illicit activity, build cases, and construct an operating picture of threats. TRM is trusted by leading agencies and businesses worldwide who rely on TRM to enable a safer, more secure world for all.

The AI Engineering Team is chartered with enabling next-generation AI applications, with a special focus on Large Language Models (LLMs) and agentic systems. Our mission is to build robust pipelines, high-performance infrastructure, and operational tooling that allow AI systems to be deployed with speed, safety, and scale.

We manage petabyte-scale pipelines, serve models with millisecond-level latency, and provide the observability and governance needed to make AI production-ready.
We’re also deeply involved in evaluating and integrating cutting-edge tools in the LLM and agent space — including open-source stacks, vector databases, evaluation frameworks, and orchestration tools that unlock TRM’s ability to innovate faster than the market.

As a Staff MLOps Engineer with a focus on LLMOps, you’ll be at the core of building and scaling the technical infrastructure for AI/ML systems.

What You Will Do

- Build reusable CI/CD workflows for model training, evaluation, and deployment, integrating tools such as Langfuse, GitHub Actions, and experiment tracking.
- Automate model versioning, approval workflows, and compliance checks across environments.
- Build out a modular and scalable AI infrastructure stack — including vector databases, feature stores, model registries, and observability tooling.
- Partner with engineering and data science to embed AI models and agents into real-time applications and workflows.
- Continuously evaluate and integrate state-of-the-art AI tools (e.g. LangChain, LlamaIndex, vLLM, MLflow, BentoML).
- Drive AI reliability and governance, enabling experimentation while ensuring compliance, security, and uptime.
- Build and enhance AI/ML model performance.
- Ensure data accuracy, consistency, and reliability, leading to better model training and inference.
- Deploy infrastructure to support offline and online evaluation of LLMs and agents — including regression testing, cost monitoring, and human-in-the-loop workflows.
- Enable researchers to iterate quickly by providing sandboxes, dashboards, and reproducible environments.

What We’re Looking For

- Write high-quality, maintainable software — primarily in Python, but we value engineering ability over language familiarity.
- Have a strong background in scalable infrastructure, including:
  - Containerization and orchestration (e.g. Docker, Kubernetes)
  - Infrastructure-as-code and deployment (e.g. Terraform, CI/CD pipelines)
  - Monitoring and logging frameworks (e.g.
Datadog, Prometheus, OpenTelemetry)
- Understand and implement MLOps best practices, including:
  - Model versioning and rollback strategies
  - Automated evaluation and drift detection
  - Scalable model and agent serving infrastructure (e.g. vLLM, Triton, BentoML)
- Deploy and maintain LLM and agentic workflows in production, including:
  - Monitoring cost, latency, and performance
  - Capturing traces for analysis and debugging
  - Optimizing prompt/response flows with real-time data access
- Demonstrate strong ownership and pragmatism, balancing infrastructure elegance with iterative delivery and measurable impact.

Learn about TRM Speed in this position:

- Rapid Issue Resolution. TRM Engineers identify and resolve critical onsite issues in minutes to hours, not weeks. We create virtual war rooms, implement fixes, and share lessons with both customer stakeholders and internal teams within 48 hours.
- Navigating Bureaucracy. We anticipate and address procedural hurdles, build trust with key stakeholders, and find alternative pathways to approvals. This keeps projects moving even in complex environments.
- Efficient Knowledge Transfer. Engineers document and share updates in real time, ensuring the entire team, onsite and remote, has full visibility into plans, blockers, and resolutions. Knowledge sharing sessions and clear documentation reduce friction and accelerate delivery.

About TRM's Engineering Levels:

- Engineer: Responsible for helping to define project milestones and executing small decisions independently with the appropriate tradeoffs between simplicity, readability, and performance. Provides mentorship to junior engineers, and enhances operational excellence through tech debt reduction and knowledge sharing.
- Senior Engineer: Successfully designs and documents system improvements and features for an OKR/project from the ground up.
Consistently delivers efficient and reusable systems, optimizes team throughput with appropriate tradeoffs, mentors team members, and enhances cross-team collaboration through documentation and knowledge sharing.
- Staff Engineer: Drives scoping and execution of one or more OKRs/projects that impact multiple teams. Partners with stakeholders to set the team vision and technical roadmaps for one or more products. Is a role model and mentor to the entire engineering organization. Ensures system health and quality with operational reviews, testing strategies, and monitoring rigor.

Compensation

Individual pay is determined by skills, qualifications, experience, and location. The compensation details listed in this posting reflect the US base salary only.

The estimated base salary range for this role is $200,000 - $275,000.

Additionally, this role may be eligible to participate in TRM’s equity plan.

Please note – we factor in the different costs for geographies outside the United States.

Life at TRM

We are building a safer world. That promise shows up in how we work every day.

TRM moves quickly. We are a high velocity, high ownership team that expects clarity, follow-through, and impact. People who thrive here are energized by hard problems, experimentation, and continuous feedback. If something takes months elsewhere, it will ship here in days.

At TRM, you should expect:

- priorities and targets to change quickly as we experiment and iterate;
- work that often requires operating with a high degree of ambiguity;
- a high level of personal ownership and accountability;
- close collaboration across teams and functions;
- frequent, high-touch communication;
- creative problem solving and out-of-the-box thinking; and
- a pace that rewards urgency, adaptability, and outcomes.

AI Fluency at TRM

AI fluency is a baseline expectation at TRM.
We expect every team member to use AI to accelerate and reimagine their craft, not just automate surface tasks.

AI fluency means you are among the top 10 percent of operators in your function in how you apply AI to:

- Accelerate repeatable workflows
- Structure and solve problems
- Improve output quality
- Increase speed and leverage

You will be evaluated on applied AI fluency during the interview process.

Leadership Principles

- Impact-Oriented Trailblazer: We put customers first and move with speed, focus, and adaptability. We treat every plan like an experiment – test, ship, measure, and iterate quickly.
- Master Craftsperson: We care deeply about our craft. We balance speed with high standards, own outcomes end-to-end, and invest in getting better every day.
- Inspiring Colleague: We add clarity and energy, not noise. We bring humility, candor, and a one-team mindset — giving and receiving feedback to make the team stronger.

Join our Mission

We are looking for individuals who want their work to matter, who experiment with speed and rigor, and who take pride in building a safer world for billions of people. If you’re excited by TRM’s mission but don’t check every box, we encourage you to apply — we hire for slope, judgment, and the will to learn fast.

TRM is a Series C company with $220M in total funding. Headquartered in San Francisco, TRM operates as a distributed-first company with hubs in Los Angeles, San Francisco, New York, Washington D.C., London, and Singapore.