Machine Learning Operations Engineer at K1X | Torre

Machine Learning Operations Engineer

You'll build the robust ML platform powering AI-driven automation for a modern K-1 experience.
Full-time

Legal agreement: Employment

Provide your expected compensation while applying
Remote (for United States residents)
Shared by Emma of Torre.ai, about 2 months ago

Requirements and responsibilities


Description

Location: Fully Remote
Preferred Locations: Midwest-based; Indianapolis, IN or IL, Chicagoland Area preferred

Who We Are

We are K1X. Our platform powers a modern, all-digital K-1 experience by replacing legacy workflows with scalable software and AI-driven automation. As we expand our machine learning capabilities, we are investing in a robust ML platform that enables production-grade model development, deployment, and monitoring across our products.

About Your Role

We're seeking an experienced Machine Learning Operations (MLOps) Engineer to join our team and build the infrastructure that powers AI and machine learning at K1X. This is a hands-on role focused on designing scalable systems, pipelines, and tooling that enable our Machine Learning Engineers to efficiently train, deploy, and operate models in production. You'll work at the intersection of software engineering, DevOps, and machine learning, owning the reliability, scalability, and performance of our ML platform.

Your Responsibilities

- Design and build scalable ML infrastructure to support model training, evaluation, and deployment.
- Develop and maintain containerized environments using Docker and Kubernetes.
- Build and manage distributed training pipelines and orchestration workflows.
- Implement and maintain ML lifecycle tooling such as MLflow for experiment tracking and reproducibility.
- Own production inference systems, including NVIDIA Triton Inference Server.
- Design and operate low-latency, high-availability model serving architectures.
- Implement CI/CD pipelines for ML deployment, versioning, and rollback strategies.
- Build and maintain data pipelines integrated with Snowflake and related data systems.
- Implement monitoring, logging, and alerting for model performance, drift detection, and system health.
- Partner with ML Engineers to improve developer experience and accelerate delivery.
Requirements

Who You Are:

- Bachelor's or Master's degree in Computer Science, Engineering, or equivalent experience.
- 5+ years of experience in software engineering, DevOps, or MLOps roles.
- Strong proficiency in Python and experience building production-grade systems.
- Hands-on experience with Docker, Kubernetes, and distributed systems.
- Experience building and maintaining CI/CD pipelines.
- Familiarity with ML lifecycle tools such as MLflow or similar.
- Experience working with cloud-based data platforms such as Snowflake.
- Strong understanding of system design, APIs, and microservices architectures.
- Proven debugging and troubleshooting ability across distributed systems.

It's Truly a Match If You Have:

- Experience managing inference infrastructure such as NVIDIA Triton Inference Server.
- Experience building large-scale training infrastructure including GPU workloads and distributed training.
- Familiarity with feature stores, data versioning, and experiment tracking systems.
- Experience supporting NLP or document processing pipelines.
- Exposure to observability tools such as Prometheus, Grafana, or similar.
- Experience working in SaaS environments with high availability, productivity, and performance requirements.
- A strong bias toward automation, scalability, and continuous improvement.
- A collaborative mindset and ability to work cross-functionally with engineering and data teams.

Benefits

- Unlimited Vacation Policy + Sick Time + Holidays
- Paid Parental Leave
- Fully Remote Opportunity
- Healthcare Benefits and 401K
- Growing Startup Culture