Machine Learning Engineer — Distillation at Featherless AI | Torre

Machine Learning Engineer — Distillation

You'll distill foundation models, driving core quality and cost efficiency for scalable AI inference.
Full-time

Legal agreement: Employment

Provide your expected compensation while applying
Remote (anywhere)
Shared by
Emma of Torre.ai
20 days ago

Requirements and responsibilities


About the Role

We’re looking for a Machine Learning Engineer focused on model distillation to help us build smaller, faster, and more efficient models without sacrificing quality. You’ll work at the intersection of research and production, taking cutting-edge techniques and turning them into systems that scale.

This is a hands-on role with real ownership: you’ll design distillation pipelines, run large-scale experiments, and ship models used in production.

What You’ll Do

- Design and implement knowledge distillation pipelines (teacher–student, self-distillation, multi-teacher, etc.)
- Distill large foundation models into smaller, faster, and cheaper models for inference
- Run and analyze large-scale training experiments to evaluate quality, latency, and cost tradeoffs
- Collaborate with research to translate new distillation ideas into production-ready code
- Optimize training and inference performance (memory, throughput, latency)
- Contribute to internal tooling, evaluation frameworks, and experiment tracking
- (Optional) Contribute back to open-source models, tooling, or research

What We’re Looking For

- Strong background in machine learning or deep learning
- Hands-on experience with model distillation (LLMs or other neural networks)
- Solid understanding of training dynamics, loss functions, and optimization
- Experience with PyTorch (or JAX) and modern ML tooling
- Comfort running experiments on multi-GPU or distributed setups
- Ability to reason about model quality vs. performance tradeoffs
- Pragmatic mindset: you care about shipping, not just papers

Nice to Have

- Experience distilling LLMs or large sequence models
- Experience with inference optimization (quantization, pruning, kernels, etc.)
- Familiarity with evaluation for language models
- Open-source contributions or research publications
- Experience in early-stage or fast-moving startups

Why Join

- Work on core model quality and cost efficiency, not side projects
- High ownership and direct impact on product and roadmap
- Small, senior team with strong research + engineering culture
- Competitive compensation + meaningful equity
- Remote-friendly, async-first environment