About Us

Phare Health is now part of R1 and its AI innovation engine, R37 Lab, bringing Phare’s frontier clinical reasoning technology together with one of the largest healthcare platforms in the U.S.

At R37 and Phare, we are building the first AI-native Healthcare Revenue Operating System: a connected platform that reasons over full medical records, payer logic, and financial workflows to automate medical coding, billing, and follow-up.

Backed by real customers, real data, and real distribution, we operate on a national scale. Our agentic AI systems already power production workflows across 95 of the top 100 U.S. health systems, processing hundreds of millions of patient encounters each year, including:

- 180M+ Claims
- 550M+ Patient encounters
- 1.2B+ Workflow actions and outcomes each year

This is startup-level ownership with enterprise-level impact. If you want to build AI that ships, scales, and measurably improves how healthcare works, this is the place to do it.

The Role

We are looking for an Applied AI Engineer/Scientist to build, evaluate, and continuously improve clinical AI agents and supervised ML models.

You will work at the intersection of software engineering, LLM systems, evaluation, model improvement, and deep healthcare workflow understanding.
Your job is to turn frontier model capability into reliable production behavior: agents that read complex medical records, use the right clinical and coding context, call the right tools, produce auditable outputs, and improve from real-world failures.

You will be embedded in hard healthcare problems (clinical documentation integrity, medical coding, denial prevention, appeals, revenue cycle workflows, and payer logic) and will own the loop from problem framing to agent design, evaluation, deployment, trace analysis, and ongoing improvement.

The ideal candidate is a strong engineer who thinks like an applied scientist: rigorous about measurement, comfortable with ambiguity, excited by messy real-world data, and motivated by closing the gap between impressive demos and dependable production systems.

What You'll Do

- Design, build, and iterate on agentic AI systems for complex healthcare workflows, including documentation, coding, denial management, appeals, and revenue cycle automation.
- Develop long-horizon agent behavior across context construction, retrieval, tool use, memory, routing, verification, escalation, and human-in-the-loop review.
- Define what “good” looks like for clinical agents end-to-end, translating expert workflows into specifications, rubrics, gold standards, test cases, and clinically meaningful success criteria.
- Build rigorous evaluation and feedback loops using expert review, production logs, model outputs, and benchmarks to measure performance, regressions, edge cases, safety, reliability, provenance quality, and business impact.
- Prototype new AI capabilities from 0 → 1, then harden them into reliable, explainable, auditable production systems with clear contracts, monitoring, evidence, rationale, and performance gates.
- Partner with research and ML engineering teams on model selection, fine-tuning, reward modeling, distillation, synthetic data, post-training, and internal AI infrastructure, including instrumentation, experiment tracking, benchmarking, prompt/version management, and reproducible evaluation.

What Makes This Role Different

Most AI roles are either too research-heavy or too product-light. This role sits in the middle.

You will not only write prompts or run experiments. You will own whether an agent actually works in production. That means understanding the workflow, designing the system, building the evals, inspecting failures, improving the agent, and proving that the improvement matters.

The right person will be excited by questions like:

- What context does this agent need to make the right decision?
- How do we know the output is clinically and operationally correct?
- Which failures are prompt problems, retrieval problems, model problems, tool problems, or product-spec problems?
- How do we turn expert feedback into a better benchmark or training set?
- When should we use prompting, RAG, rules, fine-tuning, reward modeling, or a different architecture?
- How do we make agent outputs auditable enough for clinical and operational review?
- How do we build a data flywheel that improves the system every week?
You May Be a Good Fit If You

- Bring 4+ years of software engineering, ML engineering, research engineering, or applied AI experience.
- Are highly proficient in Python and comfortable building production systems with APIs, structured data, async workflows, testing, logging, and observability.
- Have experience turning messy real-world workflows into structured AI problems, including classification, ranking, extraction, decisioning, LLM applications, agents, RAG, tool calling, structured outputs, prompting, or evaluation.
- Have built or operated evaluation systems, benchmarks, annotation workflows, experiment tracking, or regression tests for AI systems.
- Thrive in ambiguous, high-stakes domains: working with experts, debugging real-world failures, and turning model potential into reliable, correct, safe systems that work for users.

Role Leveling

We are looking for candidates at various levels, ranging from Level 2 to Staff:

- L2: Independently delivers a complete end-to-end project, owning design, implementation, and delivery of scoped work.
- L3: Leads delivery of larger projects, handling increased technical complexity and ambiguity, providing light guidance to L2s on shared work.

Benefits

- Top-of-market compensation (salary + equity)
- Flexible PTO
- Comprehensive health benefits
- 401(k) matching
- Inspiring, brilliant, mission-driven teammates

Hiring Flow

- Intro call: your background and our mission alignment
- Technical deep-dives: pseudo-coding exercise and systems design (not LeetCode)
- Final in-person interview at one of our hubs (SF, NYC, Austin, or Chicago; travel arranged)
- References
- Offer

Interview Logistics Notice

As part of our hiring process, selected candidates will participate in an in-person interview. Candidates located near one of our talent hubs (San Francisco, New York, Austin, or Chicago) will be scheduled to meet with team members in those locations. For candidates residing outside these areas, we will arrange travel to a hub for the interview. Travel accommodation will be provided as needed.

We are committed to providing equal employment opportunities and ensuring a fair and inclusive experience for all applicants.

For this US-based position, the base pay range is $140,000-$300,000 per year. Individual pay is determined by role, level, location, job-related skills, experience, and relevant education or training.

This job is eligible to participate in our annual bonus plan.

Applied AI Scientist, Clinical AI Agents

About Phare Health and R1 RCM:

www.phare.health
