AI Engineer at BIO | Torre
AI Engineer

You'll build AI agent systems that accelerate real-world therapeutics and empower scientific discovery.
Full-time

Legal agreement: Employment

Provide your expected compensation while applying
Remote (anywhere)
Shared by
Emma of Torre.ai
2 months ago

Requirements and responsibilities


About Bio

Bio is a decentralized science protocol that helps launch and grow AI-driven biotech research. It enables scientists to raise funds, create value from their work, and distribute that value directly to their communities. Since 2023, Bio has directed over $50m to global researchers, offering an alternative to traditional pharma funding. Backed by investors like Binance Labs, Northpond Ventures, and Animoca Brands, Bio accelerates real-world therapeutics across longevity, brain health, fertility, psychedelic science, and more.

Why this role exists

As a Member of Technical Staff on the AI Agents team, you’ll design, build, and scale the core agent systems that power Bio Protocol’s products. You’ll work closely with full-stack engineers and scientist-evaluators to create agents that can plan, use tools, and reason safely. This role offers the opportunity to shape the foundation of how AI collaborates with human scientists, combining technical depth with real-world scientific impact. While technical skills matter, we believe drive and cultural fit matter most.

What you’ll do

Build agent capabilities for planning, tool use, memory, and context management, and ship them into production.
Integrate agents with internal and external tools and data sources (retrieval systems, structured datasets, lab/biomed APIs, spreadsheets, search), with robust schemas and safeguards.
Develop quality and evaluation systems, including unit, regression, and scenario/benchmark tests, telemetry, and automated scoring.
Collaborate with scientists to analyze failure modes and improve performance.
Partner with the knowledge and ontology team to ensure outputs are source-traceable and compliant with provenance standards.
Implement safety measures, guardrails, and sandboxed execution for risky operations.
Optimize performance and reliability through profiling, idempotency, retries, rate limiting, and uptime management.
Instrument data pipelines for supervised fine-tuning and reinforcement learning when needed.
Contribute to the agent platform, including services, APIs, orchestration, CI/CD, and observability.

Example projects (first 90 days)

Deliver a multi-tool agent capable of executing long-horizon scientific tasks with memory and self-correction, supported by regression tests and telemetry.
Implement automated citation enforcement, including source checking, freshness validation, and provenance display in the UI.
Build an evaluation dashboard tracking competency pass rates, latency, and failure modes.

Success metrics include improved pass rates and reduced critical error rates across core scientific competencies; performance against SLOs for latency, task success, tool-call reliability, and uptime; increased coverage of regression and evaluation scenarios; and broader adoption of the agent platform by internal teams.

Qualifications

Experience building production software in Python and/or TypeScript, with strong systems and API design skills (FastAPI, gRPC, GraphQL, or similar).
Proven experience shipping LLM applications or agentic systems (tool use/function calling, retrieval/RAG, structured outputs, evaluation, or observability).
Familiarity with agent/orchestration frameworks (e.g., LangChain, LangGraph, AutoGen, CrewAI, MCP) and vector databases (FAISS, Weaviate, Pinecone).
Experience with cloud infrastructure and containers (AWS, GCP, or Azure), Docker/Kubernetes/Terraform, CI/CD, and production telemetry.
Ability to translate research prototypes into robust, scalable systems.

Nice to have

Experience with fine-tuning and reinforcement learning (RL, RLAIF, RLHF), including reward design and offline evaluation.
Familiarity with benchmarks and evaluations such as SWE-Bench, OS-World, or tau-bench.
Knowledge of retrieval and knowledge systems, including schema and ontology design, entity modeling, and provenance tracking.
Background in agentic system safety and security (sandboxing, isolation, permissions, auditability).
Exposure to life sciences or scientific computing and collaboration with domain experts.

How we work

Evidence-first: every output is grounded and source-verifiable.
Tight feedback loops: weekly quality reviews with scientists to ship, measure, and improve.
Platform mindset: we create safe, reusable systems that empower others to build new agent capabilities.

Tools you’ll use

Python, TypeScript, FastAPI/gRPC, Postgres, Redis/queues, Docker, Kubernetes, Terraform, cloud LLM APIs, open-weight models, vector databases, telemetry and observability tools, and internal agent/evaluation systems.