Senior Software Platform Engineer at TetraScience | Torre
Senior Software Platform Engineer

You'll architect and scale AI infrastructure to accelerate scientific discovery and development.
Full-time

Legal agreement: Employment

Provide your expected compensation while applying
Remote (for United States residents)
Shared by
Emma of Torre.ai
about 1 month ago

Requirements and responsibilities


About TetraScience

TetraScience is the Scientific Data and AI Company building Tetra OS, the operating system for scientific intelligence. We help the world’s leading life sciences firms turn fragmented scientific data into AI-native assets and scientific workflows that accelerate discovery, development, and manufacturing. TetraScience’s growing ecosystem of strategic partners includes NVIDIA, Databricks, Thermo Fisher Scientific, Snowflake, Google, and Microsoft.

In connection with your candidacy, you will be asked to carefully review “The Tetra Way,” authored by our CEO, Patrick Grady; it is impossible to overstate the importance of this document, and you should take it literally as you decide whether our mission, culture, and expectations are right for you.

What You Will Do

We’re looking for a Senior AI Platform Engineer to help design, build, and scale our AI and data infrastructure. In this role, you’ll focus on architecting and maintaining cloud-based MLOps pipelines to enable scalable, reliable, and production-grade AI/ML workflows, working closely with AI engineers, data engineers, and platform teams. Your expertise in building and operating modern cloud-native infrastructure will help enable world-class AI capabilities across the organization.

If you are passionate about building robust AI infrastructure, enabling rapid experimentation, and supporting production-scale AI workloads, we’d love to talk to you.

- Design, implement, and maintain a cloud-native platform to support AI and data workloads, with a focus on AI and data platforms such as Databricks and AWS Bedrock.
- Build and manage scalable data pipelines to ingest, transform, and serve data for ML and analytics.
- Develop infrastructure-as-code using tools like CloudFormation and AWS CDK to ensure repeatable and secure deployments.
- Collaborate with AI engineers, data engineers, and platform teams to improve the performance, reliability, and cost-efficiency of AI models in production.
- Drive best practices for observability, including monitoring, alerting, and logging for AI platforms.
- Contribute to the design and evolution of our AI platform to support new ML frameworks, workflows, and data types.
- Stay current with new tools and technologies to recommend improvements to architecture and operations.
- Integrate AI models and large language models (LLMs) into production systems to enable use cases using architectures like retrieval-augmented generation (RAG).

Requirements

- 7+ years of professional experience in software engineering and infrastructure engineering.
- Extensive experience building and maintaining AI/ML infrastructure in production, including model deployment and lifecycle management.
- Strong knowledge of AWS and infrastructure-as-code frameworks, ideally with CDK.
- Expert-level coding skills in TypeScript and Python for building robust APIs and backend services.
- Production-level experience with Databricks MLflow, including model registration, versioning, asset bundles, and model serving workflows.
- Expert-level understanding of containerization (Docker) and hands-on experience with CI/CD pipelines; experience with orchestration tools (e.g., ECS) is a plus.
- Proven ability to design reliable, secure, and scalable infrastructure for both real-time and batch ML workloads.
- Ability to articulate ideas clearly, present findings persuasively, and build rapport with clients and team members.
- Strong collaboration skills and the ability to partner effectively with cross-functional teams.

Nice to Have

- Familiarity with emerging LLM frameworks such as DSPy for advanced prompt orchestration and programmatic LLM pipelines.
- Understanding of LLM cost monitoring, latency optimization, and usage analytics in production environments.
- Knowledge of vector databases and embedding stores (e.g., OpenSearch) to support semantic search and RAG.

Benefits

- 100% employer-paid benefits for all eligible employees and immediate family members
- Unlimited paid time off (PTO)
- 401(k)
- Flexible working arrangements (remote work)
- Company-paid life insurance, LTD/STD
- A culture of continuous improvement where you can grow your career and get coaching

We are not currently providing visa sponsorship for this position.