Data Scientist (Agentic AI / LLM Evaluation) at Strektech Consulting Services | Torre

Data Scientist (Agentic AI / LLM Evaluation)

You'll define the future of AI by evaluating and enhancing multi-agent systems.
Full-time

Legal agreement: To be defined

Provide your expected compensation while applying
Remote (anywhere)
Shared by
Emma of Torre.ai
7 days ago

Requirements and responsibilities


🔹 Key Responsibilities:
Design and implement end-to-end evaluation frameworks for multi-agent AI systems
Lead experimentation and A/B testing initiatives
Curate and maintain high-quality ground truth datasets
Identify reliability, accuracy, and performance gaps across agent workflows
Analyze failure modes and edge cases in production environments
Research and apply advancements in Agentic AI, LLM evaluation, and multi-agent systems

🔹 Technical Skills Required:
Python
Machine Learning Frameworks
LLM APIs
LangChain
LlamaIndex
Semantic Kernel
Ragas, DeepEval, LangSmith or similar evaluation tools
MLflow / Weights & Biases (W&B)
CI/CD for ML Workflows
NLP, Prompt Engineering, Statistics
Experimental Design & A/B Testing

🔹 Azure Expertise:
Azure ML
Azure OpenAI Service
Azure AI Foundry
Azure Monitor
Application Insights

🔹 Ideal Candidate:
Strong background in Data Science and Generative AI
Experience evaluating LLMs and Agentic AI systems
Excellent analytical and problem-solving skills
Ability to work independently in a remote environment

📩 Interested candidates can share their updated resume to megha@strektech.com