🔹 Key Responsibilities:
Design and implement end-to-end evaluation frameworks for multi-agent AI systems
Lead experimentation and A/B testing initiatives
Curate and maintain high-quality ground truth datasets
Identify reliability, accuracy, and performance gaps across agent workflows
Analyze failure modes and edge cases in production environments
Research and apply advancements in Agentic AI, LLM evaluation, and multi-agent systems

🔹 Technical Skills Required:
Python
Machine Learning Frameworks
LLM APIs
LangChain
LlamaIndex
Semantic Kernel
Ragas, DeepEval, LangSmith or similar evaluation tools
MLflow / Weights & Biases (W&B)
CI/CD for ML Workflows
NLP, Prompt Engineering, Statistics
Experimental Design & A/B Testing

🔹 Azure Expertise:
Azure ML
Azure OpenAI Service
Azure AI Foundry
Azure Monitor
Application Insights

🔹 Ideal Candidate:
Strong background in Data Science and Generative AI
Experience evaluating LLMs and Agentic AI systems
Excellent analytical and problem-solving skills
Ability to work independently in a remote environment

📩 Interested candidates can share their updated resume to megha@strektech.com

Data Scientist (Agentic AI / LLM Evaluation)

Emma

About Strektech Consulting Services:

strektech.com/
