Senior AI QA Analyst at Blend360 | Torre

Senior AI QA Analyst

You will define the future of AI quality, ensuring robust and ethical agentic systems.
Full-time

Legal agreement: Employment

Provide your expected compensation while applying
Montevideo, Montevideo Department, Uruguay
Posted 5 months ago

Requirements and responsibilities


Company Description

Are you looking for an opportunity to jump-start your career in a thriving tech industry? Join us at Blend to grow our business and execute our mission to help organizations leverage data and technology to make better decisions. We have over 10 years of experience in big data, and top US customers such as Mastercard, Tripadvisor, and Roku choose us to develop innovative products.

We are seeking a Senior AI QA Analyst to contribute to our next level of growth and expansion.

What is this position about?

We are expanding our Quality function to meet the realities of modern, agentic AI systems. This role combines traditional QA practices with advanced evaluation of AI models, LLMs, RAG pipelines, and agent-based workflows in production environments. The Senior AI QA Analyst will define and enforce the quality bar across AI-driven features by analyzing model behavior, data quality, grounding, fairness, robustness, and live performance.

This position is open to candidates located in LATAM (Uruguay, Chile, Colombia, or Argentina) and requires strong QA expertise, foundational knowledge of data science workflows, and advanced English.

Job Description

- Implement AI quality strategies for LLM, RAG, and agentic workflows, from test design through post-launch monitoring.
- Design and execute test plans combining traditional QA and AI-focused evaluation for complex, non-deterministic systems.
- Validate data quality through schema checks, drift/shift detection, and ground-truth audits; drive remediation with Data Science and Engineering.
- Build adversarial and synthetic test sets to stress-test reasoning chains, tool usage, hallucinations, jailbreaks, and prompt-injection risks.
- Develop and maintain evaluation pipelines to measure grounding, robustness, calibration, cost, latency, task completion, and system reliability.
- Instrument production systems for continuous monitoring, including risk taxonomies, alerting criteria, escalation pathways, and quality guardrails.
- Establish and refine guardrails for AI features (content moderation, input/output validation, tool-call limits, loop-prevention checks, and more).
- Create and maintain comprehensive test cases, user scenarios, and documentation for both AI behaviors and traditional software components.
- Collaborate with Data Scientists and AI Engineers to interpret model performance, evaluation metrics, and overall system behavior.
- Report quality health through clear dashboards, documentation, and incident reviews while influencing roadmaps through evidence-based recommendations.
- Contribute to continuous improvement by codifying templates, pipelines, and evaluation frameworks to scale QA practices across teams.

Qualifications

- 4–5+ years of experience in QA, including manual and automated testing in software environments.
- At least 1 year of experience working with or supporting Data Science, ML, AI, or analytics workflows.
- Basic Python knowledge, with familiarity with common data science libraries (NumPy, Pandas) and Pytest for unit testing.
- Practical understanding of AI/LLM evaluation, including prompt/test design, evaluation rubrics, and interpretation of model behavior.
- Experience assessing data quality, detecting drift or inconsistencies, and performing ground-truth verification.
- Ability to design synthetic/adversarial test sets for AI-driven applications.
- Basic understanding of cloud environments and modern data/AI pipelines.
- Familiarity with QA automation concepts and integration/regression/performance testing.
- Strong ability to translate ambiguous model outputs into testable hypotheses, measurable metrics, and clear action steps.
- Experience working with Jira and Confluence for test planning, documentation, and workflow management.
- Basic SQL and Excel skills to support data validation and reporting.
- Excellent communication, analytical thinking, and cross-functional collaboration skills.

What about languages?

You will need excellent written and verbal English for clear and effective communication with the team.

How much experience must I have?

We’re looking for someone with 4+ years of experience in similar roles.

Additional Information

Our perks and benefits:

🍔 Daily lunches (headquarters): vegetarian, vegan, gluten-free, and sugar-free options. Gourmet meals every Friday with our on-site chef!
⚖️ Flexible working options to help you strike the right balance.
👨🏽‍💻 All the equipment you need to harness your talent (MacBook and accessories).
☕ Snacks and beverages available every day (headquarters).
🎮 After-office events: football, tennis, and game nights (headquarters).
- Everyone is welcome to join our football league every Wednesday and Friday.
- Challenge your teammates to a pool game and win the office trophy!
- Tennis courts are available for friendly matches.
- Not a sports person? Don’t worry, we also have chess championships and game and music nights for you to join!
📚 Learning opportunities:
- AWS Certifications (we are AWS Partners).
- Study plans, courses, and other certifications.
- English lessons.
- Learn from your teammates on our Tech Tuesdays!
👩‍🏫 Mentoring and development opportunities to shape your career path.
🎁 Anniversary and birthday gifts.
🏡 Great location and even greater teammates!

So what are the next steps?

Our team is eager to learn about you! Send us your resume or LinkedIn profile below and we’ll explore working together!