What you will do
Role Overview: As a Data Scientist focused on LLMs at Gorgias, you will be crucial in developing, refining, and enabling our AI Agent, which is designed to automatically address and resolve shopper requests for our merchant clients. Beyond prompt engineering, you will be instrumental in scaling the AI Agent's capabilities by developing the right LLM strategies to enable our services. You will work closely with our ML Engineers, Product Managers, and Software Engineers to create, optimize, and implement solutions that enhance the efficiency, accuracy, and functionality of our AI-driven customer support.
Key Responsibilities:
* Engineering: Think like an engineer to develop systems that scale and allow for fast, reliable, and trusted iterations. Apply scientific and engineering principles to prompt design, LLM evaluation processes, and technical integrations with e-commerce platforms and tools.
* LLM Pipelines: Design, develop, and refine prompts to guide the AI Agent in generating accurate and contextually relevant responses to shopper inquiries. Identify patterns and drive improvements in both prompt design and overall AI Agent performance.
* Data Analysis: Analyze large-scale data from LLM evaluations to derive insights. Identify heuristics for sampling improvement and non-regression datasets. Build reliable estimators for ongoing A/B tests, analyze optimal experiment stopping times, and mitigate contamination across A/B tests.
* Testing & Evaluation: Conduct extensive testing of prompts to ensure they perform well in diverse scenarios. Design and implement robust methodologies for assessing LLM performance at scale. Evaluate prompts rigorously via systematic testing protocols (offline with train/test splits, online with A/B tests).
* Collaboration: Work with product managers to understand requirements, incorporate feedback into prompt design and improvement, and integrate findings from large-scale evaluations into the AI Agent's development. Work with ML engineers to build and fine-tune in-house LLMs that outperform the state of the art.
* Documentation: Maintain comprehensive documentation for prompt design, usage, best practices, and evaluation methodologies to ensure clarity and consistency across the team.
* Continuous Improvement: Stay up to date with advancements in AI, natural language processing (NLP), and prompt engineering techniques. Incorporate cutting-edge methodologies into prompt engineering and LLM evaluation processes.
* User Experience: Analyze and interpret user interactions and large-scale performance data to identify areas for enhancement and ensure the AI Agent provides a seamless and positive experience for shoppers.
* Performance Metrics: Develop and refine metrics for evaluating prompt and LLM performance at scale, ensuring that improvements can be measured quantitatively and validated.
* Technical Enablement: Develop and scale connections between AI Agents and the e-commerce ecosystem, allowing the AI to perform a wide range of actions. Collaborate with Software Engineers to identify integration opportunities and implement scalable solutions that expand the AI Agent's capabilities across various platforms and tools used in e-commerce.
Who you are
Experience: 2 years of experience in statistics, prompt engineering, and LLM pipelines.
Technical Skills: Proficiency in programming languages such as Python and SQL, in large-scale data analysis, and with prompt engineering tools and frameworks.
Analytical Skills: Strong problem-solving abilities with a keen eye for detail and an understanding of how to work with AI to create a system with business impact.
Communication: Excellent written and verbal communication skills, with the ability to articulate complex concepts to both technical and non-technical stakeholders.
Education: Master’s degree in STEM (science, technology, engineering, and mathematics) or a related field.