S Vikas

About

Detail

New York, United States

Timeline


Job

Résumé


Jobs (0% verified)
  • Artera
    AI Lead Engineer
    Artera
    Jan 2023 - Jan 2025 (2 years 1 month)
    • Led the end-to-end development of a multi-agent conversational AI pipeline for Artera, a healthcare platform that streamlines patient scheduling via automated calling tools.
    • Architected a low-latency orchestration layer to manage specialized AI agents (Patient Search, Scheduling, Cancellations) by identifying and routing user intent in real time.
    • Reduced system latency by over 60% (from 1800 ms to under 700 ms) by implementing streaming, chunking, and asynchronous calling strategies.
    • Engineered an in-house speech-to-text (STT) solution integrated with Twilio and WebSockets to replace high-latency third-party tools, ensuring seamless real-time communication.
    • Developed an "LLM-as-a-Judge" evaluation framework to automate quality as
  • Meta
    Senior Software Engineer (AI/ML)
    Meta
    Jan 2021 - Jan 2023 (2 years 1 month)
    • Designed and scaled LLM inference services supporting multi-million daily request volume, enabling AI features across Meta's core consumer products with high availability and low latency.
    • Developed model-agnostic AI orchestration infrastructure, enabling rapid experimentation and safe rollout of GenAI features, unlocking faster iteration cycles and consistent production standards.
    • Introduced cost-aware inference routing and fallback strategies, optimizing latency-quality trade-offs and reducing overall compute spend by double-digit percentages.
    • Collaborated with product and ML teams to ship AI-powered experiences globally, balancing experimentation velocity with reliability, safety, and platform constraints.
    • Optimized inference pi
  • TestGorilla
    Senior Software Developer
    TestGorilla
    Jan 2019 - Jan 2021 (2 years 1 month)
    • Designed and scaled machine learning inference services supporting multi-million daily request volumes, enabling LSTM, CNN, and regression-based models with high availability and low latency.
    • Developed model-agnostic infrastructure to support training, testing, and deployment of deep learning and statistical models, enabling rapid experimentation and safe production rollouts.
    • Implemented cost-aware inference routing and fallback strategies for multiple model types, optimizing latency-accuracy trade-offs and reducing overall compute costs by double-digit percentages.
    • Collaborated with product and ML teams to deploy time-series (LSTM), computer vision (CNN), and predictive regression models globally, balancing experimentation speed wi
  • CARMA
    Full Stack Developer
    CARMA
    Jan 2017 - Jan 2019 (2 years 1 month)
    • Developed end-to-end web applications, including frontend UI and backend services.
    • Built responsive user interfaces and robust server-side logic.
    • Integrated third-party APIs and managed databases.
    • Collaborated with designers and stakeholders to translate requirements into working features.
    • Ensured application security, performance, and maintainability.