
Maha Alton

About

Detail

New City, New York, United States

Timeline


Job

Résumé


Jobs (0% verified)
  • Cresta
    Senior AI/ML Engineer
    Cresta
    Nov 2024 - Current (1 year 8 months)
    • Designed and deployed scalable ML pipelines on AWS to process 500M+ records per month, supporting both structured and unstructured data, reducing training runtime from days to hours.
    • Built and optimized PySpark + Apache Spark workflows for large-scale NLP model training, enabling real-time analysis of 1M+ customer interactions daily.
    • Developed LLM-based solutions (RAG pipelines, prompt engineering) to automatically summarize and classify 200k+ documents monthly, increasing knowledge retrieval efficiency.
    • Orchestrated end-to-end ML workflows using Apache Airflow, implementing automated scheduling and alerting that prevented over 150 pipeline failures per month.
    • Deployed ML models with FastAPI, Docker, Kubernetes, enabling sub-secon
  • Infuse
    AI/ML Engineer
    Infuse
    May 2021 - Oct 2024 (3 years 6 months)
    • Developed and implemented machine learning models for classification, regression, and clustering using Python, Scikit-learn, and XGBoost, handling datasets of up to 50M records.
    • Built and maintained data pipelines using PySpark, Apache Spark, and SQL to process terabytes of structured and unstructured data.
    • Worked on NLP use cases using Hugging Face, BERT, and Transformer models for text analytics and sentiment analysis, processing over 100k customer reviews per month.
    • Orchestrated ML workflows using Apache Airflow for automated training, scheduling, and reporting, preventing 100+ workflow failures per month.
    • Deployed models using FastAPI and Docker, supporting batch and near real-time inference for 10k+ requests per day.
  • Klaviyo
    Junior AI Engineer
    Klaviyo
    Mar 2018 - Apr 2021 (3 years 2 months)
    • Assisted in building machine learning models for predictive analytics using Python, Scikit-learn, and Pandas.
    • Processed and cleaned large datasets using SQL, PySpark, and Python to support model development.
    • Implemented basic NLP tasks including text preprocessing, sentiment analysis, and topic modeling using Hugging Face.
    • Supported model evaluation, feature engineering, and hyperparameter tuning to improve performance.
    • Assisted in workflow automation using Apache Airflow and supported model deployment with Docker.
Education (0% verified)
  • Bachelor's in Computer Science
Projects (professional or personal) (0% verified)
  • Scalable LLM-powered conversational analytics platform
    Built a scalable LLM-powered conversational analytics platform using RAG architecture (LangChain, LlamaIndex), integrating vector databases (Pinecone, FAISS) for semantic search across 10M+ enterprise documents. Implemented document ingestion pipelines using PySpark and Apache Airflow, enabling real-time knowledge retrieval and context-aware responses via GPT-based models. Optimized embeddings and retrieval strategies to reduce response latency by 40% and improve answer accuracy. Deployed the solution using FastAPI, Docker, and Kubernetes on AWS (EKS, S3), with monitoring via MLflow and Prometheus for performance tracking and observability.
  • Real-time ML pipelines for fraud detection
    Built real-time ML pipelines for fraud detection on high-volume transaction streams using Apache Kafka, Spark Structured Streaming, and PySpark. Developed feature extraction and scoring pipelines to detect anomalies in near real-time. Deployed classification models using TensorFlow, PyTorch, and FastAPI for low-latency inference, and implemented monitoring and alerting with MLflow and Great Expectations to track model performance and data drift.
  • Scalable ML pipeline for predictive maintenance
    Implemented a scalable ML pipeline for predictive maintenance using IoT sensor data stored in an AWS S3 data lake. Preprocessed high-volume streaming data with PySpark and Spark Structured Streaming, built LSTM and Random Forest models for early anomaly detection, and deployed the models to production using Docker, Kubernetes, and AWS SageMaker.
  • End-to-end ML pipelines for customer churn prediction
    Designed and developed end-to-end ML pipelines for customer churn prediction using Python, Pandas, Scikit-learn, XGBoost, and LightGBM. Processed large-scale datasets with PySpark and SQL; performed feature engineering, model evaluation, and hyperparameter tuning; deployed models using FastAPI, Docker, and AWS SageMaker; and created interactive dashboards with Tableau and Power BI for churn risk monitoring.