Mask Box – Compliant Multilingual PII Masking System
NA
Jan 2026 - Apr 2026 (4 months)
• Built a zero-cost PII masking system that saves $60K/year in vendor costs and supports 20+ languages, using a 6-layer detection strategy that combines XLM-RoBERTa Large (NER), Llama 3.1 8B (via Ollama), regex patterns, and RAG (ChromaDB + Mem0) for contextual entity resolution
• Engineered an async chunking pipeline on FastAPI that masks 1,000 records in 3 minutes and serves 10+ concurrent users without latency degradation, plus real-time masking QA management
• Designed user-configurable exception handling that lets teams whitelist specific entities, making the system adaptable to different compliance requirements without code changes