Senior Big Data Engineer at Coding Mountain | Torre

You'll build scalable data pipelines, powering AI agents and driving innovation in big data.
Full-time

Legal agreement: Employment

Provide your expected compensation while applying

+ Health insurance

YoungInnovations Pvt. Ltd., Lalitpur, Nepal
Posted 11 months ago

Requirements and responsibilities


Role Overview:

We are seeking a Senior Big Data Engineer with a strong background in managing structured and unstructured data pipelines who thrives in a fast-paced, AI-focused environment. You will be instrumental in building and scaling our data lake architecture, supporting a system designed to fuel intelligent AI agents for data collection, labeling, and analytical reasoning. This includes integrating vector databases and optimizing for retrieval-augmented generation (RAG) workflows deployed on AWS Bedrock and other AI stacks.

Responsibilities:

- Design and implement scalable ingestion pipelines for structured and unstructured data using AWS and Databricks Unity Catalog.
- Build and maintain high-throughput ETL/ELT pipelines with Apache Airflow and Databricks.
- Architect and manage data modeling, storage, and indexing strategies in PostgreSQL and RDS, ensuring compatibility with AI retrieval systems.
- Integrate and manage vector databases to support fast semantic and embedding-based search in RAG pipelines.
- Collaborate with AI engineers to ensure seamless compatibility with LangGraph and LangSmith agent systems.
- Implement robust data validation, lineage, and governance systems using Unity Catalog.
- Optimize performance across distributed compute environments (Databricks, EC2).
- Deploy and maintain Lambda-based microservices for scalable, real-time data ingestion and enrichment.

Required Skills:

- 5+ years working with big data systems in production environments.
- Proven expertise with Databricks, Unity Catalog, and Apache Spark.
- Proficiency in Airflow, the AWS stack (Lambda, EC2, RDS), and cloud-based data lake architectures.
- Strong SQL and database design skills (PostgreSQL preferred).
- Working knowledge of vector databases (Chroma, Pinecone, FAISS).
- Solid understanding of data lifecycle management in ML/AI contexts.
- Bonus: familiarity with LangGraph, LangSmith, LangChain, or similar agent orchestration tools.

Preferred Qualifications:

- Experience with AI agent pipelines or large-scale ML model support.
- Emphasis on data observability, security, and lineage tracking.
- Hands-on experience with RAG architecture, including vector storage and semantic retrieval.
- Exposure to AWS Bedrock and model deployment orchestration.
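As a loose illustration of the embedding-based semantic search that the RAG-related bullets describe, here is a minimal, self-contained sketch. All vectors, ids, and function names are hypothetical stand-ins; a real deployment would use one of the vector databases listed above (Chroma, Pinecone, FAISS) with model-generated embeddings rather than hand-made ones.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(store, query_vec, k=2):
    """Return the ids of the k documents whose embeddings are most similar
    to the query embedding -- the retrieval step of a RAG pipeline."""
    ranked = sorted(
        store,
        key=lambda doc: cosine_similarity(doc["embedding"], query_vec),
        reverse=True,
    )
    return [doc["id"] for doc in ranked[:k]]

# Toy in-memory "vector store": hand-made 3-dimensional embeddings.
store = [
    {"id": "doc-etl",    "embedding": [0.9, 0.1, 0.0]},
    {"id": "doc-rag",    "embedding": [0.1, 0.9, 0.2]},
    {"id": "doc-lambda", "embedding": [0.0, 0.2, 0.9]},
]

# Pretend this is the embedding of the user's question.
query = [0.2, 0.8, 0.1]
print(top_k(store, query, k=2))  # → ['doc-rag', 'doc-etl']
```

In production, the per-document similarity scan above is replaced by an approximate-nearest-neighbor index so retrieval stays fast as the corpus grows; the ranking logic, however, is the same.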