Senior Data Engineer - AWS & RAG Pipelines at Jalasoft | Torre

You'll architect cloud data infrastructure, powering AI initiatives and autonomous agents with real-time data.
Full-time

Legal agreement: To be defined

Compensation is to be agreed upon.
Remote (for Colombia residents)
Remote (for Peru residents)
Remote (for Brazil residents)
Remote (for Dominican Republic residents)
Shared by
Emma of Torre.ai
2 days ago

Requirements and responsibilities


Description

We're looking for a Senior Data Engineer to design and operate the cloud data infrastructure powering our AI initiatives. You'll architect production-scale data lakes on AWS, build real-time ingestion and observability pipelines, and own the vector search and embedding layers that feed our RAG systems and autonomous agents.

Requirements

Must-Have

Overall Experience: 7+ years in Data Engineering, Distributed Systems, or Data Architecture
AWS & Infrastructure: 4+ years architecting production-scale data lakes, storage tiers, and event streaming
AI/LLM Pipelines: 2+ years building RAG systems, managing embeddings, and orchestrating foundation models
Proficiency in AWS Data Lake Architecture & Storage
Proficiency in Real-Time Observability & Log Analytics
Proficiency in Elasticsearch & OpenSearch Optimization, Vectorization, Embeddings
Proficiency in Amazon Bedrock & Generative AI Pipelines
Proficiency in Software Engineering & API Ingestion
Production-level proficiency in one or more of: C# (.NET Core), Java, Python, or Node.js

Preferred Experience

AWS S3 partitioning strategies, lifecycle policies, and columnar formats (Parquet, Iceberg)
AWS Glue Data Catalog and Lake Formation for multi-tenant, fine-grained access control
Query optimization over petabyte-scale datasets using Amazon Athena and Redshift Spectrum
Distributed OpenTelemetry (OTel) Collector configuration for log, trace, and metrics capture and routing into S3
High-volume streaming of system logs, Datadog captures, and raw server events into S3
Real-time CDC from PostgreSQL using Debezium or AWS DMS
Amazon OpenSearch clusters with simultaneous lexical and high-dimensional vector search
OpenSearch index lifecycle management, sharding strategies, and dynamic mappings at scale
Amazon Bedrock foundation model APIs (Claude, Titan) for data enrichment, classification, and semantic parsing
Knowledge Bases for Amazon Bedrock for automatic chunking, metadata extraction, and vector index syncs from S3
ETL/ELT pipelines ingesting unstructured event data from SaaS APIs (e.g., Pendo, Hotjar, Google Analytics)
MCP server development to expose data lake context and utilities to AI agents

Benefits

Remote work.
13 floating holidays.
15 vacation days per completed year.
Good working environment.

Every qualified candidate who meets the requirements outlined in the job description will be considered in this hiring process without distinction.

Furthermore, Jalasoft is an equal opportunity employer. We wholeheartedly embrace our responsibility to make employment decisions without regard to race, age, marital or social status, national origin, disability, sex, gender identity or expression, or any other characteristic or group of candidates or employees unrelated to their qualifications and suitability for the position. Our management is committed to upholding this policy with respect.