About the Role:
- We are looking for a Senior Data Engineer to be a key technical contributor on our data platform at a critical moment of growth.
- You will design and build the data systems, pipelines, and models that power clinical operations, product analytics, and business intelligence across the organization.
- This is a highly technical, high-ownership role.
- You will work directly with engineering, product, and clinical teams, and your work will have a direct impact on how we deliver care and make decisions.
- You will bring both the craft and the judgment to do it right.
What You Will Do:
- ELT Pipeline Development: Design, build, and maintain reliable ELT pipelines using Fivetran, dbt, and our internal DataOps platform running on AWS EKS.
- Data Modeling: Build and maintain high-quality analytical data models in Databricks (Unity Catalog), choosing the right technique for the job (Star Schema, OBT, SCDs, and more).
- Data Quality & Governance: Implement data quality tests, maintain data documentation, and contribute to data governance standards across pipelines and models.
- Analytics Enablement: Partner with clinical and business stakeholders to build analytics-ready datasets and QuickSight dashboards that drive operational and clinical decision-making.
- Cross-Functional Collaboration: Translate business and product requirements into solid, well-documented data solutions. Work closely with backend engineers, DevOps/SRE, and product managers.
- Technical Ownership: Take end-to-end ownership of data domains. You are the person others turn to when something is broken or needs to be built right.
Our Stack:
- Data Platform: Databricks (SQL Warehouses, Unity Catalog), dbt Core / dbt Cloud.
- Orchestration: Custom DataOps platform on AWS EKS with KEDA autoscaling; handles replication, CDC, file ingestion, and dbt execution.
- Ingestion: Fivetran.
- Analytics / BI: Amazon QuickSight.
- Data Sources: PostgreSQL (Aurora), MySQL, Oracle, Salesforce, REST APIs, SFTP/S3.
- Cloud: AWS (EKS, S3, IAM, RDS Aurora, CloudFront).
- CI/CD: GitLab CI/CD.
- Languages: Python, SQL.
What We Are Looking For:
Required:
- 7+ years of experience in data engineering, analytics engineering, or data architecture.
- Healthcare domain expertise, specifically with HL7 and FHIR, as you will handle PHI.
- Deep proficiency in SQL and Python.
- Strong dbt experience — models, tests, sources, macros, docs.
- Hands-on experience with Databricks or a comparable cloud data warehouse (Snowflake, BigQuery, Redshift).
- Experience with managed ingestion tools such as Fivetran, Airbyte, or similar.
- Strong understanding of data modeling — Star Schema, OBT, SCDs, normalization/denormalization.
- Experience building and maintaining data quality tests and pipeline monitoring.
- Solid AWS experience — S3, IAM, RDS, and familiarity with containerized workloads.
- Familiarity with the U.S. healthcare system; prior experience in a pharma, payer, or health tech context is required.
- Working knowledge of medical claims processing and clinical data workflows.
- Strong Git proficiency and collaborative development habits (code review, branching, CI/CD).
- Clear written and verbal communication — you can explain data decisions to non-technical stakeholders.
Preferred:
- Experience with GitLab CI/CD or similar (GitHub Actions, CircleCI).
- Hands-on experience with Amazon QuickSight or other BI/reporting tools.
- Experience with Change Data Capture (CDC) — timestamp-based, ID-based, or log-based.
- Familiarity with Spark on Databricks.
- Experience in both OLTP and OLAP environments.
- Experience with PHI/PII de-identification and HIPAA-compliant data handling.
- Startup experience — comfortable moving fast, wearing multiple hats, and adapting quickly.
- Experience with custom data orchestration platforms or frameworks.
What You Bring to the Table:
- You are a senior engineer who takes full ownership of your work — from the first pull request to production reliability.
- You care about clean data and clean code equally.
- You are at your best when given a hard problem, a collaborative team, and the room to do it right.
- You bring a point of view on data architecture decisions and are not shy about advocating for it.