Role Summary:
A technical resource on the TME product team, partnering with a Project Manager. Builds and maintains data extraction and processing pipelines that transform EU tariff measures data (customs duties, tariffs, VAT, excise taxes) from government sources into structured, client-ready formats. Works with AI-assisted development tools and the BMAD methodology. This is a senior-level position requiring independent architecture and delivery decisions.
What You Will Do:
- Develop AI-native applications that extract and process trade compliance data from central EU and member state government sources — both structured (XML, JSON, CSV, HTML tables, APIs) and unstructured (PDF documents, legal text, prose regulations, scanned publications).
- Design and build AI-native data pipelines that monitor sources, detect changes, and extract, normalise, validate, and release tariff measures data to clients.
- Integrate AI/LLM capabilities into extraction workflows — using large language models for document understanding, entity extraction, classification, and data structuring.
- Design and maintain data models for customs duties, VAT/excise information, and tariff measure metadata.
- Ensure data quality through automated testing, benchmarking against official sources, comparison of extracted data against the underlying legal sources, and data validation pipelines.
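The data-quality duties above can be sketched as a simple validation pass over extracted records. This is a minimal illustration only: the record fields, the 10-digit code format, and the validation rules are assumptions for the sketch, not the team's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Illustrative record shape for a tariff measure; the real data model is an assumption.
@dataclass
class TariffMeasure:
    commodity_code: str        # assumed 10-digit TARIC-style goods code
    duty_rate: float           # ad valorem rate as a percentage
    valid_from: date
    valid_to: Optional[date]   # None = open-ended measure

def validate(measure: TariffMeasure) -> list[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    if not (measure.commodity_code.isdigit() and len(measure.commodity_code) == 10):
        errors.append("commodity_code must be 10 digits")
    if not 0 <= measure.duty_rate <= 100:
        errors.append("duty_rate out of range 0-100")
    if measure.valid_to is not None and measure.valid_to < measure.valid_from:
        errors.append("valid_to precedes valid_from")
    return errors
```

Checks like these would typically run inside the automated test suite on every pipeline release, alongside benchmarks against the official published figures.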
Required Technical Skills:
- 3+ years of professional software development experience.
- Proficiency in at least two programming languages — e.g., Python, JavaScript/TypeScript, Java, Go, C#, Rust, Kotlin, or similar. We value strong fundamentals over specific language experience.
- Methodology / Framework: BMAD Method.
- Data extraction and processing — web scraping, document parsing, API integration, ETL/ELT pipelines. Must be comfortable with both:
- Structured data: XML, JSON, CSV, HTML tables, databases, REST/SOAP APIs.
- Unstructured data: PDFs, legal text, prose regulations, HTML without clear structure, scanned documents.
- Database design and querying — relational (PostgreSQL or similar) and/or document-based databases. Schema design, migrations, indexing.
- API development — building and consuming RESTful APIs.
- Version control — Git workflow, branching strategy, code review.
- Automated testing — unit, integration, and data validation tests as part of development workflow.
- Self-sufficiency — able to analyse, design, and build complete solutions independently.
- Rapid development — breaks work into phases to deliver results sooner.
- Influencer — demonstrates approaches to other team members to raise the group's maturity.
- Communication — works with technical, business, and project team members; participates in and leads discussions.
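As a flavour of the structured-data side of the extraction work listed above, here is a minimal sketch of parsing an XML feed of tariff measures into plain records. The XML fragment and its element names are invented for illustration; real EU publications use their own schemas.

```python
import xml.etree.ElementTree as ET

# Made-up XML in the spirit of a tariff measures feed; element names are assumptions.
SAMPLE = """
<measures>
  <measure>
    <goodsCode>0101210000</goodsCode>
    <dutyRate>7.5</dutyRate>
  </measure>
  <measure>
    <goodsCode>2204101100</goodsCode>
    <dutyRate>32.0</dutyRate>
  </measure>
</measures>
"""

def extract_measures(xml_text: str) -> list[dict]:
    """Parse measure elements into dicts ready for downstream normalisation."""
    root = ET.fromstring(xml_text)
    rows = []
    for m in root.findall("measure"):
        rows.append({
            "goods_code": m.findtext("goodsCode"),
            "duty_rate": float(m.findtext("dutyRate")),
        })
    return rows
```

The unstructured sources (PDFs, prose regulations) require heavier tooling — OCR, layout analysis, or LLM-based extraction — but ultimately feed the same normalised record shape.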
Preferred Technical Skills:
- LLM/AI API integration (OpenAI, Anthropic Claude, Google Gemini) — for data processing, document understanding, or content extraction.
- AI-assisted development tools (Claude Code, Cursor, GitHub Copilot, Windsurf, or similar).
- NLP / document processing — OCR, text extraction, entity recognition, text classification.
- Graph databases (Neo4j) or vector databases (pgvector).
- Observability and monitoring (SigNoz, Grafana, Langfuse, or similar).
- Containerisation and deployment (Docker, CI/CD pipelines).