Job Title: Senior Consultant | Java, Spring Boot, Rest API, Kafka, Microservices, Spring Data JPA, Oracle | Ben
ETL/ELT Development
- Develop, maintain, and optimize ETL/ELT pipelines using DBT and Python (both must-have); see the ingestion sketch after this list.
- Source data via batch files, APIs, legacy upstream systems, and enterprise platforms.
- Implement robust ingestion processes with schema evolution handling, error management, and logging.
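A minimal Python sketch of the kind of ingestion step described above, assuming a hypothetical CSV batch feed and column contract; the file layout, column names, and drift-handling policy are illustrative assumptions, not part of this role's actual stack.

```python
import csv
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ingestion")

# Hypothetical column contract for an "orders" batch feed.
EXPECTED_COLUMNS = {"order_id", "customer_id", "order_date", "amount"}


def ingest_batch_file(path: str) -> list[dict]:
    """Read a batch file, tolerate additive schema drift, and log/skip malformed rows."""
    rows: list[dict] = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        found = set(reader.fieldnames or [])
        missing = EXPECTED_COLUMNS - found
        if missing:
            # Hard failure: required columns are gone, so the data contract is broken.
            raise ValueError(f"Source schema missing required columns: {missing}")
        extra = found - EXPECTED_COLUMNS
        if extra:
            # Additive schema evolution: record the new columns but keep loading.
            logger.info("New source columns detected, passing through: %s", sorted(extra))
        for line_no, row in enumerate(reader, start=2):
            try:
                row["amount"] = float(row["amount"])
                rows.append(row)
            except ValueError as exc:
                logger.warning("Skipping malformed row %d: %s", line_no, exc)
    logger.info("Loaded %d rows from %s", len(rows), path)
    return rows
```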
Cloud Ingestion & Integration (AWS & GCP)
- Build ingestion pipelines on AWS using Lambda, S3, Terraform, CloudWatch, and Step Functions.
- Build ingestion pipelines on GCP using Cloud Functions, Cloud Storage, Pub/Sub, and Cloud Composer.
- Engineer secure, scalable, and resilient integrations between cloud systems and on-premises data sources; a minimal Lambda-based example follows this list.
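To ground the AWS bullet above, here is a hedged sketch of an S3-triggered Lambda handler that lands incoming objects in a raw zone; the bucket name and prefix are assumptions, and in practice they would be provisioned via Terraform and monitored through CloudWatch.

```python
import json
import logging
import urllib.parse

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client("s3")
RAW_BUCKET = "example-raw-landing-zone"  # hypothetical bucket, provisioned via Terraform


def lambda_handler(event, context):
    """Triggered by S3 put events; copies each new object into the raw landing prefix."""
    records = event.get("Records", [])
    for record in records:
        src_bucket = record["s3"]["bucket"]["name"]
        src_key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        dest_key = f"raw/{src_key}"
        logger.info("Copying s3://%s/%s to s3://%s/%s", src_bucket, src_key, RAW_BUCKET, dest_key)
        s3.copy_object(
            Bucket=RAW_BUCKET,
            Key=dest_key,
            CopySource={"Bucket": src_bucket, "Key": src_key},
        )
    return {"statusCode": 200, "body": json.dumps({"copied": len(records)})}
```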
Advanced Data Modelling
- Design and maintain logical, physical, ER, and dimensional data models for GCP BigQuery and Teradata environments.
- Develop scalable modelling patterns including Star/Snowflake schemas, 3NF, and Data Vault modelling.
- Optimize models using partitioning, clustering, indexing, and incremental load techniques (see the BigQuery sketch after this list).
- Build subject‑area layers and semantic datasets for analytics and operational reporting.
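A minimal sketch of the partitioning and clustering point above, using the google-cloud-bigquery client; the project, dataset, table, and column names are placeholder assumptions.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials
table_id = "example-project.analytics.fact_orders"  # hypothetical project.dataset.table

schema = [
    bigquery.SchemaField("order_id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("customer_id", "STRING"),
    bigquery.SchemaField("order_date", "DATE"),
    bigquery.SchemaField("amount", "NUMERIC"),
]

table = bigquery.Table(table_id, schema=schema)
# Partition on the date column and cluster on the common filter key to limit scan costs.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="order_date",
)
table.clustering_fields = ["customer_id"]

table = client.create_table(table, exists_ok=True)
print(f"Created {table.full_table_id}, partitioned on order_date, clustered on customer_id")
```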
System-Level Data Sourcing & Integration
- Collaborate with upstream and enterprise system owners to understand source structures, API specs, event models, data contracts, and integration constraints.
- Develop end‑to‑end data sourcing strategies ensuring data freshness, quality, lineage traceability, and latency SLAs.
- Implement integrations across multi‑cloud, on-prem, and heterogeneous data systems.
- Implement CDC (Change Data Capture) patterns, event-driven ingestion, and real-time streaming where applicable; a high-watermark extraction sketch follows this list.
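One common way to realise the CDC and incremental-sourcing bullet above is a high-watermark pull; the sketch below assumes a hypothetical source_orders table with an updated_at column and a psycopg2-style DB-API connection, so the table, columns, and paramstyle are all assumptions.

```python
from datetime import datetime


def extract_changes(conn, last_watermark: datetime) -> tuple[list[tuple], datetime]:
    """Pull only rows changed since the last successful run (high-watermark CDC)."""
    cursor = conn.cursor()
    cursor.execute(
        "SELECT order_id, status, amount, updated_at "
        "FROM source_orders WHERE updated_at > %s ORDER BY updated_at",
        (last_watermark,),
    )
    rows = cursor.fetchall()
    # Advance the watermark to the newest change seen; persist it between runs
    # (e.g. in a control table) so each execution stays incremental.
    new_watermark = rows[-1][3] if rows else last_watermark
    return rows, new_watermark
```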
Data Transformation & Warehouse Delivery
- Deliver complete warehouse solutions, from landing data in RAW zones to building highly curated, trusted datasets.
- Build incremental transformations adhering to best-practice ELT patterns.
- Implement SCD Type 1/Type 2, surrogate key logic, and full historical tracking as needed (a Type 2 sketch follows this list).
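The SCD Type 2 bullet above would normally be implemented as a DBT snapshot or warehouse merge, but as a hedged, in-memory illustration of the close-out-and-insert logic, the Python sketch below uses assumed column names (valid_from, valid_to, is_current).

```python
from datetime import date


def apply_scd2(dimension: list[dict], incoming: dict, business_key: str, load_date: date) -> None:
    """Expire the current row for a business key if attributes changed, then insert a new version."""
    tracked = [col for col in incoming if col != business_key]
    current = next(
        (row for row in dimension
         if row[business_key] == incoming[business_key] and row["is_current"]),
        None,
    )
    if current and all(current[col] == incoming[col] for col in tracked):
        return  # no attribute change: the current version stays open
    if current:
        # Close out the previous version to preserve full history (Type 2).
        current["valid_to"] = load_date
        current["is_current"] = False
    dimension.append({
        **incoming,
        "surrogate_key": len(dimension) + 1,  # simplistic surrogate key for illustration only
        "valid_from": load_date,
        "valid_to": None,
        "is_current": True,
    })
```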
CI/CD, Automation & Version Control
- Implement CI/CD pipelines using Codefresh or equivalent tools (GitHub Actions, GitLab CI/CD, Jenkins).
- Use Git for version control, branching, pull requests, and team collaboration.
- Automate deployments of data pipelines, models, and infrastructure using Terraform and cloud-native tools; an example CI helper script follows this list.
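As a hedged example of the automation bullet, here is a small Python wrapper that a Codefresh, GitHub Actions, or GitLab CI step could call to run DBT builds and a Terraform plan; the exact commands, targets, and directory layout are assumptions.

```python
import subprocess
import sys

# Steps a CI job might run before promoting a data-pipeline change; adjust to the repo layout.
STEPS = [
    ["dbt", "deps"],
    ["dbt", "build", "--target", "ci"],                      # models + tests in one pass
    ["terraform", "-chdir=infra", "init", "-input=false"],
    ["terraform", "-chdir=infra", "plan", "-input=false"],
]


def main() -> int:
    for cmd in STEPS:
        print("+ " + " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("Step failed: " + " ".join(cmd), file=sys.stderr)
            return result.returncode
    return 0


if __name__ == "__main__":
    sys.exit(main())
```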
Data Governance & Quality
- Implement and maintain metadata, business glossary, data dictionaries, and reference data frameworks.
- Ensure strong data quality through validation rules, unit tests, and monitoring dashboards (see the validation sketch after this list).
- Align all deliverables with enterprise data governance, auditability, and compliance guidelines.
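A minimal sketch of the validation-rule bullet above: simple not-null and uniqueness checks whose results could feed a monitoring dashboard; in practice these would usually be DBT tests, and the column names here are placeholders.

```python
def check_not_null(rows: list[dict], column: str) -> bool:
    """Pass only if every row has a non-empty value in the given column."""
    return all(row.get(column) not in (None, "") for row in rows)


def check_unique(rows: list[dict], column: str) -> bool:
    """Pass only if the column contains no duplicate values."""
    values = [row.get(column) for row in rows]
    return len(values) == len(set(values))


def run_quality_checks(rows: list[dict]) -> dict[str, bool]:
    """Run the rule set and return a pass/fail result per rule for monitoring."""
    return {
        "order_id_not_null": check_not_null(rows, "order_id"),
        "order_id_unique": check_unique(rows, "order_id"),
        "amount_not_null": check_not_null(rows, "amount"),
    }
```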
Required Skills
- 8+ years of strong data engineering experience, with deep expertise in DBT, Python, SQL, and ELT/ETL patterns.
- Proven experience in AWS and GCP cloud ecosystems.
- Strong knowledge of data modelling, system integrations, and warehouse architecture.
- Hands-on experience with Terraform, serverless compute, orchestration tools, and version control.
- Excellent debugging skills and strong understanding of data quality, lineage, and large-scale data processing.
- Graduation: B.E/B.Tech
Preferred Skills
- Experience with BigQuery, Teradata, Snowflake, Kafka, Airflow, or microservices‑based ingestion.
- Experience designing end‑to‑end enterprise data platforms with observability and alerting.