Kafka Connect Data Engineer
Job Description
Work Mode
Remote
Experience Required
5 Years
Budget
₹70,000 per month
Job Summary
We are looking for a skilled Kafka Connect Data Engineer with strong expertise in real-time data ingestion frameworks, cloud storage integrations, and modern data lake architectures.
The ideal candidate has hands-on experience with Apache Kafka, Kafka Connect, Debezium, and cloud-based ingestion pipelines. The role involves designing scalable Bronze Layer architectures, building reliable ingestion workflows, and tuning data processing systems for performance and cost efficiency.
Key Responsibilities
Data Ingestion & Pipeline Development
Design, develop, and maintain ingestion pipelines using:
Apache Kafka
Kafka Connect
Debezium
Build real-time and batch ingestion pipelines from:
MySQL
PostgreSQL
Integrate data into:
AWS S3
Google Cloud Storage (GCS)
BigQuery
Develop scalable Bronze Layer data architectures
Implement schema evolution and partitioning strategies
Platform Management & Optimization
Manage and monitor Kafka Connect clusters and connectors
Optimize ingestion pipelines for:
Scalability
Reliability
Performance
Cost efficiency
Ensure pipelines meet data quality and observability standards
Automation & Collaboration
Create automation scripts using Python
Work closely with Data Platform and Analytics teams
Analyze source systems and define downstream data requirements
Explore and evaluate emerging data lake technologies
Required Skills
Strong hands-on experience with:
Apache Kafka
Kafka Connect
Debezium (CDC)
Expertise in:
MySQL
PostgreSQL
Experience with cloud storage and data warehouse systems:
AWS S3
Google Cloud Storage (GCS)
BigQuery
Strong Python programming and automation skills
Understanding of:
Data Lake Architectures
ETL / ELT Pipelines
Data Quality
Observability
Good to Have Skills
Experience with:
Docker
Kubernetes
Airflow
Dagster
Knowledge of:
Terraform
CloudFormation
Exposure to:
Apache Hudi
Apache Iceberg
Preferred Candidate Profile
Strong analytical and troubleshooting skills
Experience handling enterprise-scale data ingestion systems
Ability to work independently in remote environments
Good collaboration and communication skills