Position Name - Senior ETL Developer
Experience - 6+ years
No. of positions - 1
Location - Coimbatore
Position Purpose
We are looking for an experienced Senior ETL Developer with strong expertise in Apache Airflow, Amazon Redshift, and SQL-based data pipelines, with an upcoming transition to Snowflake. This is a contract role based in Coimbatore, ideal for professionals who can independently deliver high-quality ETL solutions in a cloud-native, fast-paced environment.
Candidate Specification
- 6+ years of hands-on experience in ETL development.
- Proven experience with Apache Airflow, Amazon Redshift, and strong SQL.
- Strong understanding of data warehousing concepts and cloud-based data ecosystems.
- Familiarity with handling flat files, APIs, and external sources.
- Experience with job orchestration, error handling, and scalable transformation patterns.
- Ability to work independently and meet deadlines.
- Exposure to Snowflake, or readiness to support a migration to Snowflake platforms.
- Experience in healthcare, life sciences, or regulated environments is a plus.
- Familiarity with Azure Data Factory, Power BI, or other cloud BI tools.
- Knowledge of Git, Azure DevOps, or other version control and CI/CD platforms.
Roles and Responsibilities
1. ETL Design and Development:
- Design and develop scalable and modular ETL pipelines using Apache Airflow, with orchestration and monitoring capabilities (see the sketch after this list).
- Translate business requirements into robust data transformation pipelines across cloud data platforms.
- Develop reusable ETL components to support a configuration-driven architecture.
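For illustration, a minimal sketch of the kind of configuration-driven Airflow DAG this implies, assuming Airflow 2.x; the source names, callables, and schedule are hypothetical, not part of any actual codebase:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical source registry: each entry drives one extract/transform pair,
# so onboarding a new source is a config change rather than new DAG code.
SOURCES = ["claims_flat_file", "members_api", "providers_db"]


def extract(source_name: str, **_):
    """Placeholder extract step; real logic would pull from the named source."""
    print(f"extracting {source_name}")


def transform(source_name: str, **_):
    """Placeholder transform step: cleansing, standardization, dedup."""
    print(f"transforming {source_name}")


with DAG(
    dag_id="config_driven_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    for name in SOURCES:
        extract_task = PythonOperator(
            task_id=f"extract_{name}",
            python_callable=extract,
            op_kwargs={"source_name": name},
        )
        transform_task = PythonOperator(
            task_id=f"transform_{name}",
            python_callable=transform,
            op_kwargs={"source_name": name},
        )
        extract_task >> transform_task
```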
2. Data Integration and Transformation:
- Integrate data from multiple sources: Redshift, flat files, APIs, Excel, and relational databases.
- Implement transformation logic such as cleansing, standardization, enrichment, and deduplication.
- Manage incremental and full loads, along with slowly changing dimension (SCD) handling strategies (a Type 2 sketch follows).
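A common way to handle the SCD part is a two-step Type 2 load: close out changed current rows, then insert new versions. A minimal Snowflake-flavored sketch, with entirely hypothetical table and column names (dim_customer, stg_customer, attr_hash):

```python
# Hypothetical two-step SCD Type 2 load (Snowflake-flavored SQL); all table
# and column names are illustrative.

# Step 1: close out current rows whose attributes have changed.
CLOSE_OUT_SQL = """
UPDATE dim_customer
SET is_current = FALSE,
    valid_to   = CURRENT_TIMESTAMP()
FROM stg_customer src
WHERE dim_customer.customer_id = src.customer_id
  AND dim_customer.is_current = TRUE
  AND dim_customer.attr_hash <> src.attr_hash;
"""

# Step 2: insert new versions for changed keys plus brand-new keys
# (anything in staging with no surviving current row).
INSERT_NEW_SQL = """
INSERT INTO dim_customer
    (customer_id, name, segment, attr_hash, valid_from, valid_to, is_current)
SELECT src.customer_id, src.name, src.segment, src.attr_hash,
       CURRENT_TIMESTAMP(), NULL, TRUE
FROM stg_customer src
LEFT JOIN dim_customer cur
       ON cur.customer_id = src.customer_id
      AND cur.is_current = TRUE
WHERE cur.customer_id IS NULL;
"""
```

A single MERGE can replace part of this, but a matched-and-changed row needs both a close-out and a new insert, so Type 2 loads generally still require two passes.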
3. SQL and Database Development:
- Write performant SQL queries for data staging and transformation within Redshift and Snowflake.
- Utilize joins, window functions, and aggregations effectively (see the query sketch after this list).
- Apply query tuning and warehouse-appropriate physical design for high-performance workloads (Redshift and Snowflake use distribution, sort, and clustering keys rather than conventional indexes).
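As one example of the window-function work this involves, a staging-layer deduplication query that keeps only the latest record per business key; the table and columns are made up, and the SQL is written to run on both Redshift and Snowflake:

```python
# Illustrative dedup query: rank rows per business key by recency and keep
# only the newest before loading downstream tables.
DEDUP_SQL = """
SELECT *
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY updated_at DESC
           ) AS rn
    FROM stg_customer AS s
) ranked
WHERE rn = 1;
"""
```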
4. Performance Tuning:
- Implement best practices in distributed data processing and cloud-native optimizations (a DDL sketch follows this list).
- Tune SQL queries and monitor execution plans.
- Optimize data pipelines and orchestrations for large-scale data volumes.
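On Redshift, much of this optimization is physical table design rather than indexing. A hypothetical DDL sketch (the table and key choices are illustrative; the analogous Snowflake concern would be clustering keys):

```python
# Hypothetical Redshift DDL: distribution and sort keys chosen to match the
# dominant join and filter patterns rather than relying on default EVEN
# distribution.
CREATE_FACT_SQL = """
CREATE TABLE fact_claims (
    claim_id     BIGINT,
    member_id    BIGINT,
    service_date DATE,
    amount       DECIMAL(12, 2)
)
DISTKEY (member_id)       -- co-locate rows with the member dimension for joins
SORTKEY (service_date);   -- enable block pruning on date-range filters
"""
```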
5. Error Handling and Logging:
- Implement robust error handling and logging in Airflow DAGs.
- Enable retry logic, alerting mechanisms, and failure notifications.
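A minimal sketch of what this looks like in practice, assuming Airflow 2.x; the callback body is a stand-in for a real alerting integration such as Slack or email:

```python
from datetime import timedelta


def notify_failure(context):
    """Hypothetical failure callback: in practice this might post to Slack or
    email; here it just logs which task instance failed."""
    ti = context["task_instance"]
    print(f"Task {ti.task_id} in DAG {ti.dag_id} failed: {ti.log_url}")


# Shared default_args for DAGs: retry transient failures with backoff, then
# alert once retries are exhausted.
default_args = {
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "retry_exponential_backoff": True,
    "on_failure_callback": notify_failure,
}
```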
6. Testing and Quality Assurance:
- Conduct unit and integration testing of ETL jobs (see the test sketch after this list).
- Validate data outputs against business rules and source systems.
- Support QA during UAT cycles and help resolve data defects.
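Because Airflow DAG files are hard to unit test directly, transformation logic is typically factored into plain functions that can be tested in CI. A hypothetical pytest sketch (the function and data are made up, and pandas is assumed only for the example):

```python
import pandas as pd


def deduplicate_latest(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the most recent row per customer_id (mirrors the SQL dedup)."""
    return (
        df.sort_values("updated_at")
        .drop_duplicates("customer_id", keep="last")
        .reset_index(drop=True)
    )


def test_deduplicate_latest_keeps_newest_row():
    df = pd.DataFrame(
        {
            "customer_id": [1, 1, 2],
            "updated_at": ["2024-01-01", "2024-02-01", "2024-01-15"],
            "name": ["old", "new", "solo"],
        }
    )
    out = deduplicate_latest(df).sort_values("customer_id")
    assert list(out["name"]) == ["new", "solo"]
```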
7. Deployment and Scheduling:
- Deploy pipelines using Git-based CI/CD practices.
- Schedule and monitor DAGs using Apache Airflow and integrated tools.
- Troubleshoot failures and ensure data pipeline reliability.
8. Documentation and Maintenance:
- Document data flows, DAG configurations, transformation logic, and operational procedures.
- Maintain change logs and update job dependency charts.
9. Collaboration and Communication:
- Work closely with data architects, analysts, and BI teams to define and fulfill data needs.
- Participate in stand-ups, sprint planning, and post-deployment reviews.
10. Compliance and Best Practices:
- Ensure ETL processes adhere to data security, governance, and privacy regulations (HIPAA, GDPR, etc.).
- Follow naming conventions, version control standards, and deployment protocols.