ETL Test Engineer
Validate large-scale data pipelines, ETL processes, and Big Data ecosystems — hands-on with Apache NiFi, Airflow, Databricks, Spark and SQL-based data validation.
Job Summary
We are seeking a highly skilled ETL Test Engineer with strong experience in validating large-scale data pipelines, ETL processes, and Big Data ecosystems. The ideal candidate has hands-on expertise in Apache NiFi, Apache Airflow, Databricks, Spark, and SQL-based data validation, along with a strong understanding of data warehousing and data lake concepts.
The role involves validating complex data transformation workflows, ensuring data quality and accuracy, and collaborating with cross-functional teams in an Agile environment.
Key Responsibilities
- Perform ETL testing to validate data extraction, transformation, and loading processes across multiple source and target systems.
- Validate Big Data pipelines handling structured and unstructured data in large-scale environments.
- Design, develop, review, and maintain test cases and test scenarios based on business and technical requirements.
- Perform data validation and reconciliation between source systems and target systems such as Data Lakes and Data Warehouses.
- Test and validate Apache NiFi workflows including data ingestion, routing, transformation, and error handling.
- Test Apache Airflow DAGs to verify correct scheduling, task dependencies, and retry mechanisms.
- Perform Databricks testing, including validation of Spark jobs (batch and incremental), notebooks, Delta tables, transformations, and business rules.
- Develop and execute automated ETL test scripts for Big Data pipelines wherever applicable.
- Execute SQL and Spark SQL queries for data quality validation, row count checks, aggregation validation, and business rule verification.
- Identify, log, track, and manage defects using defect management tools and collaborate with development teams for resolution.
- Participate in requirement analysis, test planning, and test strategy discussions.
- Validate performance, scalability, and reliability of data pipelines.
- Support regression testing and production data validation activities.
Required Skills & Qualifications
- Strong hands-on experience in ETL Testing / Data Testing.
- Experience with Big Data technologies such as HDFS, Hive, and Spark.
- Hands-on experience testing Apache NiFi data flows.
- Experience validating Apache Airflow workflows and DAGs.
- Strong experience in Databricks testing including Spark jobs, Delta Lake, and notebooks.
- Strong proficiency in SQL and Spark SQL.
- Experience in test case design, execution, and documentation.
- Good understanding of Data Warehousing concepts, Data Lakes, and Dimensional Modeling.
- Experience with data validation techniques including row count validation, checksums, and data profiling.
- Familiarity with automation frameworks for ETL / Big Data testing.
- Experience using defect tracking tools such as Jira.
- Good understanding of SDLC, STLC, and Agile methodologies.
Good to Have
- Experience with cloud platforms such as AWS, Azure, or GCP.
- Knowledge of CI/CD pipelines for data platforms.
- Experience with Python or Scala for test automation.
- Exposure to Kafka or streaming data testing.
Preferred Candidate Profile
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration abilities.
- Ability to work independently in a fast-paced environment.
- Strong attention to detail and quality-focused mindset.