31 Jan
ZORTech Solutions
Old Toronto
Big Data ETL Tester (Python / PySpark)
Location: Toronto, ON (Onsite/Hybrid)
Experience: 8+ years
Employment Type: Contract
Job Summary:
We are seeking an experienced Big Data ETL Tester with expertise in Python / PySpark to support the large-scale conversion of a Big Data (Hadoop/Cloudera) platform to Python / PySpark in the cloud (AWS, Azure). The ideal candidate will have strong experience in ETL testing, data validation, test automation, and performance testing for cloud-based big data transformations.
Key Responsibilities:
1. Validate data migration from Hadoop/Cloudera to Python / PySpark (Databricks, Snowflake, etc.) on AWS / Azure Cloud.
2. Develop and execute test cases for ETL pipelines, data transformations, and data quality checks.
3. Perform data reconciliation to ensure data accuracy and integrity after migration.
4. Automate test scripts using Python / PySpark to validate large datasets.
5. Conduct performance testing of ETL jobs and optimize test execution.
6. Collaborate with developers, data engineers, and business analysts to define testing strategies.
7. Troubleshoot and resolve data inconsistencies, schema mismatches, and performance issues.
8. Ensure compliance with data governance, security policies, and regulatory requirements.
Preferred Skills:
1. Experience with Spark SQL, Hive, Kafka, Airflow
2. Hands-on knowledge of data warehousing concepts and data lakes
3. Familiarity with CI/CD pipelines for test automation
4. Knowledge of BI reporting tools (Tableau, Power BI)