Job Description: Test Data Architect
This job posting is for a senior-level manager who specializes in automating the movement and transformation of data (ETL) within a banking environment.
We are seeking a highly skilled and self-driven Automation Manager to oversee and own the design, build, and deployment of scalable ETL pipelines across hybrid environments including Cloudera Hadoop, Red Hat OpenShift, and AWS Cloud. This role focuses on developing robust PySpark-based data processing solutions, building testing frameworks for ETL jobs, and leveraging containerization and orchestration platforms like Docker and AWS EKS for scalable workloads.
You will be responsible for automating ETL processes, integrating with data lakes and data warehouses, managing large datasets efficiently, and ensuring reliable data delivery through CI/CD-enabled workflows.
What You'll Do (Developer Focus):
What You'll Do (Lead Focus):
Skillset:
• 12-15 years of experience in automation testing across UI, data analytics, and BI reports in the financial services industry, particularly with knowledge of regulatory compliance and risk management
• Extensive knowledge of developing and maintaining automation frameworks and AI/ML-related solutions
• Detailed knowledge of data flows in relational databases and big data platforms, including familiarity with Hadoop (a platform for processing massive datasets)
• Selenium and BDD with Cucumber, using Java and Python
• Strong experience with PySpark for batch and stream processing, including deploying PySpark workloads to AWS EKS (Kubernetes)
• Proficiency with the Cloudera Hadoop ecosystem (HDFS, Hive, YARN)
• Hands-on experience with ETL automation and validation frameworks
• Strong knowledge of Oracle SQL and HiveQL
• Familiarity with Red Hat OpenShift for container-based deployments
• Proficient in creating Dockerfiles and managing container lifecycle
• Solid understanding of AWS services like S3, Lambda, EKS, and IAM, along with Airflow
• Experience with Airflow DAGs to orchestrate ETL jobs
• Familiarity with CI/CD tools (e.g., Jenkins, GitLab CI)
• Scripting knowledge in Bash, Python, and YAML
• Version control: Git, Bitbucket, GitHub
• Experience automating BI report validation, e.g., Tableau dashboards and views
• Hands-on experience in Python developing data analysis utilities using Pandas, NumPy, etc.
• Experience with mobile testing using Perfecto, API testing (SoapUI, Postman, Rest Assured), and SAS tools will be an added advantage
• Strong problem-solving and debugging skills
• Excellent communication and collaboration abilities to lead and mentor a large techno-functional team across different geographical locations
• Strong acumen and great presentation skills
• Able to work in an Agile environment and deliver results independently
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Technology Quality
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.