Responsible for ensuring that data pipelines are deployed successfully into the production environment
Deploy applications to the AWS cloud, leveraging the full spectrum of AWS cloud services
Automate data pipeline testing and deployment using CI/CD
Requirements
5 years of relevant work experience deploying data pipelines in a production environment
Experience working in a multi-disciplinary team of machine learning engineers, data engineers, software engineers, product managers and subject-matter experts
High proficiency in SQL and relational databases
High proficiency in at least one data pipeline orchestration tool (e.g. Airflow, Dagster)
High proficiency in Python and related data libraries (e.g. Pandas)
Experience with Docker
Experience with CI/CD tools like Jenkins
Experience working in an Agile environment
Experience with AWS cloud services like RDS, EKS, EMR, and Redshift
Experience with Snowflake
Experience deploying microservices with Flask, preferably FastAPI
Experience with big data tools such as Spark, Hadoop, and Kafka
Experience with SQLAlchemy and Alembic libraries is a plus