RESPONSIBILITIES
1. Data Architecture and Design:
Lead the design and implementation of scalable, efficient, and robust data architectures to meet business needs and analytical requirements.
Collaborate with stakeholders to understand data requirements, build subject matter expertise, and define optimal data models and structures.
2. Data Pipeline Development and Optimization:
Design and develop data pipelines, ETL processes, and data integration solutions for ingesting, processing, and transforming large volumes of structured and unstructured data.
Optimize data pipelines for performance, reliability, and scalability.
3. Database Management and Optimization:
Oversee the management and maintenance of databases, data warehouses, and data lakes to ensure high performance, data integrity, and security.
Implement and manage ETL processes for efficient data loading and retrieval.
4. Data Quality and Governance:
Establish and enforce data quality standards, validation rules, and data governance practices to ensure data accuracy, consistency, and compliance with regulations.
Drive initiatives to improve data quality and documentation of data assets.
5. Mentorship and Leadership:
Provide technical leadership and mentorship to junior team members, assisting in their skill development and growth.
Lead and participate in code reviews, ensuring best practices and high-quality code.
6. Collaboration and Stakeholder Management:
Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand their data needs and deliver solutions that meet them.
Communicate effectively with non-technical stakeholders to translate technical concepts into actionable insights and business value.
7. Performance Monitoring and Optimization:
Implement monitoring systems and practices to track data pipeline performance, identify bottlenecks, and optimize for improved efficiency and scalability.
8. Common Software Engineering Requirements:
You actively contribute to the end-to-end delivery of complex software applications, ensuring adherence to best practices and high overall quality standards.
You have a strong understanding of a business or system domain, with sufficient knowledge and expertise in the relevant metrics and trends. You collaborate closely with product managers, designers, and fellow engineers to understand business needs and translate them into effective software solutions.
You provide technical leadership and expertise, guiding the team in making sound architectural decisions and solving challenging technical problems. Your solutions anticipate scale, reliability, monitoring, integration, and extensibility.
You conduct code reviews and provide constructive feedback to ensure code quality, performance, and maintainability. You mentor and coach junior engineers, fostering a culture of continuous learning, growth, and technical excellence within the team.
You play a significant role in the ongoing evolution and refinement of current tools and applications used by the team, and drive adoption of new practices within your team.
You take ownership of customer issues, including initial troubleshooting, root-cause identification, and escalation or resolution, while maintaining the overall reliability and performance of our systems.
You set the benchmark for responsiveness, ownership, and overall accountability of engineering systems.
You independently drive and lead multiple features, contribute to large projects, and lead smaller ones. You can orchestrate work that spans multiple engineers within your team and keep all relevant stakeholders informed. You keep your lead/EM up to date on your work and that of the team so they can share it with stakeholders, including escalating issues where needed.
REQUIREMENTS
Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
5+ years of experience in data engineering, with a focus on data architecture, ETL, and database management.
Proficiency in programming languages such as Python/PySpark and Java/Scala.
Expertise in big data technologies such as Hadoop, Spark, Kafka, etc.
In-depth knowledge of SQL and experience with various database technologies (e.g., PostgreSQL, MySQL, NoSQL databases).
Experience and expertise in building complex end-to-end data pipelines.
Experience with orchestration and job scheduling using CI/CD and workflow tools such as Jenkins and Airflow.
Ability to work in an Agile environment (Scrum, Lean, Kanban, etc.).
Ability to mentor junior team members.
Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and their data services (e.g., AWS Redshift, S3, Azure SQL Data Warehouse).
Strong leadership, problem-solving, and decision-making skills.
Excellent communication and collaboration abilities.