, you will be a foundational architect, responsible for designing, building, and maintaining the robust, scalable, secure, and high-performance data infrastructure that powers our AI/ML initiatives. You will work closely with Data Scientists, AI Engineers, AIOps/MLOps Engineers, and Software Engineers to ensure the seamless flow of data from diverse government sources into our AI/ML pipelines, enabling the development and deployment of impactful intelligent applications across Abu Dhabi.
What You'll Do:
* Design, build, and maintain scalable and reliable data pipelines for ingesting, transforming, and storing large and complex datasets from various government sources (structured, semi-structured, and unstructured).
* Architect and implement data warehousing and data lake solutions on cloud platforms (primarily Azure) to provide a centralized and accessible data foundation for AI/ML and analytics.
* Develop and optimize ETL/ELT processes using industry-standard tools and technologies (e.g., Azure Data Factory, Databricks, Kafka) to ensure data quality, integrity, and efficient processing.
* Implement and manage data governance policies and procedures, ensuring data security, compliance with government regulations, and adherence to data quality standards.
* Collaborate with Data Scientists and AI Engineers to understand their data requirements and build data pipelines that meet their specific needs for model development and evaluation.
* Work closely with AIOps/MLOps Engineers to ensure seamless integration of data pipelines with AI/ML workflows and deployment processes.
* Design and implement data monitoring and alerting systems to proactively identify and resolve data quality issues and pipeline failures.
* Optimize data storage and retrieval systems for performance and cost-efficiency, leveraging appropriate database technologies (relational, NoSQL, vector, graph).
* Implement data security measures, including access control, encryption, and data masking, to protect sensitive government data.
* Develop and maintain data catalogs and data lineage solutions to ensure data discoverability, understand data flow, and facilitate data governance.
* Evaluate and adopt new data engineering tools and technologies to continuously improve our data infrastructure and processes.
* Troubleshoot and resolve data-related issues across the data lifecycle, ensuring data availability and reliability for AI/ML and analytics.
* Document data pipelines, data models, and data engineering processes clearly and concisely for both technical and non-technical stakeholders.
What You'll Bring:
* Bachelor's or Master's degree in Computer Science, Data Engineering, or a related technical field with a strong focus on data management and infrastructure.
* At least 3 years of hands-on experience in designing, building, and maintaining data pipelines and data infrastructure in a production environment.
* Deep understanding of data warehousing concepts, data modeling techniques (e.g., Kimball, Inmon), and data lake architectures.
* Proven experience with cloud platforms, especially Microsoft Azure, and their data engineering services (e.g., Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage, Azure Databricks).
* Extensive experience with ETL/ELT tools and technologies (e.g., Apache Spark, Apache Kafka, SQL).
* Strong proficiency in SQL and experience with various database systems, including:
  * Vector Databases (e.g., Milvus, Pinecone, ChromaDB - understanding their data ingestion and retrieval aspects).
  * Graph Databases (e.g., Neo4j, ArangoDB - understanding their data modeling and querying aspects).
* Strong scripting and programming skills in languages such as Python or Scala for data processing and automation.
* Experience with data governance frameworks, data quality management, and data security principles.
* Experience with data monitoring and alerting tools.
* Excellent problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues.
* Strong collaboration and communication skills, with the ability to work effectively with Data Scientists, AI Engineers, AIOps/MLOps Engineers, and other stakeholders.
* A proactive and detail-oriented approach with a strong focus on building robust and reliable data infrastructure.
Bonus Points For:
* Experience with data virtualization technologies.
* Experience with building real-time data pipelines.
* Knowledge of data lineage and data catalog tools (e.g., Azure Purview).
* Experience with implementing data mesh concepts.
* Familiarity with infrastructure-as-code (IaC) tools (e.g., Terraform, ARM templates) for data infrastructure provisioning.
* Certifications in relevant cloud platforms or data engineering technologies.
* Experience working with government data regulations and compliance standards.
Job Type: Full-time
Pay: From AED20,000.00 per month
Ability to commute/relocate:
Abu Dhabi: Reliably commute or planning to relocate before starting work (Required)
Experience:
* Data pipelines and data infrastructure in a production environment: 5 years (Required)
* ETL/ELT tools and technologies (Apache Spark, Apache Kafka, SQL): 5 years (Required)
* AI/ML development: 3 years (Required)
License/Certification:
* Emirates ID (Required)