10+ years of hands-on experience developing and applying data-driven solutions in a corporate or consulting setting
An MSc in machine learning, computer engineering, computer science, or a related field
Experience working with Big Data technologies such as Hadoop, Spark, Kafka
Strong experience in SQL (e.g. MS SQL, MySQL)
Delivery of Data Lake / Big Data projects (including data ingestion, machine learning model application, code deployment)
Experience with Python, Cloud infrastructure (Azure, GCP) and DevOps (Docker, CI/CD)
Experience with cloud infrastructure services (Azure Synapse, Azure Data Factory, Azure Data Lake Storage, Databricks, Azure SQL DB, Azure SQL Data Warehouse, Azure Virtual Machines)
Experience with the Google stack (BigQuery, Google Analytics) and web development (JavaScript) is a plus
Team player with the ability to pull in relevant team members to address identified needs
Strong desire to learn and develop within a dynamic organization
Critical thinking and creative problem-solving skills, including the ability to structure and prioritise an approach for maximum impact
Project management skills (Agile)
Previous experience in real estate is a plus
Key Accountabilities & Tasks
Work with the Head of Data Management to ensure the integrity and consistency of data and that the underlying data infrastructure is up to standard.
Support the Head of Data Management with collating data for proposals and recommendations.
Understand the complexity of operations and of the data collected, and propose methods to close architecture gaps in data collection, aggregation, and exposure for different analytics purposes.
Work with vendors and consultants to understand their infrastructure tools and methodologies.
Integrate and maintain data sources into the Data Lake.
Create optimised Data Marts for specific business use cases that will feed the BI platform (Power BI).
Create the data models required by the business.
Work with IT Operations to maintain the data infrastructure (e.g. Data Lake)
Ensure all data pipelines are working and data lineage is accurate.
Ensure all SLAs for the data pipelines are met on time.
Deploy ML models in the production environment.
Ensure all new and existing queries within Data Lake pipelines are optimised for efficient performance.
Ensure that all automated reports (daily, weekly, and monthly) are up and running.
Perform detailed research on the latest ETL/ELT frameworks and languages that can support business requirements.
Design, build, and maintain communication between services using REST APIs.
Test and maintain data infrastructure by performing stress tests, UATs, etc.
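The pipeline-monitoring duties above (accurate lineage, SLAs met on time, automated reports up and running) often boil down to a freshness check like the following sketch. The pipeline names and SLA windows here are illustrative assumptions, not part of the role description:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical SLA registry: pipeline name -> maximum allowed staleness in hours.
SLA_HOURS = {"sales_ingest": 24, "web_events": 1}

def check_sla(last_success: dict, now: datetime = None) -> list:
    """Return the (sorted) names of pipelines whose latest successful
    run is missing or older than the agreed SLA window."""
    now = now or datetime.now(timezone.utc)
    breached = []
    for name, hours in SLA_HOURS.items():
        last = last_success.get(name)
        if last is None or now - last > timedelta(hours=hours):
            breached.append(name)
    return sorted(breached)
```

In practice a check like this would read run timestamps from the orchestrator's metadata store and page the on-call engineer rather than return a list, but the core comparison is the same.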
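Serving deployed ML models over REST, as the tasks above describe, usually comes down to a request handler that validates a JSON payload and returns a prediction. This is a minimal sketch: the payload shape and the averaging "model" are hypothetical placeholders for a real model loaded from a registry:

```python
import json

def handle_predict(body: bytes):
    """Hypothetical handler for a POST /predict endpoint: validate the
    JSON payload, apply a placeholder model, and return (status, response)."""
    try:
        payload = json.loads(body)
        features = payload["features"]
    except (ValueError, KeyError):
        return 400, {"error": "expected JSON body with a 'features' list"}
    if not isinstance(features, list) or not features:
        return 400, {"error": "'features' must be a non-empty list"}
    # Placeholder model: in production this would call model.predict(features).
    score = sum(features) / len(features)
    return 200, {"score": score}
```

Keeping the handler a pure function of the request body, as here, makes it easy to unit-test and stress-test before wiring it into a web framework behind the actual route.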