AI Performance Engineer

Description

Our client is a leading technology company specialising in the design and development of cutting-edge, customised server hardware solutions optimised for artificial intelligence and machine learning applications. Their mission is to empower businesses and researchers to accelerate their AI initiatives by providing them with high-performance, scalable, and energy-efficient hardware infrastructure.

As a rapidly growing company at the forefront of AI hardware innovation, they are constantly seeking talented and motivated individuals to join their team. They offer a dynamic and challenging work environment, with opportunities to make a significant impact on the future of AI technology.

This is an AI Performance Engineer role in which you will evaluate and improve the performance of various hardware platforms and software technologies.

Responsibilities

- Profile and enhance the performance of AI workloads across various hardware platforms and machine learning frameworks
- Identify appropriate workloads and micro-benchmarks to be used for performance analysis
- Follow industry trends in AI algorithms and models and evaluate them on the company's hardware
- Experiment with various model optimization techniques, e.g. quantization, pruning, and compression
- Diagnose and troubleshoot complex heterogeneous computing systems and software issues
- Collaborate with software and firmware teams to ensure that the system meets end-to-end application performance goals while maintaining ease and efficiency of software development

Requirements

- 5+ years of experience with C++ and Python programming
- Hands-on experience with the internals of deep learning frameworks (e.g. PyTorch, TensorFlow) and deep learning models
- General experience with the training and deployment of ML models
- Experience with distributed systems development, parallel programs, or distributed ML workloads
- Knowledge of best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning
- Deep understanding of CPU, GPU, or custom ASIC (NPU, TPU, etc.) architectures and of low-level optimization techniques, memory hierarchy, instruction scheduling, and performance trade-offs
- Experience with LLM development and/or deployment is a strong plus

The compensation package includes relocation tickets (incl. family), visas, and insurance.

Please contact Mano Caderamanpulle for a full discussion.