Overview

Associate Director, Operations – GPU Cloud

Date: 29 Nov 2024

Location:

Singapore, Singapore

Company:
Singtel Group

Be a Part of Something BIG!

 

Make an Impact by

 

Leading and managing the GPU Infrastructure-as-a-Service (IaaS) platform. This role oversees the GPU infrastructure, storage infrastructure and associated services, ensuring seamless integration and operation.

 

Infrastructure and Resource Management:

  • Manage the maintenance and operations of the liquid-cooled data centre that hosts the GPU cloud.
  • Optimize the GPU infrastructure and associated hardware.
  • Optimize resource allocation to meet the performance requirements of both data centre operations and cloud hardware operations, as well as cost-effectiveness goals.
  • Lead the operations team to ensure compliance with the SLA requirements of customers and the product.
  • Enhance system scalability and reliability through automation and continuous improvement.
  • Enforce industry-standard operational processes, with reference to standards such as ISO 27001 or equivalent, across data centre and cloud operations.

Operational Excellence:

  • Handle general incidents, including operations management and escalation management across the AI cloud product.
  • Develop and implement operational strategies to ensure the reliability and efficiency of our GPU Cloud infrastructure.
  • Collaborate with other departments to streamline processes, enhance customer experience, and meet service level agreements.
  • Support services and improve the lifecycle of GPU cloud hardware and the data centre environment, with monitoring, logging, and alerting across deployment, operation, and refinement.
  • Establish operations systems and processes (SOPs, EOPs, etc.) and manage daily operational issues.
  • Apply a strong operational management skill set, organising internal cross-functional teams and external vendors to ensure an efficient and resilient operations setup.

Team Management:

  • Build and lead a high-performing operations team to foster a culture of innovation, collaboration, and continuous improvement.
  • Set clear goals and objectives, mentor team members, and drive professional development initiatives.
  • Oversee resource management and allocation to optimize team productivity and effectively meet operational goals.

Security and Compliance:

  • Lead security incident management processes, focusing on identification, containment, and resolution of threats in the data centre environment and GPU cloud hardware.
  • Enforce best practices for security and compliance.
  • Stay abreast of industry security trends and implement measures to safeguard customer data and platform integrity.

 

Skills for Success

 

  • Proven track record of managing and escalating complex cloud and data centre infrastructure issues and leading operations teams.
  • Experience in liquid cooling operations is an advantage.
  • Strong understanding of hardware infrastructure operation, security, management, and best practices.
  • Excellent leadership, communication, and interpersonal skills, with the ability to lead cross-functional teams.
  • Proficiency in managing customer interactions and improving service delivery to enhance customer experience.
  • Experience in Linux and hypervisor administration for GPU infrastructure and cloud.
  • Strong skills in complex technical problem-solving, with a proactive approach to system operation and optimization.
  • Knowledge of storage technologies and experience in capacity planning, troubleshooting, and data protection.
  • Experience in GPU and GPU infrastructure management, including configuration, monitoring, and performance.

 

Rewards that Go Beyond

  • Flexible work arrangements
  • Full suite of health and wellness benefits 
  • Ongoing training and development programs 
  • Internal mobility opportunities

 

Your Career Growth Starts Here. Apply Now!

 


About Singtel

Headquartered in Singapore, Singtel has 140 years of operating experience and has played a pivotal role in the country’s development as a major communications hub. Optus, our subsidiary in Australia, is a leader in integrated telecommunications, constantly raising the bar in innovative products and services.

We are also strategically invested in leading companies in Asia and Africa, including Bharti Airtel (India, South Asia and Africa), Telkomsel (Indonesia), Globe Telecom (the Philippines) and Advanced Info Service (Thailand). We work closely with our associates, leveraging our scale in networks, customer reach and extensive operational experience to lead and shape the communications industry.

Together, the Group serves over 700 million mobile customers around the world. Singtel is one of the largest listed Singapore companies on the Singapore Exchange by market capitalisation.

The Group has a vast network of offices throughout Asia Pacific, Europe and the USA, and employs more than 23,000 staff worldwide.