Lead Cloud HPC DevOps Engineer

Job Description

  • Job Title: Lead Cloud HPC DevOps Engineer

Ansys is the global leader in engineering simulation, helping the world's most innovative companies deliver radically better products to their customers. By offering the best and broadest portfolio of engineering simulation software, Ansys helps companies solve the most complex design challenges and engineer products limited only by imagination.

Summary / Role Purpose

The Cloud HPC DevOps Engineer is responsible for the planning, development and operation of the Ansys internal cloud HPC resources. The engineer, in conjunction with other stakeholders, manages the demand pipeline, contributes to the case for change, then leads the engineering effort to enhance the platform. The engineer also leads delivery activities, aligning technical implementation and work practices to company standards.

The successful candidate will demonstrate a strong commitment to innovation, collaborative working practices and a passion for the core technology. You will have outstanding technical skills, a deep understanding of HPC paradigms and cloud platforms, along with DevOps working practices. You will be recognized as a subject matter expert and able to articulate best working practices to a highly technical user base. You will have extensive experience of operational support, IT best practice and governance.

Key Duties and Responsibilities

Lead the implementation of the technical strategy, with elements of design, build and operations
Provide technical architecture input into strategic designs
Deliver cloud HPC platforms that conform with the design
Provide ongoing operation support for cloud HPC platforms

Contribute to resource planning activities, providing detailed estimates for effort, platform and ongoing costs
Ensure automation, infrastructure-as-code and DevOps best practices are integrated in the design, delivery and operation of cloud HPC platforms
Take ownership of service issues, using sound judgement to propel issues to resolution
Work with peers and SMEs to create complete technical implementations
Engage peers to review your project artefacts, incorporate feedback and refine outputs accordingly
Develop and communicate task level detail for project plans
Prepare budgetary recommendations as needed
Coordinate and multi-task effectively across groups and functional teams, both inside and outside the organization
Mentor peers in best practice techniques, methods, standards and processes
Provide leadership and guidance to staff, fostering an environment that encourages employee participation, teamwork and communication

Minimum Education/Certification Requirements and Experience

Bachelor's degree in computer science, business, management disciplines, or related field
5 years of HPC operational experience
8 years of experience in technical roles
Extensive experience of traditional on-premises HPC platforms, including hardware, job schedulers, high performance networks, GPU compute, high performance shared storage platforms and remote visualization platforms
Schedulers: SLURM, UGE, LSF, PBS Pro and techniques/platforms relating to autoscaling
Storage: Lustre, BeeGFS, Ceph
Visualization: DCV, VNC, RGS, Citrix, VMware
Hardware: Dell, NVIDIA, Cray
Cloud: Azure, AWS, GCP, Oracle
DevOps: Git, ADO, VSC, Ansible, Terraform, CI/CD
