Optimising HPC Workflows: Three case studies from a Research Software Engineer’s perspective

Research output: Contribution to conference › Paper › peer-review

Abstract

Ada is the University of East Anglia’s (UEA’s) High Performance Computing (HPC) cluster. Ada uses the SLURM job scheduler and comprises over 400 CPU nodes and 28 GPU nodes, each running the CentOS 7 operating system. There are nearly 400 users of Ada, ranging from students at all levels of study through to postdoctoral researchers and faculty. The users span a range of academic disciplines and vary widely in their computing proficiency.

The UEA’s HPC service has recently appointed a Research Software Engineer (RSE) to address the challenge of optimising users’ jobs to make more efficient use of the available resources. In this talk, I will present three case studies of users requesting assistance to optimise their computational workflows on Ada. I will describe how their requests were presented and which solutions were considered and selected, and I will quantify the speed improvements obtained. The solutions explored range from identifying simple idiosyncrasies due to software versioning through to utilising GPU technology more effectively.

I will show that impressive performance improvements were obtained, despite the chosen solutions being common and relatively simple to implement. I will motivate the need for HPC services to focus on optimising user workflows by discussing a range of direct and indirect benefits of this approach, such as reducing power consumption and HPC expenditure, and increasing user productivity and satisfaction.
Original language: English
Publication status: Published - 1 Dec 2022
Event: Computing Insight UK 2022: Sustainable HPC - Manchester Central Convention Complex, Manchester, United Kingdom
Duration: 1 Dec 2022 to 2 Dec 2022
https://www.scd.stfc.ac.uk/Pages/CIUK2022.aspx

Conference

Conference: Computing Insight UK 2022
Abbreviated title: CIUK
Country/Territory: United Kingdom
City: Manchester
Period: 1/12/22 to 2/12/22