Dynamic Resource Management for Next-Gen HPC Applications and Architectures

The current static usage model of HPC systems is becoming increasingly inefficient. This is driven by the continuously growing complexity and heterogeneity of system architectures, combined with the increased use of coupled applications, the need for strong scaling with extreme-scale parallelism, and the increasing reliance on complex and dynamic workflows. As a consequence, we see a rise in research on malleable systems, middleware, and applications, which can adjust resource usage dynamically in order to maximize efficiency. By providing intelligent global coordination of resource usage, through runtime scheduling of computation, network usage, and I/O across all components of the system architecture, malleable HPC systems can maximize the exploitation of their resources while at the same time minimizing the makespan of applications in many, if not most, cases. Such malleable systems, however, face a series of fundamental research challenges, including: Who initiates changes in resource availability or usage? How are those changes communicated? How can the optimal usage be computed? How can applications cope with dynamically changing resources? What should malleable programming models and abstractions look like? How should resource management frameworks for malleable systems be designed? Which resources benefit from malleability, and which (if any) should still be managed statically? This lecture will present the state of the art in malleability for HPC systems and will address the questions above along with some possible solutions to them.

Speaker

Jesus Carretero Perez – Universidad Carlos III de Madrid

Jesus Carretero is a Full Professor of Computer Architecture and Technology at Universidad Carlos III de Madrid (Spain) and leader of the Computer Architecture Research Group (ARCOS). His research activity is centered on high-performance computing systems, large-scale distributed systems, data-intensive computing, IoT, and real-time systems. Prof. Carretero is Associate Editor of the journals ACM Computing Surveys, Future Generation Computer Systems, and Transactions on Parallel and Distributed Systems. He has published more than 300 papers in journals and international conferences, has edited several books of proceedings, and has served as guest editor for special issues of journals. He has coordinated several EU and Spanish projects related to HPC. He has participated in many conference organization committees, and he has been General Chair of EUROPAR 2024, CCGRID 2017, IC3PP 2016, and HPCC 2011, Program Chair of ISPA 2012 and EuroMPI 2013, and Applications Track Chair of SC22.

Event Timeslots (1)

Wed 18 – Programming Models & Tools
J. Carretero Perez (Universidad Carlos III de Madrid)