BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260421T090515Z
LOCATION:Bldg. 6 - Room 102
DTSTART;TZID=Europe/Stockholm:20260701T153000
DTEND;TZID=Europe/Stockholm:20260701T160000
UID:submissions.pasc-conference.org_PASC26_sess161_msa113@linklings.com
SUMMARY:Growing up without growing old: the Quantum ESPRESSO GPU experienc
 e
DESCRIPTION:Laura Bellentani (CINECA)\n\nQuantum ESPRESSO (QE) is an open-
 source suite of first-principles electronic-structure and materials modeli
 ng codes based on DFT, plane waves, and pseudopotentials, with a large int
 ernational user base. Its development history, spanning more than tw
 o decades, is driven by two complementary goals: long-term sustainability 
 and the ability to leverage modern HPC architectures to foster production 
 at scale.\nThese goals have been addressed through modularization, to sepa
 rate method-specific routines from hardware-oriented modules and libraries
 ; incremental porting to GPUs, which prioritizes portable directive-base
 d programming models over CUDA-centric solutions; and source- or runtime-
 level optimizations to improve resource usage across QE use cases. Example
 s presented in this talk include algorithms for small metallic systems wit
 h many k-points, exploiting GPU streams to increase kernel concurrency, an
 d the use of NVIDIA Multi-Process Service to reduce node requirements in p
 honon calculations with independent images. We will discuss how customizin
 g band batching in GPU drivers for Fourier-Transform distribution or band-
 parallel workloads can improve computation-communication overlap, together
  with the use of GPU-aware communication libraries such as HPCX-MPI and NC
 CL. Ongoing work on mixed- and reduced-precision GEMM operations using NVI
 DIA emulation tools aims to improve performance per watt while keeping pac
 e with evolving hardware trends.\n\nDomain: Chemistry and Materials, Compu
 tational Methods and Applied Mathematics\n\nSession Chairs: Iurii Timrov (
 Paul Scherrer Institut); Laura Grigori (PSI, EPFL); and Michael Herbst (EP
 FL)\n\n
END:VEVENT
END:VCALENDAR
