BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260522T162632Z
LOCATION:Bldg. 6 - Room 103
DTSTART;TZID=Europe/Stockholm:20260701T123000
DTEND;TZID=Europe/Stockholm:20260701T130000
UID:submissions.pasc-conference.org_PASC26_sess176_pap106@linklings.com
SUMMARY:Accelerating Electrostatically-Embedded Fragmentation Methods usin
 g Graphics Processing Units
DESCRIPTION:Fazeleh Kazemian (The Monash University, The Australian Nation
 al University); Jorge Galvez-Vallejo (The Australian National University);
  and Giuseppe Barca (The Monash University, The Australian National Univer
 sity)\n\nPredicting the physicochemical properties of large molecular syst
 ems requires quantum chemistry methods that are both accurate and computat
 ionally scalable. Electrostatically embedded fragmentation approaches, suc
 h as the Fragment Molecular Orbital (FMO) method, have been highly suc
 cessful in extending Hartree–Fock (HF) and post-HF theories to syste
 ms with thousands of atoms, but their iterative electrostatic potenti
 al (ESP) cycles and c
 ommunication patterns pose challenges on modern heterogeneous architecture
 s. In this work, we develop distributed-memory, multi-GPU algorithms for
  electrostatically embedded fragmentation, targeting both FMO and the rece
 ntly proposed Coulomb-Perturbed Fragmentation (CPF) method. CPF removes th
 e expensive self-consistent ESP iterations at the monomer level by converg
 ing monomers once in vacuo and then using these fixed densities to constru
 ct the electrostatic embedding for all fragments. Our implementations in t
 he Extreme-Scale Electronic Structure System (EXESS) feature GPU-acc
 elerated
  HF and RI-MP2 kernels, specialised ERI kernels for ESP terms, a multi-lay
 er dynamic load balancing scheme, and MPI Remote Memory Access (RMA) to ef
 ficiently distribute monomer densities across nodes. Accuracy is assessed 
 for water hexamers, neutral molecular crystals, and ionic liquids at the H
 F and RI-MP2 levels, where CPF3 and FMO3 reproduce full-system energies wi
 th mean absolute deviations of only a few kJ/mol and correctly recove
 r subtle energetic orderings. Performance benchmarks on Gadi and Perlmutte
 r demonstrate speedups of up to ~6× over the parallel CPU FMO implemen
 tation in GAMESS on a single node, and strong scaling efficiencies
  approaching 90% on up to 128 GPU nodes. Overall, CPF emerges as a highly 
 accurate and markedly more scalable alternative to FMO for large-scale ele
 ctrostatically embedded quantum chemistry on GPU-accelerated supercomputer
 s.\n\nSession Chair: Stanislaw Ostyk-Narbutt (Politecnico di Milano)\n\n
END:VEVENT
END:VCALENDAR
