BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260625T133337Z
LOCATION:Bldg. 6 - 004
DTSTART;TZID=Europe/Stockholm:20260629T133000
DTEND;TZID=Europe/Stockholm:20260629T140000
UID:submissions.pasc-conference.org_PASC26_sess101_msa200@linklings.com
SUMMARY:Improving Performance of Large-Scale SYCL Applications by Leveragi
 ng AdaptiveCpp's SSCP Compiler at the Example of GROMACS
DESCRIPTION:Bálint Soproni and Aksel Alpay (Heidelberg University)\n\nAdap
 tiveCpp, a vendor-independent, production-ready implementation of the SYC
 L standard, enables applications to target a wide range of hardware archit
 ectures, including the most recent accelerators from AMD, NVIDIA and
  Intel. The implementation provides multiple compilation paradigms, in
  particular SSCP (single-source, single compiler pass) and SMCP
  (single-source, multiple
  compiler passes). Even though the default SSCP JIT compiler of AdaptiveCp
 p delivers substantial speedups, systematic performance evaluations have f
 ocused mostly on small to medium-sized applications.\n\nIn this talk, we p
 resent the results of introducing the AdaptiveCpp JIT compiler to a highly
  optimized production code base: GROMACS, a widely used molecular dynami
 cs software package that currently relies on SYCL and the AdaptiveCpp SMCP
  compiler to target AMD GPUs. We discuss the necessary code changes for le
 veraging the SSCP compiler and evaluate the ported application across a va
 riety of input problems covering common simulation scenarios on MI210, MI3
 00A, and MI300X AMD GPUs. We find that the SSCP JIT compiler outperforms t
 he currently used SMCP AdaptiveCpp compiler in high-atom-count workload co
 nfigurations by up to 10-25% and increases the peak simulation throughput 
 of each tested GPU by up to 10%, measured in terms of simulated atoms per 
 second.\n\nDomain: Engineering, Computational Methods and Applied Mathemat
 ics\n\nSession Chair: Andrey Alekseenko (KTH Royal Institute of Technology
 , PDC Center for High Performance Computing)\n\n
END:VEVENT
END:VCALENDAR
