BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260522T162631Z
LOCATION:Bldg. 6 - Room 104
DTSTART;TZID=Europe/Stockholm:20260701T123000
DTEND;TZID=Europe/Stockholm:20260701T130000
UID:submissions.pasc-conference.org_PASC26_sess177_pap112@linklings.com
SUMMARY:Sampling Parallelism for Fast and Efficient Bayesian Learning
DESCRIPTION:Asena Karolin Özdemir, Lars Helge Heyen, Arvid Weyrauch, and A
 chim Streit (Karlsruhe Institute of Technology); Markus Götz (Karlsruhe In
 stitute of Technology, Helmholtz AI); and Charlotte Debus (Karlsruhe Insti
 tute of Technology)\n\nMachine learning models, and deep neural networks i
 n particular, are increasingly deployed in risk-sensitive domains such as 
 healthcare, environmental forecasting, and finance, where reliable quantif
 ication of predictive uncertainty is essential. However, many uncertainty 
 quantification (UQ) methods remain difficult to apply due to their substan
 tial computational cost. Sampling-based Bayesian learning approaches, such
  as Bayesian neural networks (BNNs), are particularly expensive since draw
 ing and evaluating multiple parameter samples rapidly exhausts memory and 
 compute resources. These constraints have limited the accessibility and ex
 ploration of Bayesian techniques thus far.\n\nTo address these challenges,
  we introduce sampling parallelism, a simple yet powerful parallelization 
 strategy that targets the primary bottleneck of sampling-based Bayesian le
 arning: the samples themselves. By distributing sample evaluations across 
 multiple GPUs, our method reduces memory pressure and training time withou
 t requiring architectural changes or extensive hyperparameter tuning. We d
 etail the methodology and evaluate its performance on a few example tasks 
 and architectures, comparing against distributed data parallelism (DDP) as
  a baseline. We further demonstrate that sampling parallelism is complemen
 tary to existing strategies by implementing a hybrid approach that combine
 s sample and data parallelism.\n\nOur experiments show near-perfect weak s
 caling, confirming that sample evaluations parallelize cleanly. Although D
 DP achieves better raw speedups under strong scaling, sampling parallelism
  has a notable advantage: by applying independent stochastic augmentations
  to the same batch on each GPU, it increases augmentation diversity and th
 us reduces the number of epochs required for convergence.\n\nSession Chair
 : Aldas Lenkšas (Politecnico di Milano)
END:VEVENT
END:VCALENDAR
