BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260605T154543Z
LOCATION:Bldg. 6 - Room 004
DTSTART;TZID=Europe/Stockholm:20260629T160000
DTEND;TZID=Europe/Stockholm:20260629T180000
UID:submissions.pasc-conference.org_PASC26_sess159@linklings.com
SUMMARY:MS2C - Serving Inference: Leveraging HPCs in the Age of Generative
  AI
DESCRIPTION:Organizer(s): Tobias Hodel (University of Bern, Switzerland), 
 and Sukanya Nath (University of Bern)\n\nGenerative AI is reshaping how sc
 ientific outputs are produced, yet most widely used tools are operated by 
 commercial providers whose practices around processing, generating, and st
 oring user inputs are often unclear. This lack of transparency raises seri
 ous concerns for research organizations handling sensitive or regulated da
 ta and complicates the responsible use of even non-regulated scientific co
 ntent. At the same time, many commercially served models are closed-source
  and subject to frequent, opaque updates, limiting reproducibility and und
 ermining alignment with FAIR principles. As open-weight and open-source mo
 dels proliferate, HPC infrastructures are emerging as a promising alternat
 ive for hosting and providing controlled access to AI within research envi
 ronments. However, most HPC systems were not designed for continuous, GPU-
 based inference services, creating technical and operational challenges sp
 anning deployment, scheduling, reliability, user access, and governance. C
 onsequently, institutions are developing ad hoc solutions with limited opp
 ortunities to exchange patterns and lessons learned. This minisymposium co
 nvenes a panel of institutions actively building such capabilities to comp
 are approaches and discuss best practices, minimal viable solutions, and i
 deal configurations for serving generative AI on HPC. To broaden participa
 tion, we will use a short questionnaire to structure contributions across 
 four dimensions: technical setup, usage policies, documentation practices,
  and monitoring/oversight mechanisms.\n\nA Human-in-the-Loop Scoping Revie
 w Screening Pipeline Using Self-Hosted Large Language Models: An Example i
 n Sport Science\n\nScoping reviews are labor-intensive efforts to screen a
 nd extract paper data. Large Language Models (LLMs) seem to offer efficien
 cy gains. But to address data privacy and reproducibility issues with clou
 d-based LLMs, and adhering to JBI/PRISMA-ScR guidelines, we present a huma
 n-in-the-loop review pi...\n\n\nKai Michael Gensitz (University of Bern); 
 Shawan Mohammed (RWTH Aachen University); Daniela E. Ströckl (Carinthia Un
 iversity of Applied Sciences); Marc Augustin (Protestant University of App
 lied Sciences, Bochum); Claudio R. Nigg (University of Bern); and Ciara Mc
 Cormack (National University of Ireland, Maynooth)\n---------------------\
 nPanel Discussion on Serving Generative AI on HPC\n\nThis session will be 
 a panel discussion about ad hoc solutions for serving generative AI models
  over HPC infrastructure including best practices, minimal viable solution
 s, and ideal configurations in research contexts.\n\n\nSukanya Nath (Unive
 rsity of Bern, Data Science Lab)\n---------------------\nUNIBE's GPUStack 
 as an Example: Reporting on User Needs\n\nThe University of Bern has devel
 oped a proof-of-concept platform using GPUStack technology, providing a se
 cure, institutionally controlled environment for deploying generative AI m
 odels. Designed for researchers working outside commercial cloud infrastru
 ctures, it enables direct management of comput...\n\n\nTobias Hodel (Unive
 rsity of Bern)\n---------------------\nLLM Infrastructure on HPC: Workflow
 s, Constraints, and Solutions\n\nThe integration of Large Language Models 
 (LLMs) into academic research is severely constrained by data privacy regu
 lations. Researchers handling sensitive, GDPR-protected data cannot utiliz
 e commercial cloud APIs, necessitating the local deployment of open-weight
  LLMs on High-Performance Computing (...\n\n\nAhmad Alhineidi (University 
 of Bern, Data Science Lab)\n\nDomain: Chemistry and Materials, Climate, We
 ather, and Earth Sciences, Applied Social Sciences and Humanities, Enginee
 ring, Life Sciences, Physics, Computational Methods and Applied Mathematic
 s\n\nSession Chairs: Tobias Hodel (University of Bern, Switzerland) and Su
 kanya Nath (University of Bern, Data Science Lab)
END:VEVENT
END:VCALENDAR
