MS2C – Serving Inference: Leveraging HPCs in the Age of Generative AI
Event Type
Minisymposium
Chemistry and Materials
Climate, Weather, and Earth Sciences
Applied Social Sciences and Humanities
Engineering
Life Sciences
Physics
Computational Methods and Applied Mathematics
Time
Monday, June 29, 16:00 – 18:00 CEST
Location
Bldg. 6 – Room 004
Description
Generative AI is reshaping how scientific outputs are produced, yet most widely used tools are operated by commercial providers whose practices for processing and storing user inputs and generated content are often unclear. This lack of transparency raises serious concerns for research organizations handling sensitive or regulated data and complicates the responsible use of even non-regulated scientific content. At the same time, many commercially served models are closed-source and subject to frequent, opaque updates, which limits reproducibility and undermines alignment with the FAIR principles. As open-weight and open-source models proliferate, HPC infrastructures are emerging as a promising alternative for hosting AI and providing controlled access to it within research environments. However, most HPC systems were not designed for continuous, GPU-based inference services, which creates technical and operational challenges spanning deployment, scheduling, reliability, user access, and governance. Consequently, institutions are developing ad hoc solutions with limited opportunities to exchange patterns and lessons learned. This minisymposium convenes a panel of institutions actively building such capabilities to compare approaches and discuss best practices, minimum viable solutions, and ideal configurations for serving generative AI on HPC. To broaden participation, we will use a short questionnaire to structure contributions across four dimensions: technical setup, usage policies, documentation practices, and monitoring/oversight mechanisms.