Conveners
Lightning Talks: Block 1
- Anna Reinicke-Vogt (Universitätsklinikum Hamburg-Eppendorf)
Lightning Talks: Block 2
- Anna Reinicke-Vogt (Universitätsklinikum Hamburg-Eppendorf)
This poster presents the final iteration of the CaloClouds series. Simulating photon showers at the granularities expected at a future Higgs factory is computationally challenging. A viable simulation must capture the fine details exposed by such a detector, yet be substantially faster than Monte Carlo methods. The CaloClouds model utilises point cloud diffusion and normalising flows to replicate...
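As a rough orientation for the point-cloud diffusion component, a minimal DDPM-style reverse sampling step is sketched below; this is a generic illustration under standard diffusion assumptions, not the CaloClouds architecture, and the point features and noise schedule are placeholders.

    # Illustrative DDPM-style reverse-diffusion step on a point cloud
    # (generic sketch; not the actual CaloClouds model or schedule).
    import numpy as np

    def ddpm_reverse_step(x_t, eps_pred, t, betas, rng):
        """One ancestral sampling step x_t -> x_{t-1} for an (N, 4) point cloud."""
        alphas = 1.0 - betas
        alpha_bar = np.cumprod(alphas)
        coef = betas[t] / np.sqrt(1.0 - alpha_bar[t])
        mean = (x_t - coef * eps_pred) / np.sqrt(alphas[t])
        if t == 0:
            return mean
        noise = rng.standard_normal(x_t.shape)
        return mean + np.sqrt(betas[t]) * noise

    rng = np.random.default_rng(0)
    betas = np.linspace(1e-4, 0.02, 1000)     # standard linear noise schedule
    x = rng.standard_normal((500, 4))         # 500 points: (x, y, z, energy), placeholder
    eps = rng.standard_normal(x.shape)        # stand-in for a trained noise-prediction net
    x_prev = ddpm_reverse_step(x, eps, t=999, betas=betas, rng=rng)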
The "Digital Edition Levezow Album" project is an interdisciplinary collaboration between the Hub of Computing and Data Science (HCDS), the Department of Art History at the University of Hamburg, and the State and University Library Hamburg. The project aims to digitally process and interactively visualize a previously unexplored sketchbook from the late 17th century, containing drawings on...
The ELECTRODE package is a module in the official release of the molecular dynamics code LAMMPS and implements the constant potential method and related methods. Utilizing the massively parallel architecture of LAMMPS with neighbor lists and fast Fourier transforms, the package efficiently calculates interactions between atoms and minimizes their energy as a function of atom...
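A hedged sketch of how the package might be driven from the LAMMPS Python interface follows; the fix electrode/conp and pppm/electrode commands come from the package documentation, while the data file, group definitions, potentials, and eta value are placeholders for illustration.

    # Sketch of a constant-potential run via the LAMMPS Python module.
    # capacitor.data and the group/parameter choices are hypothetical.
    from lammps import lammps

    lmp = lammps()
    lmp.commands_string("""
    units           real
    atom_style      full
    pair_style      lj/cut/coul/long 12.0
    read_data       capacitor.data        # hypothetical system with two electrodes
    kspace_style    pppm/electrode 1e-7   # kspace solver aware of electrode charges

    group           bot molecule 1        # bottom electrode atoms
    group           top molecule 2        # top electrode atoms

    # Constant potential: hold bot at -1 V and top at +1 V (eta in 1/Angstrom);
    # further boundary/kspace setup is elided here.
    fix             conp bot electrode/conp -1.0 1.979 couple top 1.0

    timestep        1.0
    run             1000
    """)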
We introduce EncouRAGe, a comprehensive Python-based framework designed to streamline the development and evaluation of Retrieval-Augmented Generation (RAG) systems using local Large Language Models (LLMs). EncouRAGe integrates leading tools such as vLLM for efficient inference, Jinja2 for dynamic prompt templating, and MLflow for observability and performance tracking. It supports both...
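To make the moving parts concrete, here is a minimal RAG loop in the spirit of what EncouRAGe streamlines, not its actual API: a stub retriever, a Jinja2 prompt template, vLLM generation, and MLflow logging. The retriever and model name are assumptions for illustration.

    # Minimal RAG sketch with the tools the abstract names (illustrative only).
    import mlflow
    from jinja2 import Template
    from vllm import LLM, SamplingParams

    PROMPT = Template(
        "Answer using only the context below.\n"
        "Context:\n{% for doc in docs %}- {{ doc }}\n{% endfor %}"
        "Question: {{ question }}\nAnswer:"
    )

    def retrieve(question, k=3):
        # Stand-in for a real retriever (e.g. a vector-store lookup).
        return ["Document snippet 1", "Document snippet 2", "Document snippet 3"][:k]

    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")   # any local model
    params = SamplingParams(temperature=0.0, max_tokens=256)

    with mlflow.start_run():
        question = "What does the poster describe?"
        prompt = PROMPT.render(docs=retrieve(question), question=question)
        answer = llm.generate([prompt], params)[0].outputs[0].text
        mlflow.log_param("model", "meta-llama/Llama-3.1-8B-Instruct")
        mlflow.log_text(answer, "answer.txt")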
We present a privacy-preserving research environment integrating offline Large Language Models (LLMs), AI agents, and scalable infrastructure. By deploying private LLMs via Ollama and containerized workflows on Kubernetes, researchers can automate tasks like literature review, code generation, and secure data processing without compromising sensitive information. AI agents—coordinated through...
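A minimal sketch of the offline-LLM building block, assuming an Ollama server on its default port; the model name is an example.

    # Query a locally hosted model through Ollama's REST API; no data
    # leaves the machine, which is the point of the private deployment.
    import requests

    def ask_local_llm(prompt, model="llama3.1", host="http://localhost:11434"):
        resp = requests.post(
            f"{host}/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=300,
        )
        resp.raise_for_status()
        return resp.json()["response"]

    print(ask_local_llm("Summarise this abstract in two sentences."))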
The Result Assessment Tool (RAT) is a Python-based software toolkit that addresses the critical research challenge of accessing and analyzing data from various search systems. It uses several computational methods, including Selenium for robust web scraping, Flask for the web interface, PostgreSQL for data management, and automated classifiers for content analysis. With RAT, researchers can...
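By way of illustration (not RAT's actual code), a headless Selenium scrape of a generic results page might look like the following; the URL and CSS selector are hypothetical.

    # Headless scraping sketch in the spirit of RAT's Selenium layer.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")    # run without a visible browser
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://example.org/search?q=climate+change")
        # Collect result titles and URLs from a hypothetical results list.
        results = [
            {"title": a.text, "url": a.get_attribute("href")}
            for a in driver.find_elements(By.CSS_SELECTOR, "a.result-link")
        ]
        print(results)
    finally:
        driver.quit()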
Resources for research on sign languages are rare and can often be difficult to locate, and few centralised sources of information exist. The Sign Language Dataset Compendium helps by providing an overview of existing lexical resources and linguistic corpora, as well as a summary of popular data-collection tasks shared among corpora. To date it covers resources for 82 different sign languages. The...
Benchmarking applications on high-performance computing (HPC) systems is essential for optimising runtime, reducing energy consumption, and ensuring efficient hardware utilisation. However, accessing and interpreting performance metrics can be challenging and error-prone. To address this, we present xbat (extended benchmarking automation tool), developed by MEGWARE Computer Vertrieb und...
We present MENTO, a data processing toolkit that remotely runs external analysis software on-demand using the DESY high-performance computing (HPC) cluster.
MENTO requires no input from users other than pointing it to the desired analysis software; the entire processing pipeline is then managed automatically, including data input, access to the HPC cluster, job submissions to a...
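A hedged sketch of the pattern such a pipeline automates, assuming a SLURM-managed cluster; the paths, partition name, and helper function are hypothetical, not MENTO's interface.

    # Wrap an external analysis binary in a batch script and submit it.
    import pathlib
    import subprocess
    import textwrap

    def submit_analysis(software, input_file, workdir="/scratch/mento"):
        workdir = pathlib.Path(workdir)
        workdir.mkdir(parents=True, exist_ok=True)
        script = workdir / "job.sh"
        script.write_text(textwrap.dedent(f"""\
            #!/bin/bash
            #SBATCH --partition=allcpu
            #SBATCH --time=02:00:00
            {software} {input_file}
        """))
        # Submit and return the job id reported by sbatch.
        out = subprocess.run(["sbatch", str(script)], capture_output=True,
                             text=True, check=True)
        return out.stdout.strip().split()[-1]

    job_id = submit_analysis("/software/my_analysis", "/data/run_001.h5")
    print(f"submitted job {job_id}")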
Computational pathology has made tremendous progress on dedicated datasets in the past years. However, such algorithms are still not used routinely for diagnostics in the clinic. A large gap remains between research and clinical practice, and the factors that contribute to this, such as the focus on reproducing subjective scores and the large variance in performance depending on the...
Continuous Integration and Continuous Deployment (CI/CD) is a modern software engineering best practice that enables efficient large-scale software development and use. A variety of popular CI/CD tools help in adopting these practices. In this poster we focus on the kinds of software, their runtime environments, and the packaging and deployment tools and techniques used at DESY that can easily...
The presentation will introduce a GraphRAG-based approach to research data retrieval from research data catalogues, using the Text+ Registry as an example.
Retrieval-Augmented Generation (RAG) systems have become a cornerstone for LLM-based question-answering tasks involving individual (potentially private or sensitive) unstructured data. However, traditional RAG pipelines often lack an...
Schematron is an ISO-standardized validation language for structured data (ISO/IEC 19757-3). It lets you evaluate assertion tests against selected parts of a document. It was first published as an international standard in 2006 and has been updated continuously. The standardization process for the 4th edition is in its final stages and is expected to conclude in September this year.
Schematron's...
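As a small runnable taste of the language itself, the following uses lxml's ISO Schematron support (an XSLT-based implementation, independent of the forthcoming 4th edition) to assert that every measurement element declares a unit:

    # Validate documents against an inline Schematron schema with lxml.
    from lxml import etree, isoschematron

    schema = etree.XML(b"""\
    <schema xmlns="http://purl.oclc.org/dsdl/schematron">
      <pattern>
        <rule context="measurement">
          <assert test="@unit">every measurement must declare a unit</assert>
        </rule>
      </pattern>
    </schema>""")

    doc_ok  = etree.XML(b'<data><measurement unit="K">300</measurement></data>')
    doc_bad = etree.XML(b'<data><measurement>300</measurement></data>')

    schematron = isoschematron.Schematron(schema)
    print(schematron.validate(doc_ok))    # True
    print(schematron.validate(doc_bad))   # False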
The optical flow method is one of the emerging approaches for Digital Volume Correlation (DVC) to analyze volumetric deformation during in situ experiments in materials science research. However, deep optical flow neural networks for DVC are limited by their memory requirements, especially for high-resolution volumetric data from Synchrotron Radiation Computed Tomography (SRCT) at the scale of...
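A common mitigation for the memory limit is patch-wise inference. Below is a minimal NumPy sketch of tiling a volume pair and stitching per-patch flow fields; the network call is a placeholder, and the volume/patch sizes are arbitrary examples.

    # Tile a 3D volume pair, run a flow model per patch, stitch the result.
    import numpy as np

    def tile_volume(vol, patch=128, overlap=16):
        """Yield (slices, subvolume) pairs covering a 3D volume with overlap."""
        step = patch - overlap
        for z in range(0, vol.shape[0], step):
            for y in range(0, vol.shape[1], step):
                for x in range(0, vol.shape[2], step):
                    sl = (slice(z, z + patch), slice(y, y + patch),
                          slice(x, x + patch))
                    yield sl, vol[sl]

    def estimate_flow(ref_patch, def_patch):
        # Placeholder for a deep optical-flow network; returns a zero field.
        return np.zeros((3,) + ref_patch.shape, dtype=np.float32)

    ref = np.random.rand(256, 256, 256).astype(np.float32)   # reference scan
    mov = np.random.rand(256, 256, 256).astype(np.float32)   # deformed scan
    flow = np.zeros((3,) + ref.shape, dtype=np.float32)
    for sl, patch in tile_volume(ref):
        flow[(slice(None),) + sl] = estimate_flow(patch, mov[sl])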
X-ray near-field holography is a full-field, phase-sensitive microscopy method. It allows specimens to be imaged with a single exposure over a scalable field of view. The measurements, so-called holograms, require reconstruction to obtain the actual image of the specimen. Reconstruction is the bottleneck of the method: it can be time-consuming, and algorithm parameters need to be tuned...
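For orientation, a single-step reconstruction can be sketched with a Fresnel transfer-function propagator in NumPy; the wavelength, propagation distance, and pixel size below are placeholder values, and real reconstruction pipelines are considerably more involved.

    # Back-propagate a near-field hologram with a paraxial Fresnel kernel.
    import numpy as np

    def fresnel_propagate(field, wavelength, distance, pixel_size):
        """Propagate a complex 2D field by `distance` (metres)."""
        ny, nx = field.shape
        fx = np.fft.fftfreq(nx, d=pixel_size)
        fy = np.fft.fftfreq(ny, d=pixel_size)
        FX, FY = np.meshgrid(fx, fy)
        kernel = np.exp(-1j * np.pi * wavelength * distance * (FX**2 + FY**2))
        return np.fft.ifft2(np.fft.fft2(field) * kernel)

    hologram = np.random.rand(1024, 1024)        # stand-in for a measurement
    # Negative distance: propagate the measured amplitude back to the sample.
    rec = fresnel_propagate(np.sqrt(hologram), wavelength=1e-10,
                            distance=-5e-3, pixel_size=5e-8)
    phase = np.angle(rec)                        # phase image of the specimen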
For many people, the media are the main source of information about climate change. An increasing number of people have turned to online services from both traditional and new media providers to stay informed. As a result, studying online reporting is essential to understand how public debates about climate change are shaped. To support this, the University of Hamburg developed the Online...
The Data Hub is an open-source software framework created to address the needs of collaborative research using diverse data across disciplines. It is developed in Python, on top of the Django web framework and a PostGIS/PostgreSQL database, following computer science best practices as well as the FAIR4RS principles.
The framework’s core function allows reproducible...
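As a hedged illustration of this stack (not the Data Hub's actual schema), a georeferenced Django model backed by PostGIS might look like the following; the model names and fields are invented for the example.

    # Illustrative GeoDjango models; require a configured Django project.
    from django.contrib.gis.db import models

    class Dataset(models.Model):
        title = models.CharField(max_length=200)
        doi = models.CharField(max_length=100, blank=True)

    class Sample(models.Model):
        """A georeferenced research sample with provenance for reproducibility."""
        name = models.CharField(max_length=200)
        collected_on = models.DateField()
        location = models.PointField(srid=4326)   # stored natively in PostGIS
        source_dataset = models.ForeignKey(
            Dataset, on_delete=models.PROTECT, related_name="samples"
        )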