Würzburg 2018 – Scientific Programme
T 16.4: Talk
Monday, 19 March 2018, 16:45–17:00, Z6 - SR 2.006
Advantages of caching concepts for HEP analysis work-flows — •Christoph Heidecker, Matthias Schnepf, Max Fischer, Manuel Giffels, and Günter Quast — Karlsruher Institut für Technologie, Karlsruhe, Germany
Current experiments in High Energy Physics (HEP) deliver tremendous amounts of data that await further processing. This poses enormous challenges for storage systems, and also for distributing data to end users for further analysis. The situation is further compounded by the fact that HEP tends to use opportunistic resources as an extension to common HEP computing facilities. To use these resources efficiently, adequate data throughput for I/O-intensive analyses is essential. Data locality concepts, which direct a job to a processing unit that holds the necessary data in its local cache, promise to overcome these throughput limitations.
At KIT, two different caching concepts have been studied to enable short turnaround cycles for I/O-intensive analyses. Both concepts have been transparently integrated into the batch system HTCondor. The first approach uses coordinated caches on SSDs in the worker nodes, together with an HTCondor batch system that takes data locality into account when scheduling jobs. The second approach uses Ceph as a distributed file system acting as a system-wide cache. In combination with XRootD caching and data locality plug-ins, this approach is well suited to tackling bandwidth limitations on opportunistic resources such as HPC centers that offer parallel file systems. In this talk, both caching concepts and the current development status are presented.
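
The idea of locality-aware scheduling can be illustrated with a minimal Python sketch. It is not the HTCondor integration described in the talk and uses no HTCondor or XRootD API; the names `WorkerNode`, `cached_fraction` and `pick_node` are hypothetical, and the policy shown (send the job to the free node that already caches the largest share of its input files) is an assumption chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerNode:
    """Hypothetical worker node with a local cache (e.g. on an SSD)."""
    name: str
    cached_files: set = field(default_factory=set)
    free_slots: int = 1


def cached_fraction(node, input_files):
    """Fraction of the job's distinct input files already in the node's cache."""
    needed = set(input_files)
    if not needed:
        return 0.0
    return len(node.cached_files & needed) / len(needed)


def pick_node(nodes, input_files):
    """Return the free node with the highest cached fraction, or None.

    Assumed policy for illustration only: ties go to the first node listed.
    """
    candidates = [n for n in nodes if n.free_slots > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda n: cached_fraction(n, input_files))


if __name__ == "__main__":
    nodes = [
        WorkerNode("node-a", {"run1.root"}),
        WorkerNode("node-b", {"run1.root", "run2.root"}),
    ]
    job_inputs = ["run1.root", "run2.root", "run3.root"]
    best = pick_node(nodes, job_inputs)
    print(best.name, cached_fraction(best, job_inputs))
```

A real scheduler would additionally have to weigh cache hits against queue waiting time and decide when to populate caches, which this sketch leaves out.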