Aachen 2019 – Scientific Programme
T 59.9: Talk
Wednesday, 27 March 2019, 18:00–18:15, S11
Boosting data-intensive HEP analyses by coordinating distributed caches — •Christoph Heidecker, Martin Sauter, Matthias J. Schnepf, Max Fischer, Manuel Giffels, Eileen Kühn, R. Florian von Cube, and Günter Quast — Karlsruhe Institute of Technology
The ever-growing amounts of data processed by HEP user analyses pose challenges for the network and storage infrastructure. These challenges can be tackled by introducing local caches for recurrently accessed data.
Efficient utilization of conventional caches placed within a distributed infrastructure requires both coordinated data placement and the routing of workflows to the host best suited in terms of data locality. The coordinated and distributed caching approach thereby reduces the amount of redundantly stored data and improves the overall processing efficiency.
To this end, KIT developed the NaviX coordination service, which connects an XRootD caching proxy infrastructure with an HTCondor batch system. The performance improvements of our concept are currently being evaluated on opportunistic compute resources as well as on the Throughput-Optimized Analysis-System (TOPAS), a cluster dedicated to data-intensive HEP user analyses that is currently being commissioned at KIT.
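As an illustration of the data-locality idea only, the following minimal Python sketch routes a job to the host that already caches the largest fraction of its input files and then records the new files on that host alone, so that no second copy is created elsewhere. It is not the NaviX implementation and uses neither the XRootD nor the HTCondor API; `CacheHost`, `locality_score`, `pick_host` and `place_job` are hypothetical names introduced for this example, and the scoring rule (fraction of cached inputs) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class CacheHost:
    """A worker host with a local cache (hypothetical model)."""
    name: str
    cached_files: set[str] = field(default_factory=set)


def locality_score(host: CacheHost, input_files: list[str]) -> float:
    """Fraction of the job's input files already cached on the host."""
    if not input_files:
        return 0.0
    hits = sum(1 for f in input_files if f in host.cached_files)
    return hits / len(input_files)


def pick_host(hosts: list[CacheHost], input_files: list[str]) -> CacheHost:
    """Return the host with the highest locality score (first wins on ties)."""
    if not hosts:
        raise ValueError("no hosts available")
    return max(hosts, key=lambda h: locality_score(h, input_files))


def place_job(hosts: list[CacheHost], input_files: list[str]) -> CacheHost:
    """Route the job, then mark its inputs as cached on the chosen host only."""
    host = pick_host(hosts, input_files)
    host.cached_files.update(input_files)
    return host
```

Because later jobs reading the same files score highest on the host that already holds them, repeated accesses tend to be served from one cache rather than being replicated across several.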
In this contribution, we give an overview of the coordinated and distributed caching concept, present performance benchmark results, and report on the experience gained.