Bonn 2010 – scientific programme
T 26.9: Talk
Tuesday, 16 March 2010, 18:45–19:00, HG XIII
Efficiently utilizing GPUs for Lattice QCD
•Matthias Bach (1,2), Olaf Kaczmarek (3), Wolfgang Söldner (4), Christian Schmidt (3), and Piotr Bialas (5)
(1) Frankfurt Institute for Advanced Studies; (2) Institut für Informatik, Frankfurt; (3) Universität Bielefeld; (4) GSI; (5) Jagiellonian University, Krakow
Today, traditional computer architectures can no longer achieve speedups through higher clock speeds. Many-core architectures, however, offer a way to further increase computing power by increasing the number of floating-point units per chip, which today exceeds 100 on a single piece of silicon. GPUs are a particularly cost-efficient implementation of such many-core architectures, providing 1 to 2 TFlops on a single chip at a consumer-level price.
The programming model of these many-core architectures differs significantly from that of traditional architectures, and especially from that of custom architectures like the apeNEXT. The large gap between single- and double-precision performance, together with a different ratio of computational power to memory and communication bandwidth, requires a major rethinking of existing codes.
We have successfully implemented an NVIDIA CUDA-based CG solver for calculations with a staggered Dirac operator and are actively working on efficient multi-node calculations and efficient CPU-GPU cooperation.
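For readers unfamiliar with the method, the following is a minimal CPU-side sketch of a conjugate gradient (CG) iteration in plain Python. It is not the authors' CUDA implementation. The operator is an assumed toy stand-in: a free, anti-Hermitian hopping term D on a periodic 1D lattice, with CG applied to the symmetric positive definite combination m^2 - D^2. The function names (`apply_d`, `apply_normal_op`, `cg_solve`), the lattice size, and the mass value are all hypothetical choices made for illustration.

```python
import math


def apply_d(v):
    """Toy anti-Hermitian hopping term on a periodic 1D lattice (assumption):
    (D v)_i = (v_{i+1} - v_{i-1}) / 2. No gauge links, no staggered phases."""
    n = len(v)
    return [0.5 * (v[(i + 1) % n] - v[(i - 1) % n]) for i in range(n)]


def apply_normal_op(v, mass):
    """Apply A = m^2 - D^2, which is symmetric positive definite for m != 0."""
    ddv = apply_d(apply_d(v))
    return [mass * mass * vi - di for vi, di in zip(v, ddv)]


def dot(a, b):
    """Euclidean inner product of two real vectors."""
    return sum(x * y for x, y in zip(a, b))


def cg_solve(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b by conjugate gradient, given only the action of A.
    Returns the solution and the number of iterations performed."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the start vector x = 0
    p = list(r)          # first search direction
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    for k in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        if math.sqrt(rr_new) <= tol * b_norm:
            return x, k
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


if __name__ == "__main__":
    n, mass = 16, 0.5                      # hypothetical lattice size and mass
    source = [0.0] * n
    source[0] = 1.0                        # point source
    solution, iterations = cg_solve(lambda v: apply_normal_op(v, mass), source)
    print("iterations:", iterations)
```

Each iteration needs one application of the operator, two inner products, and three vector updates. These operations are light on arithmetic relative to the data they move, which is one reason the ratio of computational power to memory bandwidth matters for such solvers.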