Linear solvers are a central component of many applications in physics and engineering. In this work we present a software package that solves linear systems with multiple right-hand sides simultaneously, exploiting the high compute performance and memory bandwidth of graphics processing units (GPUs). Because the solver is based on the transpose-free quasi-minimal residual (TFQMR) method, iterative solving does not require an implementation of the adjoint operator. The C++/CUDA package can be used in two ways. The precompiled library solves linear systems with block-sparse complex matrices in single and double precision and provides interfaces to several programming languages, in particular C, Fortran, Python and Julia. In addition, the core algorithm is available as a C++ header-only library for custom implementations of arbitrary linear operators. We demonstrate a matrix-free custom operator for a finite-difference stencil application that solves the three-dimensional Helmholtz equation, and we compare the performance of the matrix-free approach with that of the block-sparse matrix version, both on NVIDIA hardware.
MSC Classification: 15-04
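
To illustrate what a matrix-free custom operator for multiple right-hand sides can look like, the following is a minimal CPU-side sketch in plain C++. It is not the interface of the package described here: the struct name `HelmholtzStencil`, its members, the `apply` signature, the storage layout of the right-hand sides, the sign convention (discrete Laplacian plus k^2), the 7-point stencil and the homogeneous Dirichlet boundaries are all assumptions made for this example.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical matrix-free operator (illustration only, not the package API):
// y = (Laplacian_h + k^2) x on an n^3 grid, 7-point finite-difference stencil,
// homogeneous Dirichlet boundaries (values outside the grid count as zero).
struct HelmholtzStencil {
    std::size_t n;  // grid points per dimension
    double h;       // grid spacing
    cplx k2;        // squared wave number; a complex value models absorption

    std::size_t size() const { return n * n * n; }

    // x and y store nrhs vectors back to back: entry i of column r is x[r * size() + i].
    void apply(const std::vector<cplx>& x, std::vector<cplx>& y,
               std::size_t nrhs) const {
        const double inv_h2 = 1.0 / (h * h);
        const std::size_t N = size();
        y.assign(nrhs * N, cplx(0.0, 0.0));
        for (std::size_t r = 0; r < nrhs; ++r) {
            const cplx* xr = x.data() + r * N;
            cplx* yr = y.data() + r * N;
            for (std::size_t iz = 0; iz < n; ++iz) {
                for (std::size_t iy = 0; iy < n; ++iy) {
                    for (std::size_t ix = 0; ix < n; ++ix) {
                        const std::size_t c = (iz * n + iy) * n + ix;
                        // Central coefficient plus the six nearest neighbours.
                        cplx s = -6.0 * xr[c];
                        if (ix > 0)     s += xr[c - 1];
                        if (ix + 1 < n) s += xr[c + 1];
                        if (iy > 0)     s += xr[c - n];
                        if (iy + 1 < n) s += xr[c + n];
                        if (iz > 0)     s += xr[c - n * n];
                        if (iz + 1 < n) s += xr[c + n * n];
                        yr[c] = inv_h2 * s + k2 * xr[c];
                    }
                }
            }
        }
    }
};
```

No matrix entries are stored in this sketch: the operator is defined entirely by its action on a block of vectors, which is the only thing a transpose-free Krylov method such as TFQMR needs from it. On a GPU the three grid loops and the loop over right-hand sides would typically be mapped to threads rather than executed sequentially.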