In Proc. 19th ACM Symposium on Parallelism in Algorithms and Architectures, pages 61-70, 2007.
We analyze the problem of sparse-matrix dense-vector multiplication (SpMV) in the I/O-model. The task of SpMV is to compute y:=Ax, where A is a sparse Nx N matrix and x and y are vectors. Here, sparsity is expressed by the parameter k that states that A has a total of at most kN nonzeros, i.e., an average number of k nonzeros per column. The extreme choices for parameter k are well studied special cases, namely for k=1 permuting and for k=N dense matrix-vector multiplication.
We study the worst-case complexity of this computational task, i.e., what is the best possible upper bound on the number of I/Os depending on k and N only. We determine this complexity up to a constant factor for large ranges of the parameters. By our arguments, we find that most matrices with kN nonzeros require this number of I/Os, even if the program may depend on the structure of the matrix. The model of computation for the lower bound is a combination of the I/O-models of Aggarwal and Vitter, and of Hong and Kung.
We study two variants of the problem, depending on the memory layout of A. If A is stored in column major layout, SpMV has I/O complexity Θ(min{kN/B·(1+logM/B (N/max{M,k})),kN}) for k≤ N1-ε and any constant 1>ε > 0. If the algorithm can choose the memory layout, the I/O complexity of SpMV is Θ(min{kN/B·(1+logM/B (N/(kM))),kN}) for k≤ N1/3.
In the cache oblivious setting with tall cache assumption M≥ B1+ε, the I/O complexity is OkN/B·(1+logM/B (N/k)) for A in column major layout.
Copyright noticeCopyright © 2007 by the Association for Computer Machinery, Inc.
Online version
spaa07.pdf (302 Kb)
DOI
BIBTEX entry
@inproceedings{spaa07,
author = "Michael A. Bender and Gerth Stølting Brodal and Rolf Fagerberg and Riko Jacob and Elias Vicari",
booktitle = "Proc. 19th ACM Symposium on Parallelism in Algorithms and Architectures",
doi = "10.1145/1248377.1248391",
isbn = "978-1-59593-667-7",
pages = "61-70",
title = "Optimal Sparse Matrix Dense Vector Multiplication in the I/O-Model",
year = "2007"
}