For over ten years, recursion and New Data Structures (NDS) have been used to increase the performance of Dense Linear Algebra (DLA) factorization algorithms. Over the past four years or so, almost all computer manufacturers have dramatically changed their computer architectures to designs they call Multi-Core (MC). It turns out that these new designs give poor performance with the traditional designs of DLA libraries such as LAPACK and ScaLAPACK. Recent results from Jack Dongarra's group at the Innovative Computing Laboratory in Knoxville, Tennessee, and from many other researchers have shown how to obtain high performance for DLA factorization algorithms on the Cell architecture, an example of an MC processor, but only when NDS are used. In this talk we will give some reasons why this is so. We concentrate on the unsolved problem of transforming in-place between NDS and the two standard data structures of DLA, namely, Column Major (CM) or Row Major (RM) array order and packed array formats for symmetric and triangular arrays. We show that fast solutions to this problem exist. This work is important because it allows existing level three LAPACK and ScaLAPACK codes to obtain the benefits of the new DLA codes being developed for MC processors.
Biography
Dr. Gustavson manages the Algorithms and Architectures group in the Mathematical Sciences Department at the IBM Thomas J. Watson Research Center. He received his B.S. degree in physics, and his M.S. and Ph.D. degrees in applied mathematics, all from Rensselaer Polytechnic Institute. He joined IBM Research in 1963. One of his primary interests has been in developing theory and programming techniques for exploiting the sparseness inherent in large systems of linear equations. Dr. Gustavson has worked in the areas of nonlinear differential equations, linear algebra, symbolic computation, computer-aided design of networks, design and analysis of algorithms, and programming applications. He and his group are currently engaged in activities that are aimed at exploiting the novel features of the IBM family of RISC processors. These include hardware design for divide and square root, new algorithms for POWER2 for the Engineering and Scientific Subroutine Library (ESSL) and for other math kernels, and parallel algorithms for distributed and shared memory processors. Dr. Gustavson has received an IBM Outstanding Contribution Award, an IBM Outstanding Innovation Award, an IBM Invention Achievement Award, two IBM Corporate Technical Recognition Awards, and a Research Division Technical Group Award. He is a Fellow of the IEEE.
This lecture is part of the Department of Electrical and Computer Engineering Distinguished Lecture Series. Information about upcoming lectures is available at http://www.ece.rutgers.edu/lecture/.
Refreshments will be served at 10:30 a.m.