About ExBLAS


ExBLAS stands for Exact (fast, accurate, and reproducible) Basic Linear Algebra Subprograms.

The increasing power of current computers enables one to solve more and more complex problems. Solving these problems requires performing a huge number of floating-point operations, each of which can introduce a round-off error. Because of round-off error propagation, some problems must be solved with a longer floating-point format.

As Exascale computing is likely to be reached within a decade, getting accurate results in floating-point arithmetic on such computers will be a challenge. Another challenge will be the reproducibility of results -- meaning getting a bitwise identical floating-point result from multiple runs of the same code -- due to the non-associativity of floating-point operations and dynamic scheduling on parallel computers.
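The non-associativity mentioned above is easy to observe with just three doubles. A minimal Python illustration (the values are chosen for demonstration and are not from ExBLAS itself):

```python
# Floating-point addition is not associative: the same three values
# summed in a different order can produce a different result.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # (1e16 - 1e16) + 1.0 = 1.0
right = a + (b + c)  # -1e16 + 1.0 rounds back to -1e16, so the total is 0.0

assert left == 1.0
assert right == 0.0
assert left != right  # same inputs, different result: the order matters
```

On a parallel machine the reduction order depends on data partitioning and thread scheduling, so two runs of the same code can legitimately take the `left` and `right` paths above and return different answers.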

ExBLAS aims at providing new algorithms and implementations for fundamental linear algebra operations -- like those included in the BLAS library -- that deliver reproducible and accurate results with little or no loss of performance on modern parallel architectures such as Intel Xeon Phi many-core processors and GPU accelerators. We construct our approach in such a way that it is independent of data partitioning, order of computations, thread scheduling, and reduction tree schemes.
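The key idea behind order-independence is to accumulate partial results exactly and round to the target format only once, at the very end. ExBLAS does this with a long (Kulisch-style) accumulator; the sketch below uses Python's exact rational arithmetic as a stand-in for that accumulator, so the helper name and data are illustrative, not part of the ExBLAS API:

```python
import random
from fractions import Fraction

def exact_ordered_independent_sum(values):
    # Accumulate exactly, with no intermediate rounding; this mimics
    # the role of ExBLAS's long accumulator. Every IEEE double is
    # exactly representable as a Fraction, so the loop is error-free.
    acc = Fraction(0)
    for v in values:
        acc += Fraction(v)
    # A single, correctly-rounded conversion back to double at the end:
    return float(acc)

data = [1e16, 1.0, -1e16, 1e-8, 3.14] * 1000
shuffled = data[:]
random.shuffle(shuffled)

# Because the accumulation is exact, the result is bitwise identical
# for any ordering of the inputs:
assert exact_ordered_independent_sum(data) == exact_ordered_independent_sum(shuffled)
```

Exact rationals are far too slow for HPC use; the point of the sketch is only the structure -- exact accumulation followed by one final rounding -- which is what makes the result independent of partitioning and scheduling.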


People



Contact


We would appreciate any feedback, bug reports, suggestions, or comments that could help us improve ExBLAS. Please email us at:


Download


  • Compressed sources (tar.gz). Current version includes ExSUM, ExDOT, ExGEMV, ExTRSV, and ExGEMM
  • Documentation (html) (pdf)
  • ExBLAS is distributed under the Modified BSD license (LICENSE)


Publications


  • R. Iakymchuk, S. Collange, D. Defour, and S. Graillat. "ExBLAS: Reproducible and Accurate BLAS Library". In the Proceedings of the Numerical Reproducibility at Exascale (NRE2015) workshop held as part of the Supercomputing Conference (SC15). Austin, TX, USA, November 15-20, 2015. HAL ID: hal-01202396. (pdf)
  • S. Collange, D. Defour, S. Graillat, and R. Iakymchuk. "Numerical Reproducibility for the Parallel Reduction on Multi- and Many-Core Architectures". Parallel Computing. 2015. DOI: http://dx.doi.org/10.1016/j.parco.2015.09.001. HAL ID: hal-00949355, version 4. (pdf)
  • R. Iakymchuk, D. Defour, S. Collange, and S. Graillat. "Reproducible and Accurate Matrix Multiplication for GPU Accelerators". Lecture Notes in Computer Science (To appear). HAL ID: hal-01102877. (pdf)
  • R. Iakymchuk, S. Graillat, S. Collange, and D. Defour. "ExBLAS: Reproducible and Accurate BLAS Library". Poster at the 7ème Rencontre Arithmétique de l'Informatique Mathématique (RAIM 2015), Rennes, France, April 7-9, 2015. HAL ID: hal-01140280. (pdf)
  • R. Iakymchuk, D. Defour, S. Collange, and S. Graillat. "Reproducible Triangular Solvers for High-Performance Computing". In the Proceedings of the 12th International Conference on Information Technology: New Generations (ITNG 2015), Special track on: Wavelets and Validated Numerics, April 13-15, 2015, Las Vegas, Nevada, USA. (To appear). HAL ID: hal-01116588, version 2. (pdf)
  • S. Collange, D. Defour, S. Graillat, and R. Iakymchuk. "Reproducible and Accurate Matrix Multiplication for High-Performance Computing". In Book of Abstracts of the 16th GAMM-IMACS International Symposium on Scientific Computing, Computer Arithmetic and Validated Numerics (SCAN'14). Würzburg, Germany, September 21-26, 2014. (pdf)
  • S. Collange, D. Defour, S. Graillat, and R. Iakymchuk. "A Reproducible Accurate Summation Algorithm for High-Performance Computing". In Proceedings of the SIAM Workshop on Exascale Applied Mathematics Challenges and Opportunities (EX14) held as part of the 2014 SIAM Annual Meeting. Chicago, IL, USA, July 6-11, 2014. (pdf)
  • S. Collange, D. Defour, S. Graillat, and R. Iakymchuk. "Full-Speed Deterministic Bit-Accurate Parallel Floating-Point Summation on Multi- and Many-Core Architectures". HAL ID: hal-00949355. February 2014. (pdf)


Talks


  • R. Iakymchuk. "ExBLAS: Reproducible and Accurate BLAS Library". At the Numerical Reproducibility at Exascale (NRE2015) workshop held as part of the Supercomputing Conference (SC15). Austin, TX, USA, November 20th, 2015. (pdf)
  • R. Iakymchuk. "Reproducibility and Accuracy for High-Performance Computing". At the 7ème Rencontre Arithmétique de l'Informatique Mathématique (RAIM 2015), Rennes, France, April 7-9, 2015. (pdf)
  • R. Iakymchuk. "Reproducible and Accurate BLAS Routines towards ExaScale Computing". At the Pequan Seminar, LIP6, Sorbonne Universités, UPMC Univ Paris 06, January 15th, 2015. (similar to the talk at Aric) (pdf)
  • R. Iakymchuk. "Numerical Reproducibility of BLAS Routines towards ExaScale Computing". At the Aric Seminar, LIP, ÉNS Lyon, December 11th, 2014. (pdf)
  • R. Iakymchuk. "Reproducible and Accurate Algorithms for ExaScale Computing". At the ICS Matinée Jeunes Chercheurs, Sorbonne Universités, UPMC Univ Paris 06, November 5th, 2014. (pdf)
  • S. Collange, D. Defour, S. Graillat, and R. Iakymchuk. "Reproducible and Accurate Matrix Multiplication in ExBLAS for High-Performance Computing". At SCAN 2014, the 16th GAMM-IMACS International Symposium on Scientific Computing, Computer Arithmetic and Validated Numerics. Würzburg, Germany, September 21-26, 2014. (pdf)
  • S. Collange, D. Defour, S. Graillat, and R. Iakymchuk. "A Reproducible Accurate Summation Algorithm for High-Performance Computing". At the SIAM Workshop on Exascale Applied Mathematics Challenges and Opportunities (EX14). Chicago, IL, USA, July 6, 2014. (pdf)


Acknowledgements


  • Thanks to Chemseddine Chohra (UPVD) for his feedback.
  • This work was undertaken (partially) in the framework of CALSIMLAB and is supported by the public grant ANR-11-LABX-0037-01 overseen by the French National Research Agency (ANR) as part of the "Investissements d'Avenir" program (reference: ANR-11-IDEX-0004-02).
  • This work was granted access to the HPC resources of The Institute for Scientific Computing and Simulation financed by Region Île-de-France and the project Equip@Meso (reference ANR-10-EQPX-29-01) overseen by the French National Research Agency (ANR) as part of the "Investissements d'Avenir" program.
  • This work was also supported by the FastRelax project through the ANR public grant (reference: ANR-14-CE25-0018-01).