Can Different Access Patterns Improve a Matrix's Condition Number?

  • Context: Graduate 
  • Thread starter: nurfherder
  • Tags: Condition, Matrix

Discussion Overview

The discussion revolves around the condition number of symmetric positive-definite matrices and its potential improvement through different access patterns in iterative solvers. Participants explore the implications of accessing matrices in column-major versus row-major order and the performance differences observed when running iterative algorithms like Jacobi on GPUs compared to CPUs.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant questions whether changing the access pattern of a matrix in iterative solvers can improve its condition number, specifically comparing column-major and row-major orders.
  • Another participant asserts that the order of operations will not affect the condition number or the eigenvalue spectral radius.
  • Several participants discuss the unexpected observation that iterative algorithms like Jacobi converge in fewer iterations on GPUs than on CPUs, despite using the same model and tolerance.
  • One participant suggests that differences in convergence may arise from compiler optimizations, bugs, or variations in floating-point arithmetic across platforms.
  • A later reply indicates that round-off errors in GPU calculations may inadvertently benefit convergence rates, contrasting with the higher precision of CPU calculations.
  • Another participant recalls a historical anecdote about precision issues in CFD software, drawing a parallel to current GPU performance discussions.

Areas of Agreement / Disagreement

Participants express differing views on whether access patterns can influence the condition number, and there is no consensus on the reasons behind the observed differences in convergence rates between GPU and CPU implementations.

Contextual Notes

Participants note potential limitations related to the precision of calculations and the impact of round-off errors, but these aspects remain unresolved within the discussion.

Who May Find This Useful

This discussion may be of interest to researchers and practitioners in numerical methods, computational physics, and those exploring the performance of iterative solvers on different hardware architectures.

nurfherder
Hello all,

I am new to this forum but glad I found it. I have a quick question about condition numbers and order of operations.

Given a symmetric positive-definite matrix with an initial condition number α, is it possible to improve that condition number with a different access pattern? For example, if I access the matrix within the context of an iterative solver (e.g., Jacobi) in column-major order, would it improve the condition number over access done in row-major order?

I am doing some personal research into iterative solvers and convergence rates, and I would like to know whether the condition number, and thus the total number of iterations to converge, can be improved significantly by something so small.

Thank you.
 
Never mind my initial post: the order of operations as applied to column-major versus row-major access will have no effect on the condition number of the matrix. The access order should not affect the eigenvalue spectral radius.
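To illustrate (a quick numpy check, not part of the original post): the condition number depends only on the matrix's entries, so the same symmetric positive-definite matrix stored row-major versus column-major gives identical values.

```python
import numpy as np

# Build a symmetric positive-definite matrix, then store the same
# values in row-major ('C') and column-major ('F') memory layout.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = B @ B.T + 50 * np.eye(50)      # SPD: B*B^T is PSD, +50I makes it PD

A_row = np.asarray(A, order="C")   # row-major storage
A_col = np.asarray(A, order="F")   # column-major storage

# The condition number is a property of the matrix's values,
# not of how those values are laid out in memory.
print(np.linalg.cond(A_row))
print(np.linalg.cond(A_col))       # same value
```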

Does anyone have any clue as to why an iterative algorithm, such as Jacobi, would take fewer iterations to converge when executed on the GPU versus the CPU? The model and tolerance are exactly the same in both cases, so I cannot understand how the GPU takes fewer iterations using a Krylov search space. I have executed the SAME code for CPU and GPU (except of course the CPU has NO calls to the GPU) on two different sets of CPUs and GPUs (one double precision - Tesla, and one not - Quadro) and get exactly the same result.

Any ideas would be great; I think I might have broken one of my neurons on this one.

Thanks.
 
nurfherder said:
Does anyone have any clue as to why an iterative algorithm, such as Jacobi, would take fewer iterations to converge when executed on the GPU versus the CPU? The model and tolerance are exactly the same in both cases, so I cannot understand how the GPU takes fewer iterations using a Krylov search space.

That doesn't make any sense to me. If you do the EXACT same operations, you should get exactly the same results.

The explanation may be something to do with compiler optimisation, compiler bugs, library routines, different implementations of floating point arithmetic, etc. The only way to nail that is to compare your two calculations step by step. If there are differences, they will probably show up (if only in the last decimal place) on small matrices as well as on big ones.
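One concrete mechanism behind that last point, sketched here as a hypothetical float32 example (not from the thread): floating-point addition is not associative, so a GPU that reduces a dot product in a different order than a CPU's sequential loop can return a slightly different value, which then shifts a tolerance-based stopping test.

```python
import numpy as np

# In float32, 1e8 has a unit-in-last-place of 8, so adding 1.0 to it
# is absorbed. The grouping of the additions therefore changes the
# result, even though the three summands are identical.
x = np.float32(1e8)
one = np.float32(1.0)

sequential = (x + one) + (-x)   # the 1.0 is lost inside the big sum
reordered = (x + (-x)) + one    # the big terms cancel first

print(sequential)   # 0.0
print(reordered)    # 1.0
```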
 
You are right - it doesn't make sense to me either. I was just wondering if there was an obvious and therefore easy reason.

Thank you for your help and time.
 
I found the problem.

It turns out that the GPU typically has some round-off error that happens to benefit iterative solvers, such that the higher-precision CPU takes more iterations to converge. The small inaccuracies of the GPU become magnified when doing large sets of summations - such as those found in the dot product of the iterative solver I am using.

It is sneaky and is typically solved by using double precision (CUDA arch. 1.3 or greater) or algorithmically with Kahan summation.
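For reference, a minimal sketch of Kahan (compensated) summation in float32; the function name kahan_sum is mine for illustration, not a library routine. A running compensation term recovers the low-order bits that a plain accumulator drops.

```python
import numpy as np

def kahan_sum(values):
    """Compensated summation: track the rounding error of each add."""
    total = np.float32(0.0)
    comp = np.float32(0.0)                 # running compensation term
    for v in values:
        y = np.float32(v) - comp           # re-inject previously lost bits
        t = np.float32(total + y)
        comp = np.float32(t - total) - y   # what (total + y) just rounded away
        total = t
    return total

# 0.01 is not exactly representable, so a naive float32 accumulator
# drifts over many additions; the compensated sum stays close to 1000.
data = [np.float32(0.01)] * 100_000
naive = np.float32(0.0)
for v in data:
    naive = np.float32(naive + v)

print(naive)            # drifts away from 1000.0
print(kahan_sum(data))  # much closer to 1000.0
```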
 
nurfherder said:
It turns out that the GPU typically has some round-off error that happens to benefit iterative solvers, such that the higher-precision CPU takes more iterations to converge.

Hm... long before the days of GPUs, I remember a CFD software guru trying to convince me that his code worked better in 32 bit precision arithmetic than in 64 (In fact it didn't work at all in 64, on most problems).

Maybe he gave up trying to sell his CFD software and went into GPU design ... :smile:
 
