Numerical floating point addition

Khashishi
If I have a bunch of sorted numbers spanning a large range of magnitudes, is it better to add them up from smallest to largest or from largest to smallest, or something else?

Let's say I'm summing an array A, which is sorted from large to small. Which gives a more accurate result:

sum1 = 0
for i=0,length(A)-1
sum1 += A[i]
end

sum2 = 0
for i=length(A)-1,0,-1
sum2 += A[i]
end
 
Adding the numbers in order, smallest to largest (in magnitude), is better (less truncation error): the small terms get a chance to accumulate before they meet a much larger partial sum.
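To make the difference concrete, here is a minimal C sketch of the two loops from the question. The function names sum_forward and sum_backward are made up for this illustration, and the code assumes IEEE 754 doubles with the default round-to-nearest mode and an array sorted from large to small, as in the question.

```c
#include <stddef.h>

/* Hypothetical helper: add the elements in array order.
   With A sorted from large to small, this adds the largest first. */
double sum_forward(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hypothetical helper: add the elements in reverse order.
   With A sorted from large to small, this adds the smallest first. */
double sum_backward(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = n; i > 0; i--)
        sum += a[i - 1];
    return sum;
}
```

For example, with A = {9007199254740992.0, 1.0, 1.0, 1.0, 1.0} (that is, 2^53 followed by four ones), sum_forward loses every 1.0, because 2^53 + 1 is not representable as a double and rounds back to 2^53. sum_backward first builds 4.0 from the ones, and 2^53 + 4 is representable, so nothing is lost.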

It's also possible to reduce truncation error further by using a function and an array of 2048 doubles to hold intermediate sums, where the index into the array is the exponent field of the double precision number (in C, index = ((*(unsigned __int64 *)(&number)) >> 52) & 0x7ff;). The array is initialized to zero, and each time a new number is to be added, the index for that number is generated. If array[index] == 0., the number is simply stored there. Otherwise number = number + array[index]; array[index] = 0.; a new index is generated for the updated number, and the process repeats until array[index] == 0. and the number is stored (or until overflow is detected). Once all numbers have been added, the array is summed from index = 0 to index = 0x7ff to produce the final sum. The purpose of this is to minimize truncation error: during accumulation, a number is only ever added to a stored value with the same exponent.
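The procedure described above can be sketched in C roughly as follows. This is one possible reading of it, not a reference implementation. The names ExpSum, expsum_init, expsum_add and expsum_total are invented for the sketch. It assumes 64-bit IEEE 754 doubles and uses memcpy instead of the pointer cast to read the bits. The post does not say what to do when overflow is detected, so routing infinities and NaNs straight into the last slot is an assumption.

```c
#include <stdint.h>
#include <string.h>

#define NUM_BINS 2048   /* one slot per possible 11-bit exponent field */

typedef struct {
    double bin[NUM_BINS];   /* intermediate sums, indexed by exponent */
} ExpSum;

/* Exponent field (bits 52..62) of an IEEE 754 double. */
static unsigned exponent_index(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (unsigned)((bits >> 52) & 0x7ff);
}

void expsum_init(ExpSum *s)
{
    for (int i = 0; i < NUM_BINS; i++)
        s->bin[i] = 0.0;
}

void expsum_add(ExpSum *s, double x)
{
    for (;;) {
        unsigned i = exponent_index(x);
        if (i == 0x7ff) {           /* infinity or NaN (assumed overflow handling) */
            s->bin[i] += x;
            return;
        }
        if (s->bin[i] == 0.0) {     /* empty slot: store the number and stop */
            s->bin[i] = x;
            return;
        }
        x += s->bin[i];             /* combine with the same-exponent value */
        s->bin[i] = 0.0;            /* empty the slot, then retry with a new index */
    }
}

/* Final pass: add the slots from smallest exponent to largest. */
double expsum_total(const ExpSum *s)
{
    double total = 0.0;
    for (int i = 0; i < NUM_BINS; i++)
        total += s->bin[i];
    return total;
}
```

Each pass through the loop in expsum_add either stores the number or empties one occupied slot, so it finishes after at most 2048 passes. Feeding it 2^53 followed by four 1.0 values leaves 4.0 in one slot and 2^53 in another, and the final pass returns 2^53 + 4.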
 
If they're floating point, that's a no-brainer.

Start with zero and add the small ones (small in magnitude, whether negative or positive) first.
 