[MATLAB] Making code faster/more efficient

BrownianMan · Nov 10, 2012

A is a sparse matrix. B is a vector. I have the following code:

Code:

for j=1:N
    B=A*B;
end;

This part of the code is inside a function which gets called about 160000 times. I ran the Profiler and this part is the bottleneck. How can I make it more efficient?

mfb · Nov 11, 2012

Does your A vary? If it is constant (or does not change every time), calculate A^n first.
Even it A is different every time, it might reduce the required time, as there could be special algorithms to calculate A^n more efficient.

BrownianMan · Nov 11, 2012

Yes, A is constant.

I tried that. But the problem is that N is 200. So it actually takes more time to calculate A^N. I was wondering if there are more efficient ways to compute A^N. I tried using [v d]=eigs(A) and v*(d.^N)*v' but it said it wasn't able to find eigenvalues with sufficient accuracy.

gneill · Nov 11, 2012

BrownianMan said:

Yes, A is constant.

I tried that. But the problem is that N is 200. So it actually takes more time to calculate A^N. I was wondering if there are more efficient ways to compute A^N. I tried using [v d]=eigs(A) and v*(d.^N)*v' but it said it wasn't able to find eigenvalues with sufficient accuracy.

So A is only constant for each call of the function (it is different each time the function is called)?

Note that you can speed up exponentiation by taking advantage of some properties of exponents. Rather than repeatedly multiplying by A 200 times, there's nothing to prevent you forming A² = A*A, then A⁴ = A²*A², then A16 = A⁴*A⁴, and so on to carry out the bulk of the multiplications. Save appropriate intermediate results to have the right ones on hand: 200 = 128 + 64 + 8.

BrownianMan · Nov 11, 2012

A depends on the inputs of the function, but inside the loop it does not change. For a given set of inputs, it is constant.

As far as taking advantage of properties of exponents, N is also an input. N=200 is what I am suppose to use to show that it gives the correct result for that specific N, but N can take on any integer value specified by the user. So I do not know N in advance.

gneill · Nov 11, 2012

BrownianMan said:

A depends on the inputs of the function, but inside the loop it does not change. For a given set of inputs, it is constant.

As far as taking advantage of properties of exponents, N is also an input. N=200 is what I am suppose to use to show that it gives the correct result for that specific N, but N can take on any integer value specified by the user. So I do not know N in advance.

Okay, but any N can be decomposed into a sum of powers of 2 in a similar fashion.

BrownianMan · Nov 11, 2012

Is there a function that allows you to do that in Matlab?

gneill · Nov 11, 2012

BrownianMan said:

Is there a function that allows you to do that in Matlab?

Dunno. I'm not familiar with the Matlab function library. But it's the same as converting the decimal number N to binary. Each bit then represents a power of two.

BrownianMan · Nov 11, 2012

gneill said:

Dunno. I'm not familiar with the Matlab function library. But it's the same as converting the decimal number N to binary. Each bit then represents a power of two.

Ok, I wrote a function to do that and then replaced the slow loop with this:

Code:

vec=sumPowersOf2(N); %vector of elements of powers of 2 that sum to N
AN=eye(N_space-1);
for i=1:numel(vec)
   AN=AN*A^vec(i);
end

But this is actually slower.

gneill · Nov 11, 2012

The idea would be to reuse already computed powers, not compute each power of two individually. For example, if you've already computed A^2 and you need A^4, take the already computed value of A^2 and multiply it by itself...

For A^200 you'd be looking at 200 = 2^7 + 2^6 + 2^3, so

a2 = A*A
a4 = a2*a2
a8 = a4*a4
a16 = a8*a8
a32 = a16*a16
a64 = a32*a32
a128 = a64*a64

Then A^200 is given by a8*a64*a128

So a total of 9 matrix multiplies. Yes, there will be a tradeoff in terms of space -- keeping around a lot of partial results which might be large matrices will up the memory usage. You might be able to reuse the temporary variables which are not required for the final calculation. So in the above, the a2 variable is not required after a4 is calculated, and a4 is not required again after a8 is calculated, and so on. Just hang onto the ones you need at the end: a8, a64, a128.

mfb · Nov 11, 2012

This should be quick:

N is stored as binary in your program anyway, for example as 11001000.
In addition, define two places to store matrices, which I call "power of A" and "result" here. Store A in "power of A" and 1 (the identity matrix) in "result".

Now, loop over the bits of N:
If the last bit is 1 (if 1 AND N is true, where AND is done bitwise), multiply "result" with "power of A". Otherwise, ignore that step.
Multiply "power of A" with "power of A".
Shift N by one bit (n=n/2, n>>1 or whatever MATLAB wants)
if N==0, quit, otherwise loop again

For N=200 as example, the results of the loop look like that:

N=11001000, "power of A"=A, "result"=1
(1 AND N)==0, do not change result, square "power of A", shift N...
N=1100100, "power of A"=A^2, "result"=1
... same as above
N=110010, "power of A"=A^4, "result"=1
N=11001, "power of A"=A^8, "result"=1
(1 AND N)==0, multiply "result" by "power of A", square "power of A", shift N...
N=1100, "power of A"=A^16, "result"=A^8
(1 AND N)==0, ...
N=110, "power of A"=A^32, "result"=A^8
N=11, "power of A"=A^64, "result"=A^8
N=1, "power of A"=A^128, "result"=A^72
N=0, "power of A"=A^256, "result"=A^200
Done. 10 non-trivial matrix multiplications required.

You can skip the last change of "power of A" with a bit more programming logic.

[MATLAB] Making code faster/more efficient

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Similar threads

Engineering Joint and Marginal Distributions of a Randomly Selected Test Answer

Engineering Drawing Acceleration Diagram with Coriolis Acceleration

Engineering Shear Stress Question (Rocker Arm & pin diameter)

Engineering Enumeration vs Bisection Strategy

Engineering Kinematic study of connected rods

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect