Retrieving Indices of a vector whose elements fulfil a criteria

Click For Summary

Discussion Overview

The discussion revolves around optimizing the retrieval of indices from vectors and matrices in MATLAB, particularly focusing on finding the greatest values and specific conditions such as the half-width half-maximum of a Gaussian function. The scope includes technical explanations and potential optimizations in coding practices.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Exploratory

Main Points Raised

  • One participant seeks a method to find the indices of the top ten values in a vector without using a time-consuming loop.
  • Another participant suggests using MATLAB's sort function, which can return both the sorted values and their corresponding indices.
  • A participant expresses surprise at the existence of the second output of the sort function, indicating a lack of awareness despite using it frequently.
  • A follow-up question is posed regarding efficiently finding a specific row in a vector, particularly in relation to the half-width half-maximum of a Gaussian function.
  • It is noted that repeatedly calling max within a loop can significantly slow down the process, and caching the maximum value outside the loop is recommended for efficiency.
  • A suggestion is made to use the min function to find the index of the half-width half-maximum without a loop, proposing a more efficient approach.

Areas of Agreement / Disagreement

Participants generally agree on the need for optimization in the code, and multiple approaches to achieve this are discussed. However, there is no consensus on a single best method, as various techniques are proposed and explored.

Contextual Notes

Some limitations are noted, such as the dependence on the specific structure of the data and the potential for inefficiencies in the current coding practices. The discussion does not resolve the optimal method for all scenarios.

mikeph
Messages
1,229
Reaction score
18
I have a 100x1 vector A, and want to know the ten indices i for which A(i) are the greatest 10 values in A.

I can find the values by:
Code:
B = sort(A)
B(1:10);

But I don't know how to find the indices, apart from starting a new if loop which searches the entire matrix (which seems very time consuming).

As another example, I have a matrix B which is 10,000x2 and I want to know the value of B(i,1) for which B(i,2) is the maximum value in the column vector B(:,2). Eg.

Code:
>> B = [(5:0.1:5.9)',rand(10,1)]
B =
   5.000000000000000   0.696266337082995
   5.100000000000000   0.093820026774866
   5.200000000000000   0.525404403859336
   5.300000000000000   0.530344218392863
   5.400000000000000   0.861139811393332
   5.500000000000000   0.484853333552102
   5.600000000000001   0.393456361215266
   5.700000000000000   0.671431139674026
   5.800000000000001   0.741257943454206
   5.900000000000000   0.520052467390387

>> for i = 1:10
if B(i,2) == max(B(:,2))
B(i,1)
end
end
ans =
   5.400000000000000

However I think this is taking absolutely ages in my code (I need to find this 20bn times so it's definitely an area for optimisation). Is there a simpler way to find the indices?
 
Physics news on Phys.org
It looks like you are using Matlab. Matlab's sort function can return two values, the sorted array and an array of permutation indices.
Code:
[B,idx] = sort (A);
 
I didn't realize sort had that second output! I've been using it for ages as well.

Thanks for the fast response!
 
Following on from this, is there any way to quickly find a specific row in a vector?

For example, searching for the half-width half-maximum of a Gaussian

Code:
>> x = -20:0.0001:20;
>> f = 4*exp(-x.^2/3);
>> time = cputime; for i = 200000:400001
if abs(f(i) - max(f)/2) < 0.0001
x(i)
break
end
end; cputime-time
ans =
    1.4420
ans =
   45.2031

ie. I can find the value but it takes a long time. 45 seconds is ordinarily fine but I need to perform this many, many times.

Is there no "search" function?

BTW. In the example I realize that there is probably no i such that f(i) = 2 exactly. If there is no answer to this question, would there be an answer if there did exist an i where f(i) = 2?
 
In the above example, you are calling max(f) 200,001 times, each invocation of which requires searching over a list of 400,001 elements. Of course that is slow. Cache the value of max(f) outside the loop and you will see a dramatic speed bump.
 
What about instead of the loop, doing something like this (using your above definitions of f and x):

[junk, halfwidthIndex] = min(abs(f - max(f)/2));
x(halfwidthIndex)
 

Similar threads

  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 5 ·
Replies
5
Views
2K
  • · Replies 8 ·
Replies
8
Views
3K
  • · Replies 7 ·
Replies
7
Views
1K
  • · Replies 0 ·
Replies
0
Views
3K
  • · Replies 24 ·
Replies
24
Views
3K
  • · Replies 2 ·
Replies
2
Views
2K