
Matlab help: indexing in for loops

  1. Sep 25, 2010 #1
    Hello!

    I usually consider myself a confident Matlab user. But I have encountered a new problem that I have no idea how to solve, so I'm hoping I can find some help.

    I have set up a for loop in my code to the effect of

    for i = 1:n
        if MyArray(i,1) < some value
            perform other calculations
        end
    end

    The code runs fine for small values of n. But, when I have a very large n (somewhere near 50,000), I get an error message:
    ??? Attempted to access MyArray(NaN,1); index must be a positive integer or logical.
    And, checking on the value of the index i, it has become NaN.

    I don't understand how this is possible as Matlab should be incrementing i itself within the for loop. (I promise I'm not messing with the index inside the for loop!) I thought it might be a problem with the variable type of i, since I think values stored as integers have a relatively small limit, but it looked like i was stored as type long, which should be able to hold much larger than 50,000.
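    A minimal sketch of how the type of the loop index could be checked. The bound n = 50000 is a placeholder, and the names idxClass and maxInt32 are introduced only for illustration. As far as I know, MATLAB has no "long" numeric class ("format long" only changes how numbers are displayed); a loop over 1:n with a double n gives a double index.

```matlab
n = 50000;                    % placeholder loop bound (assumption)
for i = 1:n
    % loop body omitted
end

idxClass = class(i);          % class of the loop index after the loop
maxInt32 = intmax('int32');   % largest value a 32-bit signed integer can hold
```

    Here idxClass should come back as 'double', and maxInt32 is far above 50,000, so an integer limit on the index is unlikely to be the cause.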

    Does anyone have any suggestions? Your help is much appreciated!

    -S
     
  3. Sep 26, 2010 #2
    If you aren't touching i inside the loop then this looks really strange. Can you provide the code / MATLAB build so I can try to replicate the results? Perhaps cut down the code inside the loops to some simple operation which still produces the error message if you don't want to post the entire thing?
     
  4. Sep 26, 2010 #3
    I have tried to create a simpler version of my code for posting. What this code does (or is supposed to do):
    • First, create a matrix (nrpts x 2) that contains my data (for this test code, just a bunch of random numbers). Next, create an nrpts x 1 array that contains the series 1, 2, 3, 4, etc. (this array is needed later in my full code, so it doesn't serve much purpose in the example code).
    • I have a pair of nested for loops to go through the data held in objectivePoints. The goal is to compare each pair of data points objectivePoints(i,:) to each other. I am trying to compare a point (a1, b1) to (a2, b2), and if both a1 < a2 and b1 < b2, set the entry for the point (a2, b2) to zero in pointIndex.
    • Also, it's not meaningful to compare a pair to itself so there is an if check on count1 ~= count2.
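    To make the comparison rule in the list above concrete, here is a small sketch on a hypothetical three-point data set (the numbers are made up for illustration; the variable names follow the description).

```matlab
% Hypothetical 3-point data set, one (a, b) pair per row
objectivePoints = [0.2 0.3; 0.5 0.6; 0.1 0.9];
nrpts = size(objectivePoints, 1);
pointIndex = (1:nrpts)';

for count1 = 1:nrpts
    for count2 = 1:nrpts
        % Skip self-comparison; zero point count2 if point count1
        % is smaller in both columns
        if count1 ~= count2 ...
                && objectivePoints(count1,1) < objectivePoints(count2,1) ...
                && objectivePoints(count1,2) < objectivePoints(count2,2)
            pointIndex(count2) = 0;
        end
    end
end
% pointIndex is now [1; 0; 3]
```

    Only the second point is zeroed, because the first point is smaller than it in both columns; the third point has the smallest first value, so nothing beats it in both.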

    I ran this code a few times and got an error about half of the time. The error I get is (with various numbers instead of 63313):
    ??? Attempted to access objectivePoints(63313,-2.14748e+009); index must be a positive integer or logical.
    which really makes no sense, considering that I am only looking at objectivePoints(*,1) or (*,2). I occasionally get this error instead of the NaN error on my original code as well.

    I am totally stumped! Please do let me know if you also encounter this error. Thanks! (Also, there is probably a far more efficient way to approach the comparison problem, so if you happen to have any great ideas on a better way to do that, let me know, since it is clearly a bit slow as I have it set up currently.)




    clear all

    % Number of points tested
    nrpts = 1e5;

    % Create example data set
    objectivePoints = zeros(nrpts,2);
    objectivePoints = rand(size(objectivePoints));

    for k = 1:nrpts
        pointIndex(k,1) = k;
    end


    % Comparison test
    for count1 = 1:nrpts
        for count2 = 1:nrpts
            if count1 ~= count2
                if objectivePoints(count1,1) < objectivePoints(count2,1)
                    if objectivePoints(count1,2) < objectivePoints(count2,2)

                        pointIndex(count2,1) = 0;
                    end
                end
            end
        end
    end
     
  5. Sep 27, 2010 #4
    Ok, thanks. I have tried running the code and it is still going, but there is clearly no reason why MATLAB should be looking for a matrix index of roughly negative two billion, so something is probably making it explode. I suggest you try to reformulate your algorithm to do the same calculation using fewer iterations, because you're looking up matrix entries and comparing them 10^10 times (two nested for loops of 100,000 iterations each). This is definitely why it is taking so long, and probably why your code is making MATLAB break.
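    As a quick piece of arithmetic, the sketch below computes the number of comparisons and, for reference, the smallest 32-bit signed integer, which matches the digits shown in the reported index -2.14748e+009. Whether that coincidence explains the error is only a guess; the names nComparisons and int32Min are made up for illustration.

```matlab
nrpts = 1e5;
nComparisons = nrpts^2;                % inner-loop passes: 1e10
int32Min = double(intmin('int32'));    % -2147483648, about -2.14748e9
```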


    One thing which isn't fully clear is your overall objective. I am confused because you are evaluating every pair of points against some criterion, and if the pair meets this criterion you record only the index of one of the points in the pair. I am guessing this code is used to find the index of the point which is closer to the origin than any other (according to the Manhattan metric), since if any point is found to be further away than any other point you set its index to zero?

    Perhaps there is more to this than I understand.



    Anyway, try preallocating your pointIndex matrix with
    pointIndex = zeros(nrpts,1);

    one line before the loop which fills it. If you build a large matrix one row at a time it takes ages, since MATLAB must create a new matrix and copy the entries across each time; it can't just add a row on.
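    A minimal sketch of the preallocation, together with a loop-free alternative using the colon operator; pointIndexAlt is a name made up here for comparison.

```matlab
nrpts = 1e5;

% Allocate the full column once, then fill it in place
pointIndex = zeros(nrpts, 1);
for k = 1:nrpts
    pointIndex(k,1) = k;
end

% Same result without a loop: the colon operator builds 1, 2, ..., nrpts
pointIndexAlt = (1:nrpts)';
```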

    Second, I suggest you look for a different way of comparing distances. If you are actually looking for the index of the closest point, you can just assign each point a Manhattan distance (x + y rather than sqrt(x^2 + y^2), because you are only allowed to travel north-south and east-west) and then find the minimum, using something like:

    Code (Text):
    mdist = zeros(nrpts,1);
    for i = 1:nrpts
        mdist(i) = objectivePoints(i,1) + objectivePoints(i,2);
    end
    [C,I] = min(mdist);
    I
    mdist is your distance from the origin and min finds the minimum and saves its value in C and its index in I.
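    The same calculation can probably be written without the loop by summing along the rows. This is a sketch on a hypothetical three-point data set, assuming all coordinates are non-negative so that x + y is the Manhattan distance from the origin.

```matlab
% Hypothetical data, one (x, y) point per row
objectivePoints = [0.4 0.5; 0.1 0.2; 0.9 0.1];

mdist = sum(objectivePoints, 2);   % row sums: x + y for each point
[C, I] = min(mdist);               % C = smallest distance, I = its row index
```

    For this data the second point is closest, so I is 2.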
     
  6. Sep 27, 2010 #5
    Thanks for the reply.

    I agree, this would definitely not be the best way to find the index of the closest point. The goal of the code is to locate points along the Pareto front. To illustrate, here's a diagram from Wikipedia:

    http://en.wikipedia.org/wiki/File:Front_pareto.svg

    All of the data points I have are the blue squares, and I'm trying to find points along the red line shown in the picture. But, as that picture shows, this code won't just return one point that is the closest, but many points. I am going through the comparison and looking at, say, point A and point C; if we see that f1(A) < f1(C) and f2(A) < f2(C), then we know that C is not on the Pareto front so I set its index to zero to remember that information.
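    A sketch of that test with the inner loop replaced by a logical mask, on a hypothetical five-point data set. It applies the same strict "smaller in both columns" rule; isDominated, worse and paretoIdx are names made up for illustration. It still makes on the order of n^2 comparisons, but each pass works on whole columns and needs only O(n) extra memory.

```matlab
% Hypothetical data: column 1 is f1, column 2 is f2
objectivePoints = [0.2 0.8; 0.4 0.4; 0.8 0.2; 0.5 0.5; 0.9 0.9];
nrpts = size(objectivePoints, 1);

isDominated = false(nrpts, 1);
for k = 1:nrpts
    % Points strictly larger than point k in both columns cannot be
    % on the front (a point is never strictly larger than itself)
    worse = objectivePoints(:,1) > objectivePoints(k,1) & ...
            objectivePoints(:,2) > objectivePoints(k,2);
    isDominated(worse) = true;
end

paretoIdx = find(~isDominated);    % row indices of the remaining points
```

    For this data, points 4 and 5 are removed and paretoIdx contains 1, 2 and 3.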

    The problem I am having is that even with a very large number of data points, only a few will end up lying along the Pareto front. I typically end up with 20-100 points out of 50,000. Unfortunately, 20 points really isn't enough to discern the shape of the Pareto front (it ends up looking like the Wikipedia picture and not like a nice curve). This is why I have been trying to use very large numbers of points.

    Your tip about using the Manhattan distance to compare the points is interesting and I'll have to keep thinking if there is a way I can use that. But for now, the problem is that
    f1(A) < f1(B) AND f2(A) < f2(B) implies f1(A) + f2(A) < f1(B) + f2(B)
    However, the converse isn't necessarily true.
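    A small counterexample with made-up numbers shows why the converse fails: A has the smaller sum, yet it is not smaller than B in both components.

```matlab
% Hypothetical points, written as [f1 f2]
A = [0.1 0.9];
B = [0.6 0.5];

sumSmaller = sum(A) < sum(B);   % true: 1.0 < 1.1
dominates  = all(A < B);        % false: f2(A) = 0.9 is not below f2(B) = 0.5
```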

    Anyway, after all that trouble, I think the real problem is the scale of the loop. I often get "out of memory" errors when running Matlab (on my five-year-old laptop), so I suspect that the errors I'm getting are really side effects of that. I will have to keep working on improving my algorithm instead.

    Thanks for your help!
     
    Last edited by a moderator: May 4, 2017