How to Efficiently Remove Zeros from 2D Arrays in Python?

  • Python
  • Thread starter ProPatto16
In summary: The original while loop never terminated because the result of numpy.delete was assigned to a different name, so the condition kept testing the same first row. The thread starter's workaround was to apply numpy.nonzero to a middle row and a middle column, then take min() and max() of the resulting indexes to get the first and last data positions in both the row and column direction. A later reply points out that this shortcut depends on the shape of the data: if the nonzero region is not a clean rectangle, every line has to be inspected before the bounds are known.
  • #1
ProPatto16
Hi there,

I have lots of 1024x1024 arrays whose real data is roughly 600x800 (it varies), padded with zeros all the way around, something like:

0000000000000000000
0000000000000000000
000xxxxxxxxxxxxx000
000xxxxxxxxxxxxx000
000xxxxxxxxxxxxx000
0000000000000000000
0000000000000000000

I want to delete all the zeros.

I can use numpy.all then numpy.delete, but I can only make it work with 4 while loops, essentially one from each direction, and it's just long and untidy.

I can empty all the 0 elements with trim_zeros but then can't find a nice neat way to delete them.

I haven't tried boolean masking yet since I need the element values.

ideas?

thanks
 
  • #2
ProPatto16 said:
i can empty all the 0 elements with trim_zeros but then can't find a nice neat way to delete them.
Could you elaborate on what you mean by "a nice neat way to delete them"?
 
  • #3
Is there a simple way to delete empty elements from a 2d array where you don't have to iterate every row or column?
 
  • #4
ProPatto16 said:
Is there a simple way to delete empty elements from a 2d array where you don't have to iterate every row or column?
I don't see how that would be possible. You have to inspect each element of the array.
 
  • #5
when i do it with while loops i have

Python:
while numpy.all(array[0]==0)==True:
    Calarray=numpy.delete(array,0,0)

So it checks each line and if it's all zeros then it deletes that line.

But it doesn't terminate unless I manually press return.

If I run the while condition prior to executing I get True. If I run it after I've pressed return I then get False.
How come it doesn't end on its own?

I'm doing a while loop like this from the top and bottom, then transposing, doing top and bottom again, and transposing back.
I just thought there might be a quicker way, but this works beautifully if I can get the loops to end on their own.

Thanks!
 
  • #6
ProPatto16 said:
But it doesn't terminate unless I manually press return.
If I run the while condition prior to executing I get True. If I run it after I've pressed return I then get False.
How come it doesn't end on its own?
I cannot see any test for "finished", so there is no reason why it should stop.
Also, while I do not speak Python, it seems to me that you test the same array (array[0]) over and over.
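A small sketch of the point above, assuming `array` is a NumPy array: `numpy.delete` returns a new array and leaves its input untouched, so a loop that stores the result under a different name keeps testing the same first row. The sample values and the name `trimmed` are made up for illustration.

```python
import numpy

array = numpy.array([[0, 0, 0],
                     [0, 0, 0],
                     [0, 5, 0]])

# numpy.delete returns a new array; `array` itself is unchanged.
trimmed = numpy.delete(array, 0, 0)
print(array.shape)    # (3, 3)
print(trimmed.shape)  # (2, 3)

# Rebinding the same name inside the loop lets the condition see a new first row.
# The shape check guards against an all-zero input running out of rows.
while array.shape[0] > 0 and numpy.all(array[0] == 0):
    array = numpy.delete(array, 0, 0)
print(array)  # [[0 5 0]]
```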
 
  • #7
My understanding of while loops is that as soon as the condition is false it should terminate. I.e. The false condition is the exit.

The condition tests the first row and if it returns True then deletes it. So it's always checking the first row but each time it's a new first row.
 
  • #8
ProPatto16 said:
My understanding of while loops is that as soon as the condition is false it should terminate. I.e. The false condition is the exit.

The condition tests the first row and if it returns True then deletes it. So it's always checking the first row but each time it's a new first row.
I'm fairly knowledgeable with python, but not so with numpy.

Here's the code you showed:
ProPatto16 said:
Python:
while numpy.all(array[0]==0)==True:
    Calarray=numpy.delete(array,0,0)
I'm not sure your while loop is working the way you expect. The all() function takes up to four arguments, with only the first being required. See http://docs.scipy.org/doc/numpy/reference/generated/numpy.all.html for more information. The first argument to all() is an array, NOT a boolean expression as you have.

Your argument to all() is array[0] == 0. As I understand things, array[0] is the first row of your matrix, so it is itself a list or tuple or maybe an array -- I can't tell from the code you posted. In this expression, array[0] == 0, you are comparing an array (or list or whatever) for equality to a number, 0. I don't see how this can work. My guess is that python always evaluates array[0] == 0 to True, even if the row in question has one or more nonzero values.
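For reference, a quick check of what this expression does when `array` is a NumPy array (an assumption, since the thread never shows how the array was created): comparing an array with a scalar is elementwise and yields a boolean array, which `numpy.all` then reduces to a single value. With a plain Python list the same comparison gives a single `False` instead.

```python
import numpy

array = numpy.array([[0, 0, 0],
                     [0, 7, 0]])

print(array[0] == 0)             # [ True  True  True]
print(array[1] == 0)             # [ True False  True]
print(numpy.all(array[0] == 0))  # True
print(numpy.all(array[1] == 0))  # False

# A plain list compared with a number is simply False.
print([0, 0, 0] == 0)            # False
```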
 
  • #9
I gave up on that and it works fine in like 5 lines of code with basic indexing.

For anyone that's interested ...

Using numpy.nonzero on a middle row and a middle column gives the indexes where the nonzero data starts and stops. Take min() and max() of those indexes to get the first and last positions in both the row and column direction, then use them to extract a sub-array from the original. No need to iterate over elements etc.
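The post gives no code, so the following is only a sketch of how that idea might look. It assumes the data is a solid rectangle, that the middle row and middle column pass through it, and that the outermost data values on those two lines are nonzero. The function name `crop_by_middle_lines` and the test array are hypothetical.

```python
import numpy

def crop_by_middle_lines(array):
    """Crop using the nonzero extent of the middle row and the middle column."""
    mid_row = array.shape[0] // 2
    mid_col = array.shape[1] // 2
    cols = numpy.nonzero(array[mid_row, :])[0]  # column indexes of data in the middle row
    rows = numpy.nonzero(array[:, mid_col])[0]  # row indexes of data in the middle column
    if cols.size == 0 or rows.size == 0:
        return None  # the middle lines miss the data, so this shortcut does not apply
    return array[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

# A 3x5 block of nonzero values padded with zeros to 7x9.
padded = numpy.zeros((7, 9), dtype=int)
padded[2:5, 3:8] = numpy.arange(1, 16).reshape(3, 5)
print(crop_by_middle_lines(padded).shape)  # (3, 5)
```

Only two lines of the array are scanned, which is why it is fast, and also why it gives wrong bounds when the assumptions above do not hold.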
 
  • #10
ProPatto16 said:
Don't need to iterate elements etc.
But you can be sure that under the covers, that's exactly what is happening -- i.e., each element of a subarray is being inspected.
 
  • #11
Are you sure that your first element in the sub-array cannot be zero?
See example below:
Code:
000111000
001111100
011111110
111111111
etc
Here you have a full array where you cannot delete any lines.
You can only make the decision once you have iterated through all the lines.

What is very important here is the precondition on the array you want to iterate through: what data is expected and which cases you can rule out. Your solution isn't as straightforward as you have pictured it.
In terms of improving your algorithm, you could start left to right on the first row. When you hit a nonzero value, move to the right end and iterate from right to left until you encounter a nonzero value. That reduces the number of look-ups. It can be reduced further by checking only the columns in which you have not yet found a nonzero value.
The first nonzero line indicates the top row, and the first all-zero line after it indicates the bottom row plus one.
You can then copy the rectangle from your array and you end up with a new array.
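For a shape like the one in the example above, one way to get the bounds without relying on any single row or column is to test whole rows and whole columns. This is not the element-by-element scan described in this post but a vectorized stand-in built on `numpy.any` and `numpy.flatnonzero`. The function name `bounding_box_crop` and the sample array are made up for illustration.

```python
import numpy

def bounding_box_crop(array):
    """Return the smallest rectangle containing every nonzero element, or None."""
    rows = numpy.flatnonzero(numpy.any(array, axis=1))  # rows holding a nonzero value
    cols = numpy.flatnonzero(numpy.any(array, axis=0))  # columns holding a nonzero value
    if rows.size == 0:
        return None  # nothing but zeros
    return array[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

example = numpy.array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
                       [0, 0, 0, 1, 1, 1, 0, 0, 0],
                       [0, 0, 1, 1, 1, 1, 1, 0, 0],
                       [0, 1, 1, 1, 1, 1, 1, 1, 0],
                       [0, 0, 0, 0, 0, 0, 0, 0, 0]])
print(bounding_box_crop(example).shape)  # (3, 7)
```

Zeros that fall inside the bounding rectangle are kept, since removing them would break the rectangular shape.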
 
  • #12
Not fluent in Python, but this is the way I would do it:
Python:
length = len(array)
for x in range(0, length):
    array[x] = array[x].strip('0')

Ref.: http://www.tutorialspoint.com/python/string_strip.htm

EDIT: Nevermind, I now understand that this is not what you want to accomplish at all.
 
  • #13
ProPatto16 said:
when i do it with while loops i have

Python:
while numpy.all(array[0]==0)==True:
    Calarray=numpy.delete(array,0,0)

so it checks each line and if its all zeros then it deletes that line.
That won't work, for a number of reasons. One is that the condition is wrong. Another is that you aren't changing array. You can fix this by using
Python:
while not numpy.any(array[0]) :
    array = array[1:]
This however is a bad idea. Suppose the first nonzero element is on row 1000. You will be creating and throwing away 999 slices. What you should be doing is finding the index of the first and last rows that contain something other than all zeros, and then finding the first and last columns that contain something other than all zeros. Use basic indexing (not advanced indexing) to make the second part more efficient.
Python:
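import numpy

def nonzero_submatrix(array):
    """Return the smallest slice of a 2D array holding all nonzero values, or None."""
    try:
        # First row (from the top) containing a nonzero value.
        il = 0
        while not numpy.any(array[il, :]):
            il += 1

        # One past the last row containing a nonzero value.
        iu = array.shape[0]
        while not numpy.any(array[iu - 1, :]):
            iu -= 1

        # First column with a nonzero value, checking only rows il..iu-1.
        jl = 0
        while not numpy.any(array[il:iu, jl]):
            jl += 1

        # One past the last such column.
        ju = array.shape[1]
        while not numpy.any(array[il:iu, ju - 1]):
            ju -= 1

        return array[il:iu, jl:ju].copy()
        # Alternatively, just use return array[il:iu, jl:ju] (i.e., no copy)

    except IndexError:
        # An all-zero array runs il past the last row.
        return None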
Note that the above avoids making a copy until the very end (and alternatively avoids the copy altogether). It also protects against an all-zero array using EAFP ("Easier to Ask for Forgiveness than Permission").
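To illustrate the remark about copies: basic slicing of a NumPy array returns a view that shares memory with the original, while `.copy()` produces independent data. The sketch below checks this with `numpy.shares_memory`; the array contents and the names `view` and `copied` are arbitrary.

```python
import numpy

array = numpy.zeros((5, 5), dtype=int)
array[1:4, 1:4] = 9

view = array[1:4, 1:4]           # basic slicing: a view on the same memory
copied = array[1:4, 1:4].copy()  # independent data

print(numpy.shares_memory(array, view))    # True
print(numpy.shares_memory(array, copied))  # False

# Writing through the view changes the original; the copy is unaffected.
view[0, 0] = -1
print(array[1, 1])   # -1
print(copied[0, 0])  # 9
```

Returning the view avoids the copy entirely, but it keeps the full padded array alive in memory for as long as the view is in use.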
 

1. How can I remove all 0's from a 2d array?

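To strip the zero padding from a 2D NumPy array, find the first and last rows and columns that contain a nonzero value and slice the array to those bounds. Individual zeros inside the data cannot be removed without breaking the rectangular shape; boolean masking such as array[array != 0] returns a flattened 1D array of the nonzero values.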

2. Is there a built-in function for removing 0's from a 2d array?

The thread does not identify a single built-in function that does this for a 2D array. The approaches discussed combine numpy.any or numpy.nonzero with slicing to locate the bounds of the data and extract it.

3. Can I remove only specific 0's from a 2d array?

Yes. For example, a boolean mask built with numpy.any along one axis selects only the rows (or columns) that contain a nonzero value, which removes all-zero rows wherever they occur while leaving the other zeros in place.
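As one concrete case of a selective removal, the sketch below drops only the rows that are entirely zero, wherever they sit, and leaves zeros inside the remaining rows alone. It uses a boolean row mask from `numpy.any`; the sample array and the name `keep` are made up.

```python
import numpy

array = numpy.array([[0, 0, 0],
                     [1, 0, 2],
                     [0, 0, 0],
                     [3, 4, 0]])

keep = numpy.any(array != 0, axis=1)  # True for rows with at least one nonzero value
print(keep)         # [False  True False  True]
print(array[keep])  # rows 1 and 3 of the original
```

Indexing with a boolean mask returns a new array, so the original is left unchanged.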

4. How can I avoid altering the original 2d array while removing 0's?

To avoid altering the original 2d array, you can create a copy of the array and perform the removal operations on the copy instead. This way, the original array remains unchanged.

5. What are some potential challenges when removing 0's from a 2d array?

Some potential challenges when removing 0's from a 2d array include keeping track of the array indices, handling edge cases such as empty arrays or arrays with only 0's, and ensuring the correct elements are removed without altering the structure of the array.
