MATLAB Extract and rearrange data in Matlab

AI Thread Summary
The discussion focuses on rearranging data from an Excel file transformed into a .mat format using MATLAB. The user seeks to sum values from specific columns based on conditions from other columns, calculate means, and handle NaN values appropriately. A sample input matrix is provided, along with the desired output format, which includes sums of animal counts and mean ice cover for different transects. A MATLAB script is shared that processes the data, summing and averaging as required while converting NaN values to zeros for calculations. The user expresses gratitude for assistance and plans to test the provided solution.
geoffrey g
Messages
6
Reaction score
0
Hi all,

I need to rearrange my data. I have an excel file with 12 columns and 2045 rows, which I transformed that into a .mat.

I need to query my data rearranged as follows (cf. example matrix below):

1-sum all values of Col 3 if:
a-they belong to a given value of Col 1 and to a given value in Col 2.
2-mean of values of Col 4 for each given value of Col 1

3-NaN's where conditions above are not met
4-all saved in a new matrix

The difficulty is to make this work for the 2045 rows of my database. I guess it needs a for-loop that reiterates the calculations and at each step saves the values in a matrix.

Here is an example:

input matrix A:

1 NaN NaN 20
1 2 4 10
1 2 3 15
1 NaN NaN 10
1 5 2 25
1 6 1 10
1 2 5 5
2 NaN NaN 5
2 1 3 10
2 1 2 25
3 6 5 15
3 4 3 10
3 3 1 30

This is what I'd like to get:

1 2 3 4 5 6 Mean
1 NaN 12 NaN NaN 2 1 13.57
2 5 NaN NaN NaN NaN NaN 13.33
3 NaN NaN 1 3 NaN 5 18.33

I'd really appreciate your help and will include you in the acknowledgments of the 2 papers I will get out of this.

Thanks a lot!

g
 
Physics news on Phys.org
(hope) these are nicer matrices:

Input:

1 NaN NaN 20
1 2 4 10
1 2 3 15
1 NaN NaN 10
1 5 2 25
1 6 1 10
1 2 5 5
2 NaN NaN 5
2 1 3 10
2 1 2 25
3 6 5 15
3 4 3 10
3 3 1 30

desired output:

1 2 3 4 5 6 mean
1 NaN 12 NaN NaN 2 1 13.57
2 5 NaN NaN NaN NaN NaN 13.33
3 NaN NaN 1 3 NaN 5 18.33
 
Can you be more specific? The example you gave wasn't very clear, but this should be a simple problem to overcome. For ex, the first row in your "desired output" has 6 values, and the rest have 8..
 
Hi Kreil,

Sorry for the bad format...
Let me know if the attachments work...

Here is what my data means in real life:

Input matrix
Rows are individual observations
Col 1 = transect number
Col 2= species code
Col 3= count of animals of corresponding species
Col 4= % ice cover for that specific observation

Ouput:

Col 1= transect number
Col 2 to Col 7 = sum of animal counts of species 1 to species 6
Col 8= mean ice cover for corresponding transect

N=NaN

Thanks!

g
 

Attachments

  • Input matrix.png
    Input matrix.png
    3.6 KB · Views: 665
  • Desired output matrix.png
    Desired output matrix.png
    5.2 KB · Views: 731
Last edited:
I'll write up an m-file that works on that data, but you'll probably need to edit it to make it work with your actual data (since it has more columns and possibly more than 3 transects)
 
thanks so much Kreil!
:-)
g
 
I wrote this up quickly so there are likely ways you can improve it. For the time being, it returns your desired output.
Code:
%sample data matrix, A
A = [1 NaN NaN 20;1 2 4 10;1 2 3 15;1 NaN NaN 10;1 5 2 25;1 6 1 10;1 2 5 5;2 NaN NaN 5;2 1 3 10;2 1 2 25;3 6 5 15;3 4 3 10;3 3 1 30];

%Change NaN's in A to zeros for summing purposes
A(isnan(A))=0;

%Change these vars to match real counts
Transects = 3;
Species = 6;

%Preallocate output
Out = zeros(Transects,8);

for i = 1:Transects
    %First column is just transect number
    Out(i,1) = i;
    %B is the subset of A for current transect
    B = A(A(:,1)==i,:);
    %C is the subset of B for current species
    for k = 1:Species
        C = B(B(:,2)==k,:);
        Out(i,k+1) = sum(C(:,3));
    end
    %Ice cover means
    Out(i,end) = mean(B(:,4));
end

%Convert the zeros back to NaN's
Out(Out==0)=NaN;

Since a numeric matrix in MATLAB cannot have an empty entry or a char like 'mean', this output is identical to your desired output without the first row. You'll just have to remember that column 2 is species 1, column 3 is species 2, etc...
 
fantastic, Kreil! I'll try it out on monday and will let you know!
g
 
Kreil, please drop me a line:
cortoeldemalta at yahoo dot com dot ar
geoffrey
 

Similar threads

Replies
2
Views
1K
Replies
4
Views
2K
Replies
10
Views
3K
Replies
9
Views
4K
Replies
1
Views
2K
Replies
4
Views
1K
Replies
4
Views
1K
Back
Top