Creating a Histogram with Data from a .dat File in MATLAB

AI Thread Summary
The discussion revolves around creating a MATLAB program to read data from a file and generate a histogram. The user is attempting to eliminate the value -9.9 from their dataset, initially using the command R1(R1>-9.9), which does not work as intended because the result is never assigned. The correct approach is to reassign R1 with R1 = R1(R1>-9.9) to filter out the unwanted values. Another suggestion was to replace -9.9 with NaN, but the command as first posted used the wrong condition and gave unexpected results; the corrected form is R1(R1==-9.9)=NaN.

The user is also facing issues with the histogram output, noting discrepancies in the expected values and the appearance of empty bins. The problem is traced back to the way bin edges are defined using the colon operator, which can introduce floating-point inaccuracies. This affects how data points are assigned to bins, particularly those that lie on the edges, such as 0.3.
quasarLie
Hello everyone, I'm trying to write a MATLAB program that reads a .dat file and then plots a histogram.
This is what I did:

Matlab:
Data2=importdata('Ma.DAT');
R1=Data2.data(:,17)
R1(R1>-9.9)
L = 0:0.1:8;
histc(R1,L)
bar(L,histc(R1,L),'histc')
xlabel('R1')
ylabel('counts')
I want to eliminate all the values equal to -9.9 in my column (R1). How can I do that? I tried R1(R1>-9.9), but it doesn't work :p Can anyone help me, please?
Thanks
 
quasarLie said:
I want to eliminate all the values equal to -9.9 in my column (R1). How can I do that? I tried R1(R1>-9.9), but it doesn't work :p Can anyone help me, please?
Thanks

Try

R1 = R1(R1>-9.9) % Assign the new, "fixed" data back to R1

or, if you want to preserve the column indices,

R1(R1>-9.9) = NaN % These shouldn't appear in the histogram
 
Thanks, the first one works, but the second one leaves only the values -9.9 and NaN.
 
quasarLie said:
Thanks, the first one works, but the second one leaves only the values -9.9 and NaN.

Sorry, I meant R1(R1==-9.9)=NaN.
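
For reference, both suggestions can be tried side by side on a small vector. This is only a minimal sketch: the sample values are made up for illustration (they are not the poster's data), the names R1_kept and R1_nan are hypothetical, and it assumes -9.9 is the only placeholder value in the column.

```matlab
% Hypothetical sample column with -9.9 used as a "missing" marker.
R1 = [0.7; -9.9; 0.1; 0.3; -9.9];

% Option 1: keep only the valid entries (the vector gets shorter).
R1_kept = R1(R1 > -9.9);

% Option 2: keep the length and row positions, but mark bad entries as NaN.
R1_nan = R1;
R1_nan(R1_nan == -9.9) = NaN;

disp(R1_kept.')
disp(R1_nan.')
```

Note that the expression R1(R1>-9.9) on its own only displays the filtered values; R1 does not change until the result is assigned to a variable.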
 
Thanks. The code doesn't give me the right x and y values. Do you have any idea why?
 
If you eliminate the R1 values that don't fit your criteria, but not the corresponding L values, then they won't line up.

If you have a picture or a more detailed description, that might help us understand the output you're getting.
 
It gives me this histogram, which is not correct, because I have 240 objects and B1 has values between 0 and 3.8 (in steps of 0.1), and the bin at 0.4 should not be empty.
[Image: histogram output, upload_2018-4-11_14-26-42.png]
 

Are you sure your data have values in the interval [0.4, 0.5)?
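
One quick way to check this is to count the values that fall in the interval directly with a logical mask. A small sketch, using made-up sample values rather than the full data set (the name nInBin is hypothetical):

```matlab
% Hypothetical sample values for illustration.
R1 = [0.7 0.1 0 0.8 0.4 0.4 0.3 0.5];

% Count entries with 0.4 <= R1 < 0.5.
nInBin = nnz(R1 >= 0.4 & R1 < 0.5);
fprintf('%d values in [0.4, 0.5)\n', nInBin);
```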
 
Yes, these are my data
0.70 0.10 0 0.80 0.80 1 0.80 1.2 0.60 2.50 0 0.5 0.70 0.60 0.10 0.60 0.70 0.60 0.10 0.20 0.70 3.1 0.40 0.3
0.500000000000000
0.800000000000000
0.100000000000000
0.200000000000000
1.50000000000000
0.800000000000000
1.60000000000000
0.200000000000000
1
1
0.200000000000000
0.900000000000000
0.300000000000000
0
0.600000000000000
0.400000000000000
0.100000000000000
1.10000000000000
0.600000000000000
1
0.200000000000000
0.200000000000000
0.100000000000000
0.700000000000000
1.40000000000000
1.40000000000000
1.10000000000000
0.400000000000000
0.900000000000000
0.600000000000000
1.80000000000000
0.100000000000000
0.900000000000000
1.10000000000000
0.700000000000000
0.300000000000000
0.400000000000000
2.90000000000000
0.300000000000000
0.300000000000000
0.600000000000000
1.30000000000000
0.600000000000000
1.50000000000000
0.100000000000000
0.100000000000000
0.600000000000000
1.60000000000000
0.200000000000000
1.50000000000000
0.200000000000000
0.600000000000000
0.300000000000000
1.10000000000000
0.500000000000000
0
1
0.400000000000000
0.400000000000000
4
0.200000000000000
0.100000000000000
0.900000000000000
0.700000000000000
0.800000000000000
0.500000000000000
0.300000000000000
0.700000000000000
0.300000000000000
0.900000000000000
1
0.900000000000000
0.600000000000000
0.600000000000000
2.50000000000000
0.600000000000000
0.400000000000000
0.100000000000000
0.800000000000000
2.30000000000000
0.900000000000000
0.200000000000000
0
0.200000000000000
0.500000000000000
3
1.20000000000000
7.50000000000000
1.30000000000000
0.300000000000000
0
1
1.70000000000000
0.400000000000000
1.50000000000000
0.700000000000000
0.200000000000000
0.400000000000000
0.400000000000000
0.100000000000000
0.100000000000000
0.800000000000000
3.70000000000000
3.40000000000000
0.300000000000000
0.100000000000000
0.400000000000000
0
1.20000000000000
0
0.800000000000000
0.100000000000000
0.100000000000000
0.100000000000000
0.300000000000000
0.500000000000000
1.40000000000000
0.400000000000000
0.400000000000000
0.400000000000000
0.100000000000000
1.10000000000000
0.400000000000000
0.400000000000000
0
0.100000000000000
0.300000000000000
0.700000000000000
0.500000000000000
 
  • #10
The unexpected behavior comes from your use of the : operator to make your array of bin edges, in the statement:
Code:
L = 0:0.1:8;

There is a small floating point error in the bin edges that causes data points such as 0.300000000000000, which lie (almost) exactly at the edge between two bins, to be inconsistently included into one (intended) bin as opposed to the next.

To illustrate what is happening, try the following code:
Code:
a = L - 0.3;
a(4)
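
One possible way to see the mismatch across all edges, and one way to avoid it, is to compare the colon-generated edges with edges built from integers. This is a sketch under assumptions, not necessarily the fix the poster had in mind: the name L_exact and the sample values are hypothetical, and histcounts requires MATLAB R2014b or later.

```matlab
L_colon = 0:0.1:8;        % edges as defined in the thread
L_exact = (0:80) / 10;    % each edge is a single correctly rounded division

% Print edge 4 from each construction next to the literal 0.3.
fprintf('%.20g\n%.20g\n%.20g\n', L_colon(4), L_exact(4), 0.3);

% List the edges, if any, where the two constructions differ.
% (min guards against the two vectors having different lengths.)
n = min(numel(L_colon), numel(L_exact));
disp(L_exact(L_colon(1:n) ~= L_exact(1:n)))

% Hypothetical sample values, binned with the integer-based edges.
R1 = [0 0.3 0.3 0.4 0.7];
counts = histcounts(R1, L_exact);   % counts(k) covers [L_exact(k), L_exact(k+1))
```

Assuming the data were parsed from decimal text, k/10 rounds to the same double as the corresponding decimal value in the file, so a value such as 0.3 lands in the bin that starts at 0.3.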
 
  • #11
Sorry, but I really don't see how I can fix it. I tried what you said, but it changes nothing; it gives the same histogram.
 
  • #12
olivermsun said:
Are you sure your data have values in the interval [0.4, 0.5)?
olivermsun said:
The unexpected behavior comes from your use of the : operator to make your array of bin edges, in the statement:
Code:
L = 0:0.1:8;

There is a small floating point error in the bin edges that causes data points such as 0.300000000000000, which lie (almost) exactly at the edge between two bins, to be inconsistently included into one (intended) bin as opposed to the next.

To illustrate what is happening, try the following code:
Code:
a = L - 0.3;
a(4)
What do you mean by "between two bins"? Why does this happen only with 0.3 and 0.7?
 
  • #13
You are using histc() to break up the data into bins, with the border between bins at 0.0, 0.1, 0.2, ..., right?
How does it decide where to put the data point 0.300000000000000?
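
As a hint toward the answer: histc counts x in bin k when edges(k) <= x < edges(k+1), so the outcome depends on whether the stored edge compares as less than or equal to the data value. The sketch below uses made-up values and a deliberately nudged edge (one floating-point step above 0.3) to imitate the effect; it is an illustration with hypothetical names, not the poster's code.

```matlab
% Hypothetical sample values; not the poster's data.
x = [0.2 0.3 0.3 0.4];

exactEdges = (0:5) / 10;              % 0, 0.1, ..., 0.5
nudgedEdges = exactEdges;
nudgedEdges(4) = 0.3 + eps(0.3);      % edge one floating-point step above 0.3

countsExact  = histc(x, exactEdges);  % 0.3 satisfies edges(4) <= x
countsNudged = histc(x, nudgedEdges); % 0.3 is now below edges(4)

disp(countsExact)    % 0 0 1 2 1 0
disp(countsNudged)   % 0 0 3 0 1 0
```

With the nudged edge, both 0.3 values fall into the bin that starts at 0.2, and the bin that was meant to start at 0.3 comes out empty.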
 
