Comparing arrays of data in Fortran

  • Context: Comp Sci 
  • Thread starter: vjramana
  • Tags: Array, Data, Fortran

Discussion Overview

The discussion revolves around comparing two arrays of data in Fortran, specifically focusing on counting occurrences of data points from one array in another and organizing these counts into bins. The participants explore coding strategies, file I/O, and the logic behind the binning process.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant, Vijay, describes a need to compare two data files and count matches into bins but expresses uncertainty about how to implement this in code.
  • Another participant suggests that Vijay should first attempt to write the code, emphasizing the need for understanding file I/O and loops.
  • Vijay shares a code snippet that attempts to implement the desired functionality but receives feedback on its structure and logic, including the use of unnecessary variables and nested loops.
  • Participants discuss the purpose of certain code elements, such as the creation of bin partitions and the clarity of variable usage, with some suggesting simplifications.
  • Vijay clarifies that the bin size is intended to be adjustable and that he wants to count occurrences of data points in a more structured way, asking for further help.
  • Another participant points out that the binPart array is created but not utilized effectively in the code, suggesting a more direct approach to counting occurrences.
  • Vijay provides an illustrative example using coordinates and bins to clarify his intended approach, detailing how he envisions counting points within specified regions.

Areas of Agreement / Disagreement

Participants generally agree on the need for clarity in the code and the logic behind the binning process, but there are multiple competing views on how best to implement the solution and whether certain elements of the code are necessary.

Contextual Notes

There are unresolved issues regarding the clarity of variable usage, the necessity of certain code structures, and the overall logic of counting occurrences in bins. The discussion reflects varying interpretations of the problem and potential solutions.

vjramana
Hi FORTRAN experts,

I have two arrays of data, stored in the files data1.dat and data2.dat. Each contains 60 values. I want to compare the data in the two files and write the counts into bins. It goes like this. First, take the first value in data1.dat and compare it with all 60 values in data2.dat. If any value in data2.dat is the same as the value from data1.dat, it is counted in the first bin. There are also 60 bins in total. Next, go to the second value in data1.dat and compare it with all 60 values in data2.dat. If any value is the same, it is added to the second bin. This repeats for all the data in data1.dat.

I am not very sure how to write this code.

Can anyone help me?

Thank you in advance.

Vijay
 
According to the rules of this forum, you need to make an effort to write the code for this problem before we can give you any help.

To do this problem you need to know about file I/O, arrays, and for loops to iterate through the data in both files.
 
Yes, I have tried writing the code. It is below:


program array
  implicit none
  !
  integer, parameter :: D = 60 ! Number of data
  integer, parameter :: BB=60 ! Number of bins
  integer i, j, k
  integer, dimension(1:D) :: xData, yData
  integer, dimension(1:BB) :: bins
  integer, dimension(0:BB) :: binPart ! Number of partition

  ! open data from input file
  open(unit=40,status="unknown",file="data1.dat")
  open(unit=41,status="unknown",file="data2.dat")
  open(unit=50,status="unknown",file="head-binCount.dat")

  ! read data from input file
  read(40,*) (xData(i), i=1, D)
  read(41,*) (yData(i), i=1, D)

  ! create bin partitions
  do i=0, BB
    binPart(i) = 1 * i
  end do

  ! assign zero value to the bins
  do k=1, BB
    bins(k)=0
  end do

  ! count the data into the bins according to the criteria
  do i=1, D
    do k=1,D
      do j=0, BB
        if (xData(i) == yData(k) )then
          bins(j+1) = bins(j+1) + 1
        end if
        !
      end do
    end do
  end do

  ! write output in a file
  do k=1, BB
    write(50, *) k, binPart(k-1), binPart(k), bins(k)
  end do

end
 
1. Since all your files contain the same number of data, you don't need two different variables - D and BB.

2. What is the purpose of this code?
! create bin partitions
do i=0, BB
binPart(i) = 1 * i
end do
3. Why do you need bin partitions? And why are you assigning the value 1*i? Obviously that's the same as i.

4. You have a triple-nested loop that does the comparison. I would do this with a double-nested loop by iterating through xData array comparing a given entry with each entry in yData, storing the number of hits in the bins array. For example, if xData(2) = 15, and the yData array contains three elements that are 15, bins(2) would be set to 3.

Next, I would do the same thing with the yData array, seeing how many elements in the xData array match a given element in yData, and storing the count in a different array, bins2. Your description of what you needed to do was not very clear.
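The double-nested loop from point 4 could be sketched roughly as follows. This is only an illustration under the assumption that both arrays hold integers and are already in memory; the module name match_count and the subroutine name count_matches are made up here and do not appear in the original program.

```fortran
module match_count
  implicit none
contains
  ! For each xData(i), count how many entries of yData are equal to it.
  ! bins(i) receives that count, so bins has the same length as xData.
  subroutine count_matches(xData, yData, bins)
    integer, intent(in)  :: xData(:), yData(:)
    integer, intent(out) :: bins(size(xData))
    integer :: i, k

    bins = 0                          ! zero every bin in one statement
    do i = 1, size(xData)
      do k = 1, size(yData)
        if (xData(i) == yData(k)) bins(i) = bins(i) + 1
      end do
    end do
  end subroutine count_matches
end module match_count
```

Calling count_matches a second time with the two data arrays swapped would fill the bins2 array mentioned above. The inner loop could also be replaced by the intrinsic count(yData == xData(i)), which does the same comparison in one expression.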
 
Dear Sir,

1) It is true that I can use the same variable for D and BB. I used two only for my own clarity.

2) I used 1 * i since the bin size I used was 1. If the size were 0.5, then I would need to replace the 1 with 0.5. This is also just for my own clarity.

3) What I actually want is to count the number of occurrences of xData(i) and yData(j) and write the counts into bins (call the bin zdata_bin(i,j)). I do not know how to write the code for this task.

Could you kindly help?
Thank you.
Vijay
 
vjramana said:
Dear Sir,

1) It is true that I can use the same variables for D and BB. But I used them just for my clarity purpose.
If you use two variables when only one is needed, that is not making things clearer.
vjramana said:
2) I used 1 * i since the bin size I used was 1. If the size is 0.5 than I need to replace the 1 with 0.5. This also just for my clarity purpose.
You've completely lost me here. How can the bin size be .5? What does bin size mean in the context of your program?
vjramana said:
3) Actually what I want is, I want to count the number of occurrences of xData(i) and yData(j) and write the counting into bins... ( just call the bin as zdata_bin(i,j). ). I do not know how to write the code to do this task.
This is not clear either. It would help me understand what you are trying to do if you came up with a small example, with two small arrays (10 elements or so in each) and what the 3rd array would look like. No code.
 
You create an integer array called binPart, apparently to be used for indexing into bins, but never use it. This is a programming issue, not a language (Fortran) issue. It appears that the changes you need to make are the ones shown below. There's no need to use the "j" index variable.

...
integer, dimension(0:D) :: binPart ! needs to be large enough to translate all possible index values
...
do i=0, D
binPart(i) = 1 * i
...
bins(binPart(i)) = bins(binPart(i)) + 1
...
write(50, *) k, bins(k)
...
 
Dear sir,

Let me explain it like this. Imagine we have x, y and z coordinates.
Let's say I have 10 values along the x-axis and another 10 values along the y-axis. The coordinates (x,y) may look like this:

(1,4), (4,6), (2,6), (1,4), (2,6), (8,0), (1,4), (2,6), (1,4), (8,0)

Additionally, I put 5 bins, each two units wide, along the x-axis and the y-axis. This could be imagined as

x                      y
binx1 = (0 to 2)       biny1 = (0 to 2)
binx2 = (2 to 4)       biny2 = (2 to 4)
binx3 = (4 to 6)       biny3 = (4 to 6)
binx4 = (6 to 8)       biny4 = (6 to 8)
binx5 = (8 to 10)      biny5 = (8 to 10)

Now if we see,
in the region of binx1 and biny2 there are 4 points.
In the region of binx1 and biny3 there are 3 points.
In the region of binx2 and biny3 there is only 1 point
and
in the region of binx4 and biny1 there are 2 points.

The total points in the regions are equal to the total (x,y) points.

My z-axis (which I imagine as normal to the x-y plane) would represent the number of points present in each region (like 4, 3, 1 and 2 in the example).

This is what I am trying to do, so that I can plot a contour graph later using binx, biny and binz.

I hope this explanation gives you a better picture of the problem.

Thank you
sir

Regards
Vijay
 
You might not have noticed that with five pairs of bins binx1, binx2, ..., binx5, and biny1, biny2, ..., biny5, a point could go into any one of 25 bins. The number of bins needed would depend on the range of data in the two dimensions, and the width of each bin. In your small example, the range for x and y was 0 through 10, and you chose a bin width of 2.

And if you have 60 data points, and a bin width of 1 (as you had in your original code), you might need up to 3600 separate bins. Again, the number of bins would depend on the range of the data, and the number of subintervals along each axis.

When you plot the values in pairs of bins, what you're getting is a frequency histogram, not a contour plot.
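To make the dependence on range and bin width concrete, here is a small sketch. The function name bins_needed and the module name bin_count are invented for this illustration, and it assumes integer data on a half-open range [lo, hi) with hi > lo and a positive width.

```fortran
module bin_count
  implicit none
contains
  ! Number of bins of width w needed to cover the half-open range [lo, hi).
  ! Integer arithmetic only; assumes hi > lo and w > 0.
  integer function bins_needed(lo, hi, w)
    integer, intent(in) :: lo, hi, w
    bins_needed = (hi - lo + w - 1) / w      ! ceiling of (hi - lo) / w
  end function bins_needed
end module bin_count
```

For the small example, bins_needed(0, 10, 2) is 5 along each axis, so the grid has 5 * 5 = 25 cells. If the data happened to span 0 to 60 with a width of 1, each axis would need 60 bins, giving the 3600 cells mentioned above.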
 
  • #10
Dear Sir,

Thanks for your explanation.
You are correct, sir. What I actually want is the frequency histogram.
So how should the code be written to get this?
I need help with this.
I appreciate your help in advance.

In the process, I have rewritten the code. It is below:

**************************************************************************************
program dummy
  implicit none
  !
  integer :: i,j,k,l
  integer,parameter :: noData=30
  integer,parameter :: noBins=5
  integer,parameter :: binSize=2
  integer,dimension(1:30) :: xdata, ydata
  integer,dimension(0:noBins) :: xBINS,yBINS, xbinPart, ybinPart
  !integer,dimension(0:noBins, 0:noBins) :: xyBins, xyBinPart

  ! OPEN FILES
  open(unit=50,status="unknown",file="xyData.dat",form="formatted")
  open(unit=51,status="old",file="xDATA.dat",form="formatted")
  open(unit=52,status="old",file="yDATA.dat",form="formatted")

  ! READ DATA
  DO i=1,noData
    READ(51,*) xdata(i)
    READ(52,*) ydata(i)
  END DO

  ! PARTITION FOR BIN SIZE
  do i=0, noBins
    xbinPart(i) = binSize * (i)
    ybinPart(i) = binSize * (i)
  end do

  ! ASSIGN ZERO VALUES IN EACH BIN

  do i=0, noBins
    xBINS(i) = 0
    yBINS(i) = 0
  end do

  ! CLASSIFY THE DATA
  do k=1,noData
    do l=0, noBins
      !if((xdata(k).ge.xbinPart(l).and.xdata(k).lt.xbinPart(l+1)).and.(ydata(k).ge.ybinPart(l).and.ydata(k).lt.ybinPart(l+1)))then
      !if((xdata(k).ge.xbinPart(l).and.ydata(k).ge.ybinPart(l)).and.(xdata(k).lt.xbinPart(l+1).and.ydata(k).lt.ybinPart(l+1)))then
      !if(xdata(k).ge.xbinPart(l).and.(xdata(k).lt.xbinPart(l+1)))then
      !xBINS(l) = xBINS(l) + 1
      !yBINS(l) = yBINS(l) + 1

      if (ydata(k).ge.ybinPart(l).and.(ydata(k).lt.ybinPart(l+1)))then
        yBINS(l) = yBINS(l) + 1
      end if
    end do
  end do

  ! PRINT OUT
  do i = 0, noBins
    !print*,"binNo",i, " ", xBINS(i), yBINS(i)
    print*,"binNo",i," ",yBINS(i)
  end do

end program dummy
******************************************************************************

Vijay
 
  • #11
This is the section that probably needs work:
Code:
! CLASSIFY THE DATA
do k=1,noData
do l=0, noBins
!if((xdata(k).ge.xbinPart(l).and.xdata(k).lt.xbinP art(l+1)).and.(ydata(k).ge.ybinPart(l).and.ydata(k ).lt.ybinPart(l+1)))then
!if((xdata(k).ge.xbinPart(l).and.ydata(k).ge.ybinP art(l)).and.(xdata(k).lt.xbinPart(l+1).and.ydata(k ).lt.ybinPart(l+1)))then
!if(xdata(k).ge.xbinPart(l).and.(xdata(k).lt.xbin Part(l+1)))then
!xBINS(l) = xBINS(l) + 1
!yBINS(l) = yBINS(l) + 1

if (ydata(k).ge.ybinPart(l).and.(ydata(k).lt.ybinPart (l+1)))then
yBINS(l) = yBINS(l) + 1
end if
end do
end do
This is very difficult to read, because there are no spaces except for those that are syntax errors (e.g. xbin Part, which should be xbinPart).

Also, instead of using an if statement nested four levels deep, a logic structure using if ... then ... else if ... then ... else if... then ... end if would be easier to understand.

Now that I understand better what you're trying to do, it seems to me that your x and y bins are not necessary. All you need is a two-dimensional bin, like what you have for xyBins.

For this to work your program needs to know the ranges of x and y values. In the sample points you supplied earlier, all the x values were between 1 and 8, inclusive, and the y values were between 0 and 6. If you assume that all values are between 0 and 10, and that the bin size is 2, you can test the x value to determine the first coordinate of the bin it should go in, and test the y value for the second coordinate of the bin it should go in.

For example with bins 0 - 2, 2 - 4, 4 - 6, 6 - 8, and 8 - 10, the point (8, 0) would go in xyBins(4, 0). I am counting bin numbers from zero; i.e., bin 0 is 0 - 2, and bin 4 is 8 - 10.
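A rough sketch of that two-dimensional approach is given here. It is not a finished solution: it assumes non-negative integer data, square bins of equal width on both axes, and bins numbered from zero as described above. The names hist2d and fill_bins are invented for the sketch; xyBins, binSize and noBins follow the names already used in the thread.

```fortran
module hist2d
  implicit none
contains
  ! Count (x, y) points into a 2-D grid of square bins.
  ! Bin l covers [l*binSize, (l+1)*binSize), numbered from zero.
  ! Assumes non-negative integer data; points that fall outside the
  ! grid are skipped rather than counted.
  subroutine fill_bins(x, y, binSize, noBins, xyBins)
    integer, intent(in)  :: x(:), y(:)
    integer, intent(in)  :: binSize, noBins
    integer, intent(out) :: xyBins(0:noBins-1, 0:noBins-1)
    integer :: k, ix, iy

    xyBins = 0
    do k = 1, min(size(x), size(y))
      if (x(k) < 0 .or. y(k) < 0) cycle
      ix = x(k) / binSize            ! integer division gives the bin index
      iy = y(k) / binSize
      if (ix >= noBins .or. iy >= noBins) cycle
      xyBins(ix, iy) = xyBins(ix, iy) + 1
    end do
  end subroutine fill_bins
end module hist2d
```

With the ten sample points, binSize = 2 and noBins = 5, the two (8, 0) points land in xyBins(4, 0) and the four (1, 4) points land in xyBins(0, 2). A value exactly on a bin edge goes to the upper bin under this convention, whereas the hand tally earlier in the thread appears to place edge values in the lower bin, so the two will not agree cell for cell.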
 
  • #12
The actual section is below:

! CLASSIFY THE DATA
do k=1,noData
do l=0, noBins
if (ydata(k).ge.ybinPart(l).and.(ydata(k).lt.ybinPart (l+1)))then
yBINS(l) = yBINS(l) + 1
end if
end do
end do
 
  • #13
I understand that there are a lot of lines commented out in the code, but take a closer look at what I wrote in post #11.
 
