Can Fortran handle cross-referencing on a large array of data?

  • Thread starter: zakynthos
  • Tags: Fortran
SUMMARY

The discussion centers on whether Fortran could be used to cross-reference a large dataset: a four-column array with one million rows. The original VBA code was inefficient, with an estimated runtime of about 12.93 hours on a typical office PC. A reply suggested sorting the table before matching, and after reworking the task with VLOOKUP the poster reported a runtime of about one second. The poster still asks how the original code would translate into Fortran, with Fujitsu's K supercomputer in mind.

PREREQUISITES
  • Understanding of Fortran programming language
  • Familiarity with array data structures
  • Knowledge of sorting algorithms
  • Experience with relational database concepts
NEXT STEPS
  • Research Fortran array manipulation techniques
  • Learn about optimizing algorithms for large datasets
  • Explore sorting algorithms suitable for Fortran
  • Investigate performance benchmarks on Fujitsu's K supercomputer
USEFUL FOR

Programmers transitioning from VBA to Fortran, data scientists handling large datasets, and anyone interested in optimizing cross-referencing algorithms for high-performance computing environments.

zakynthos
Hi, I have no knowledge of Fortran, but I am researching its (theoretical) use to perform a 'cross-referencing' function on data in a four-column array of one million rows. For each row, the program looks up the column C value in column B and writes the column A value from the matching row into column D.

Here's the program I'm using. Could anyone convert this to Fortran, and could it, in practice, run on Fujitsu's K computer (roughly 10 petaflops)? If so, how long would the K take to perform this task? I've calculated 12.93 hours from tests on a typical office PC. With thanks.

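Sub AddCodes()

    Dim x As Long, LastRow As Long, UB As Long, BList As Variant

    Const StartRow As Long = 1
    LastRow = Cells(Rows.Count, "A").End(xlUp).Row

    ' Join all of column B into a single "/"-delimited string
    BList = Join(WorksheetFunction.Transpose(Cells(StartRow, "B").Resize(LastRow - StartRow + 1)), "/")
    ' Application.ScreenUpdating = False

    For x = StartRow To LastRow
        ' Take the text before the first occurrence of the column C value and
        ' count its "/"-separated pieces; UB is the zero-based row offset of the match.
        ' Note: this is a substring match, and the whole string is rescanned for every row.
        UB = UBound(Split(Split(BList, Cells(x, "C").Value)(0), "/"))
        If UB >= 0 Then
            With Cells(x, "D")
                ' Append the column A value from the matching row, comma-separated
                .Value = .Value & "," & Range("A1").Offset(UB).Value
                If Left(.Value, 1) = "," Then .Value = Mid(.Value, 2)
                .Interior.ColorIndex = 6
                .Font.Bold = True
            End With
        End If
    Next

    Application.ScreenUpdating = True

End Sub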
 
If it takes 12 hours on a PC, your algorithm is hopelessly inefficient.

If you first sorted the table on the relevant columns and then did the matching, I would expect the run time to be of the order of 1 second not 12 hours.
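
To make the sort-then-match idea concrete, here is a minimal sketch in Python (not Fortran, and not the poster's code). The function name `cross_reference` and the sample data are made up for illustration, and it assumes exact matches on comparable keys. The macro above rescans the joined column B string for every row, so its work grows roughly with the square of the row count. Sorting once and binary-searching each lookup needs on the order of n log n comparisons instead: for n = 1,000,000 that is about 2 × 10^7 rather than 10^12.

```python
from bisect import bisect_left


def cross_reference(col_a, col_b, col_c):
    """For each value in col_c, return the col_a entry from the first row
    whose col_b value equals it, or None when there is no match."""
    # Sort the row indices by their column B key once: O(n log n).
    # sorted() is stable, so equal keys keep their original row order.
    order = sorted(range(len(col_b)), key=lambda i: col_b[i])
    sorted_keys = [col_b[i] for i in order]

    result = []
    for key in col_c:
        # Binary search: O(log n) per lookup instead of a full scan.
        pos = bisect_left(sorted_keys, key)
        if pos < len(sorted_keys) and sorted_keys[pos] == key:
            result.append(col_a[order[pos]])
        else:
            result.append(None)
    return result


# Hypothetical sample data standing in for columns A, B and C.
col_a = ["alpha", "beta", "gamma", "delta"]
col_b = [30, 10, 40, 20]
col_c = [10, 40, 99, 30]
col_d = cross_reference(col_a, col_b, col_c)
print(col_d)  # ['beta', 'gamma', None, 'alpha']
```

The same structure (sort an index array once, then binary-search per row) should carry over to Fortran or any other compiled language; only the sorting and searching routines would need to be supplied.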

Relational database operations are a good way to describe what you want to do, but they are often a bad way to actually do it.
 
Many thanks for your answer. Your advice made me rethink the code, and I've processed the array with VLOOKUP instead. You're right, of course: it takes about a second!

However, I'm still interested to know how my original code would translate into Fortran.

Thanks once again!
 
