Reading in ASCII Data in IDL with Format Codes

polystethylene · Feb 12, 2009

Hi all,

I have a tab-delimited ASCII file (output from IRAF). I need to read it into IDL, but it has a timestamp column (in UT, hh:mm:ss). I assume I need to read it in with format codes, but I'm not familiar with Fortran or C (or IDL, to be honest), and it seems the format code system is borrowed from both Fortran and C.

The first few lines look like this:

00:05:54 1.741252 1 788.818 38.071 22525.62 19.316 0.009
00:05:54 1.741252 2 434.052 49.973 5557.795 20.836 0.032
00:05:54 1.741252 3 461.841 66.111 6695.824 20.633 0.026
00:05:54 1.741252 4 390.721 105.630 4991.148 20.952 0.035
00:05:54 1.741252 5 61.415 119.133 5739.358 20.801 0.030

Another problem I can see is that not every value in each column is of exactly the same format (e.g. differing numbers of characters before the decimal point). The column of integers also goes from single figure to double figure.

So far I've tried this:

PRO READONELINE
infile = 'LCgen/20090127/fbexp062_2.dump'
OPENR,lun,infile,/GET_LUN
data = fltarr(10)
READF,lun,format = '(I0,":",I0,":",I0,F5,Q)',data
print,data
close,lun
END

Can anyone help me?

signerror · Feb 12, 2009

You want regular expressions. I don't know what language you're using, but you definitely want to look up its documentation and see how it implements regexps.

Your timestamp matches /\d{2}:\d{2}:\d{2}/, and your data fields match /\d+(\.\d+)?/.
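As a rough illustration of the two patterns above, here is a minimal Python sketch applied to the first sample line from the original post. The names `TIMESTAMP` and `NUMBER` are made up for this example, and the optional group is written as non-capturing, `(?:...)`, so that `findall` returns whole fields rather than just the fractional part.

```python
import re

# Patterns adapted from the post above; the names are illustrative only.
TIMESTAMP = re.compile(r"\d{2}:\d{2}:\d{2}")
NUMBER = re.compile(r"\d+(?:\.\d+)?")

line = "00:05:54 1.741252 1 788.818 38.071 22525.62 19.316 0.009"

# The timestamp is expected at the very start of the line.
stamp = TIMESTAMP.match(line)
timestamp = stamp.group(0)

# Look for numeric fields only in the text after the timestamp;
# otherwise the hh, mm and ss digits would be matched as numbers too.
fields = NUMBER.findall(line[stamp.end():])

print(timestamp)
print(fields)
```

The matched fields are still strings at this point; converting them to numbers is a separate step.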

Or you could go for a low-tech solution, splitting data fields along spaces. In most languages there is a string 'split' method to do this:

Code:

>>> "00:05:54 1.741252 1 788.818 38.071 22525.62 19.316 0.009".split(" ")
['00:05:54', '1.741252', '1', '788.818', '38.071', '22525.62', '19.316', '0.009']

Which you can quickly replicate in any low-level language, I think.
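Building on the split idea, a small sketch of turning one row into numbers might look like the following. The function name `parse_line` and the choice to express the timestamp as seconds since midnight are assumptions made for illustration, not something specified in the thread. Calling `split()` with no argument splits on any run of whitespace, so it should cope with tabs as well as spaces.

```python
def parse_line(line):
    """Split one whitespace-delimited row into a time and a list of floats."""
    parts = line.split()  # no argument: splits on any run of spaces or tabs

    # First field is hh:mm:ss; convert it to seconds since midnight.
    hh, mm, ss = (int(p) for p in parts[0].split(":"))
    seconds = hh * 3600 + mm * 60 + ss

    # Every remaining field is parsed as a floating-point number.
    values = [float(p) for p in parts[1:]]
    return seconds, values


seconds, values = parse_line(
    "00:05:54\t1.741252\t1\t788.818\t38.071\t22525.62\t19.316\t0.009"
)
print(seconds)
print(values)
```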

polystethylene · Feb 12, 2009

Hi signerror, thanks for the help. I've never come across regular expressions before. Are you suggesting them as a solution because they would overcome the issue of needing different format codes for each row (to account for the changing number of characters)?

signerror · Feb 12, 2009

If you use C, use fscanf from <stdio.h>. I don't know what the Fortran equivalent is.

http://en.wikipedia.org/wiki/Scanf

The variable string lengths of the numeric representations should not make a difference in any reading function. The different sizes don't matter: they are all parsed equally as floating-point numbers.

I take back what I said: because you are working in C and Fortran, you do not have native support for regular expressions.
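The point about field widths can be checked with values taken from the sample data in the first post, where the fourth column holds both 788.818 and 61.415 and the fifth holds both 38.071 and 105.630. This is only a Python sketch of the general idea (the list name `column` is illustrative); it does not show how fscanf or IDL behave.

```python
# Fields of different widths, copied from the sample rows in the first post.
column = ["788.818", "61.415", "38.071", "105.630"]

# The same conversion handles every width; no per-row format is needed.
parsed = [float(text) for text in column]
print(parsed)
```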

polystethylene · Feb 12, 2009

I'm actually working in IDL, which apparently has many similarities to C and Fortran...

I currently have this:

PRO READONELINE
infile = 'LCgen/20090127/fbexp062_2.dump'
OPENR,lun,infile,/GET_LUN
rows = FILE_LINES(infile)

data = fltarr(10,rows)
WHILE NOT EOF(lun) DO BEGIN
READF,lun,format = '(I2,1x,I2,1x,I2,2x,F0,2x,I0,5F0,/)',data
ENDWHILE
close,lun
free_lun,lun
print,data
END

Which is reading the following data:

01:34:48 1.398525 1 782.147 31.966 22948.04 19.296 0.008
01:34:48 1.398525 2 427.381 43.868 6471.269 20.670 0.025
01:34:48 1.398525 3 455.170 60.006 6762.172 20.623 0.024
01:34:48 1.398525 4 384.050 99.521 5200.847 20.908 0.031
01:34:48 1.398525 5 54.744 113.028 5752.683 20.798 0.029

in as follows (output of a print,data command):

1.00000 34.0000 48.0000 1.39852 1.00000 782.147
31.9660 22948.0 19.2960 0.00800000
1.00000 34.0000 48.0000 1.39852 3.00000 455.170
60.0060 6762.17 20.6230 0.0240000
1.00000 34.0000 48.0000 1.39852 5.00000 54.7440
113.028 5752.68 20.7980 0.0290000

(This is a different file from the one above, but the data format is the same.) I can't stop the read from apparently running over onto a new line, so it only picks up alternate data lines. Hence my WHILE NOT EOF loop, which is meant to run over all the rows in the file, is always cut short with:

% Procedure was compiled while active: READONELINE. Returning.
% Compiled module: READONELINE.
% READF: End of file encountered. Unit: 107, File: LCgen/20090127/fbexp062_2.dump
% Execution halted at: READONELINE 8 /Users/stefan/IDLWorkspace/LCgen/20090127/readoneline.pro
% $MAIN$

I wish I had taken any sort of programming course during my undergrad days =(
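For comparison, here is a hedged Python sketch (not IDL) of reading every row into ten values, matching the layout of `fltarr(10,rows)` above: hours, minutes and seconds followed by the seven numeric columns. The function name `read_table` is hypothetical, and an in-memory `io.StringIO` stands in for the real file. Each pass of the loop consumes exactly one line, so no row is skipped and no separate end-of-file test is needed. In Fortran-style format strings a `/` moves on to the next record, so the trailing `/` in the format above is a likely cause of the alternating rows; that is a guess from the output shown, not something confirmed in the thread.

```python
import io


def read_table(handle):
    """Read every non-blank row from an open text file into a list of rows."""
    rows = []
    for line in handle:  # one iteration per line of the file
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Split hh:mm:ss into three numbers, then append the other columns.
        hh, mm, ss = (float(p) for p in parts[0].split(":"))
        rows.append([hh, mm, ss] + [float(p) for p in parts[1:]])
    return rows


# Stand-in for the real file, using rows quoted in the post above.
sample = io.StringIO(
    "01:34:48 1.398525 1 782.147 31.966 22948.04 19.296 0.008\n"
    "01:34:48 1.398525 2 427.381 43.868 6471.269 20.670 0.025\n"
    "01:34:48 1.398525 3 455.170 60.006 6762.172 20.623 0.024\n"
)
table = read_table(sample)
print(len(table), len(table[0]))
```

With a real file, the same function could be called as `read_table(open(path))`, where `path` is whatever file name applies.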
