Reading in ASCII Data in IDL with Format Codes

AI Thread Summary
The discussion revolves around reading a tab-delimited ASCII file containing timestamp data into IDL. The user faces challenges due to the varying formats of the data, particularly in the timestamp and integer columns. They initially attempt to use a specific format code in IDL's READF function but encounter issues with inconsistent data lengths. Suggestions include using regular expressions for parsing, although the user is unfamiliar with them. A low-tech solution is proposed, utilizing string splitting methods to handle the varying formats. The user later shares an updated version of their code, which reads the data but still struggles with formatting issues, particularly with line breaks causing data misalignment. The conversation highlights the complexities of data parsing in IDL and the need for effective handling of variable-length data fields.
polystethylene
Hi all,

I have a tab-delimited ASCII file (an output from IRAF). I need to read it into IDL, but it features a timestamp column (in UT - hh:mm:ss). I assume I need to read it in with some format codes, but I'm not familiar with Fortran or C (or IDL, to be honest), and it seems the format code system is borrowed from both Fortran and C.

The first few lines look like this:

00:05:54 1.741252 1 788.818 38.071 22525.62 19.316 0.009
00:05:54 1.741252 2 434.052 49.973 5557.795 20.836 0.032
00:05:54 1.741252 3 461.841 66.111 6695.824 20.633 0.026
00:05:54 1.741252 4 390.721 105.630 4991.148 20.952 0.035
00:05:54 1.741252 5 61.415 119.133 5739.358 20.801 0.030

Another problem I can see is that not every value in each column is of exactly the same format (e.g. differing numbers of characters before the decimal point). The column of integers also goes from single figure to double figure.

So far I've tried this:

PRO READONELINE
infile = 'LCgen/20090127/fbexp062_2.dump'
OPENR,lun,infile,/GET_LUN
data = fltarr(10)
READF,lun,format = '(I0,":",I0,":",I0,F5,Q)',data
print,data
close,lun
END

Can anyone help me?
 
You want regular expressions. I don't know what language you're using, but you definitely want to look up its documentation and see how it implements regexps.

Your timestamp matches /\d{2}:\d{2}:\d{2}/, and your data fields match /\d+(\.\d+)?/.
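For readers working in IDL, the timestamp pattern above can be sketched with STREGEX. This is a minimal illustration, not a confirmed solution from the thread: as far as I know STREGEX follows POSIX-style syntax, so the sketch assumes `\d` is unavailable and writes `[0-9]` instead. The variable names are made up for the example.

```idl
; One line of the file, as a string
line = '00:05:54 1.741252 1 788.818 38.071 22525.62 19.316 0.009'

; Capture hours, minutes and seconds at the start of the line.
; /EXTRACT returns the matched text; /SUBEXPR adds one element
; per parenthesised group after the whole match.
hms = STREGEX(line, '^([0-9]{2}):([0-9]{2}):([0-9]{2})', /EXTRACT, /SUBEXPR)

; hms[0] is the empty string when nothing matched
IF hms[0] NE '' THEN PRINT, FIX(hms[1:3])
```

The same call with a different pattern could pick out the numeric fields, although splitting the line is usually simpler for whitespace-separated columns.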

Or you could go for a low-tech solution, splitting data fields along spaces. In most languages there is a string 'split' method to do this:

Code:
>>> "00:05:54 1.741252 1 788.818 38.071 22525.62 19.316 0.009".split(" ")
['00:05:54', '1.741252', '1', '788.818', '38.071', '22525.62', '19.316', '0.009']

Which you can quickly replicate in any low level language, I think.
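In IDL, a comparable split can be sketched with STRSPLIT and its /EXTRACT keyword. With no pattern argument it breaks the string on spaces and tabs, which suits a tab-delimited file. The variable names below are illustrative only.

```idl
; One line of the file, as a string
line = '00:05:54 1.741252 1 788.818 38.071 22525.62 19.316 0.009'

; With no pattern argument, STRSPLIT breaks on spaces and tabs;
; /EXTRACT returns the substrings rather than their offsets
fields = STRSPLIT(line, /EXTRACT)

time_str = fields[0]            ; timestamp, still a string
values   = DOUBLE(fields[1:*])  ; remaining seven columns as numbers

PRINT, time_str
PRINT, values
```

Because the fields are separated before conversion, the number of digits in each column does not matter.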
 
Hi signerror, thanks for the help. Never come across regular expressions before, are you suggesting them as a solution because they will be able to overcome the issue of needing different format codes for each row (to account for the changing number of characters?)
 
If you use C, use fscanf from <stdio.h>. I don't know what the Fortran equivalent is.

http://en.wikipedia.org/wiki/Scanf

The variable string lengths of the numeric representations should not make a difference in any reading function. The different sizes don't matter: they are all parsed equally as floating-point numbers.

I take back what I said: because you are working in C and Fortran, you do not have native support for regular expressions.
 
I'm actually working in IDL, which apparently has many similarities to C and Fortran...

I currently have this:

PRO READONELINE
infile = 'LCgen/20090127/fbexp062_2.dump'
OPENR,lun,infile,/GET_LUN
rows = FILE_LINES(infile)

data = fltarr(10,rows)
WHILE NOT EOF(lun) DO BEGIN
READF,lun,format = '(I2,1x,I2,1x,I2,2x,F0,2x,I0,5F0,/)',data
ENDWHILE
close,lun
free_lun,lun
print,data
END

Which is reading the following data:

01:34:48 1.398525 1 782.147 31.966 22948.04 19.296 0.008
01:34:48 1.398525 2 427.381 43.868 6471.269 20.670 0.025
01:34:48 1.398525 3 455.170 60.006 6762.172 20.623 0.024
01:34:48 1.398525 4 384.050 99.521 5200.847 20.908 0.031
01:34:48 1.398525 5 54.744 113.028 5752.683 20.798 0.029

in like this (after a print,data command):

1.00000 34.0000 48.0000 1.39852 1.00000 782.147
31.9660 22948.0 19.2960 0.00800000
1.00000 34.0000 48.0000 1.39852 3.00000 455.170
60.0060 6762.17 20.6230 0.0240000
1.00000 34.0000 48.0000 1.39852 5.00000 54.7440
113.028 5752.68 20.7980 0.0290000

(different file from that above, but same format of data) - I can't stop it from what appears to be running over into a new line (and thus also alternating between data lines)... Hence my WHILE NOT loop, whose length is based on the number of rows in the file, is always cut short with:

% Procedure was compiled while active: READONELINE. Returning.
% Compiled module: READONELINE.
% READF: End of file encountered. Unit: 107, File: LCgen/20090127/fbexp062_2.dump
% Execution halted at: READONELINE 8 /Users/stefan/IDLWorkspace/LCgen/20090127/readoneline.pro
% $MAIN$
I wish I had taken any sort of programming course during my undergrad days =(
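
A possible explanation for the alternating lines, offered as a reading of the code rather than a confirmed diagnosis: `data` is the full 10 x rows array, so a single READF tries to fill all of it, reusing the format for each group of ten values, and the trailing `/` in the format skips an extra record each time. That would consume two lines per row and hit end of file about halfway through. A minimal sketch that avoids format codes altogether reads each line into a string and splits it. The procedure name READ_DUMP_SKETCH and the ten-field check are assumptions for illustration, and the sketch assumes a non-empty file.

```idl
PRO READ_DUMP_SKETCH, infile, data
  ; Hypothetical helper: fills a 10 x rows array with
  ; hh, mm, ss followed by the seven numeric columns.
  rows  = FILE_LINES(infile)
  data  = DBLARR(10, rows)
  line  = ''                     ; a string variable makes READF take a whole line
  nread = 0L

  OPENR, lun, infile, /GET_LUN
  WHILE NOT EOF(lun) DO BEGIN
    READF, lun, line
    ; Split on spaces, tabs and the colons inside the timestamp
    fields = STRSPLIT(line, ' :' + STRING(9B), /EXTRACT)
    ; Skip blank or malformed lines instead of failing on them
    IF N_ELEMENTS(fields) EQ 10 THEN BEGIN
      data[*, nread] = DOUBLE(fields)
      nread = nread + 1
    ENDIF
  ENDWHILE
  FREE_LUN, lun                  ; closes the file and releases the unit

  ; Drop unused rows if any lines were skipped
  IF nread GT 0 AND nread LT rows THEN data = data[*, 0:nread-1]
END
```

It could be called as `READ_DUMP_SKETCH, 'LCgen/20090127/fbexp062_2.dump', data`, after which `PRINT, data` should show one row of ten values per line of the file.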
 