Can't read strange mixed format text file in Matlab 
#1
Jun3-11, 05:41 PM

P: 28

1. The problem statement, all variables and given/known data
I have to read a text file with a fixed format. The first column is the year, the 2nd is the month and the 3rd is the day. The file has the following format: 4d 2d 2d f9.4 f9.4 f9.4 f9.4 4d 4d 4d 4d 4d 4d
3. The attempt at a solution
I tried to use textscan and fscanf, but neither worked. The program always had a problem with the first 3 columns.
madtraveller


#2
Jun3-11, 06:30 PM

Sci Advisor
HW Helper
PF Gold
P: 3,169

What's going on with rows 3 and 7? Instead of a 3-digit number in the second column like all the others, these have two 1-digit numbers.
So the number of elements on each line is not consistent. Also, why are you using 3 fscanf statements? 


#3
Jun3-11, 06:31 PM

Sci Advisor
HW Helper
PF Gold
P: 3,169

P.S. You don't need to specify the number of digits when reading, e.g. you can use %d instead of %4d.



#4
Jun3-11, 06:38 PM

P: 28

jbunniii, thank you for your answer.
As I stated, the first column is the year, the 2nd is the month and the 3rd is the day. So:
The 1st row is 2000 Feb 8th
The 2nd row is 2000 Feb 26th
The 3rd row is 2000 March 5th
...
The 7th row is 2000 April 7th
So the format is fixed: 4d 2d 2d. That's the nasty thing about this file format :)
I tried 3 fscanf statements to see which one worked. Apparently none of them did.
madtraveller


#5
Jun3-11, 06:49 PM

Sci Advisor
HW Helper
PF Gold
P: 3,169

OK, thanks for the clarification.
Try the following:
data = fscanf(fid, '%d %1d%2d %f %f %f %f %d %d %d %d %d %d');


#6
Jun3-11, 06:54 PM

Sci Advisor
HW Helper
PF Gold
P: 3,169

P.S. That will read all the data into a single column vector. If you would rather read it into a matrix with the dimensions of the original file, try this:
data = fscanf(fid, '%d %1d%2d %f %f %f %f %d %d %d %d %d %d', [13,inf]);
This will swap your rows and columns, so follow it with
data = data';
to get the same dimensions as the file.
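For anyone who wants to try the matrix form without the original file, here is a minimal sketch. The two rows are made up to match the layout described in the thread, and sscanf on a string stands in for fscanf on a file handle; the format string is the one suggested above.

```matlab
% Two made-up rows (year, month, day, four reals, six integers).
% They only imitate the layout discussed in the thread.
txt = sprintf('%s\n', ...
    '2000 2 8   1.5000   2.5000   3.5000   4.5000   1   2   3   4   5   6', ...
    '2000 226   0.2500   0.5000   0.7500   1.0000   7   8   9  10  11  12');

fmt = '%d %1d%2d %f %f %f %f %d %d %d %d %d %d';

% Values fill the 13-row matrix column by column, so each line of text
% ends up as one column.
data = sscanf(txt, fmt, [13, inf]);

% Transpose so that each row of data corresponds to one line of text.
data = data';

disp(size(data));
disp(data(:, 1:3));   % year, month, day
```

With a real file, the same format and size arguments would go to fscanf(fid, ...) after fopen. Note that this format assumes a single-digit month, which is true for the made-up rows here.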


#7
Jun3-11, 07:10 PM

P: 28

Thank you so much jbunniii. It works perfectly now.
Have a great weekend
madtraveller


#8
Jun10-11, 02:06 AM

P: 28

I've just realized that the code suggested above doesn't work correctly with a longer data set like this one. Again, the first column is Year (4d format), the 2nd column is Month (2d) and the 3rd column is Day (2d).
Could anyone help me with a better solution? Thx
madtraveller


#9
Jun10-11, 02:37 AM

Sci Advisor
P: 1,724

Your columns (year, month, and date) are bleeding into each other. I'd suggest either better delimiting (e.g. using tabs or forcing a fixed number of spaces between columns) in whatever you're using to generate these values, or going in and manually or automatically increasing the delimiting between them.
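A small sketch of what "bleeding" looks like to the scanf-style readers. The two date strings are made up to follow the 4d 2d 2d layout and are not taken from the file in question.

```matlab
% Made-up date fields in the 4d 2d 2d layout.
packed = '19481012';   % 1948, month 10, day 12: no spaces left between fields
padded = '1948 110';   % 1948, month 1, day 10: the month is padded with a space

a = sscanf(packed, '%d');          % no widths: the digits are read as one number
b = sscanf(packed, '%4d%2d%2d');   % widths split the packed digits into 3 fields
c = sscanf(padded, '%4d%2d%2d');   % same widths on the space-padded line

disp(a');
disp(b');
disp(c');
```

As far as I understand the scanf-style conversions, leading whitespace is skipped before the field width is applied, so the padded line should not be expected to come back as 1948, 1, 10. That would explain why one format string cannot cover both kinds of line.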



#10
Jun10-11, 02:51 AM

P: 28

MATLABdude,
Thank you. So there is no way to read that kind of file using Matlab? That's really strange to me tbh :) That file was provided on a website and I can read it quite easily using R. However, for some reason I would like to read it using Matlab and don't want to call an R script from Matlab.
madtraveller


#11
Jun10-11, 08:20 AM

Sci Advisor
P: 1,724

Are you certain that R actually reads it correctly?
I was hoping that textread (replaced by the not-quite drop-in textscan in newer versions of MATLAB) would do the trick, but the problem seems to be in how it treats the whitespace when there is and is not a space between the year and month, or month and date. Too bad, since it almost did the trick (textscan may still, but I don't have access to a newer version of MATLAB on this computer).
http://www.mathworks.com/help/techdoc/ref/textread.html
http://www.mathworks.com/help/techdoc/ref/textscan.html
Actually, I have a suspicion that that's probably the C standard. Nevertheless, you can still get around this issue by treating the first 8 characters of every line as a date string and then doing the conversion using some matrix operations and the str2num function:
http://www.mathworks.com/help/techdoc/ref/str2num.html
The following code (for use with your most recent dataset starting in 1948) uses textread, but you'll have to modify it a little for the newer textscan (it produces cell arrays instead of forcing you to explicitly declare variable names for every column).
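A minimal sketch of the "first 8 characters as a date string" idea, for readers who want something to start from. This is not the code from the post: it slices each line by position and uses str2double and sscanf rather than textread, and the rows are made up (built with sprintf so that they follow the 4d 2d 2d f9.4 ... 4d layout exactly).

```matlab
% Made-up rows: year, month, day, four reals, six integers.
rows = [1948  1 10  1.5   2.5  3.5   4.5  1 2 3  4  5  6;
        1948 10 12  0.25  0.5  0.75  1.0  7 8 9 10 11 12];

% Build fixed-width text lines in the layout described in the thread.
% With a real file these would come from fgetl(fid) in a loop instead.
layout = '%4d%2d%2d%9.4f%9.4f%9.4f%9.4f%4d%4d%4d%4d%4d%4d';
lines = cell(size(rows, 1), 1);
for k = 1:size(rows, 1)
    lines{k} = sprintf(layout, rows(k, :));
end

data = zeros(numel(lines), 13);
for k = 1:numel(lines)
    s = lines{k};
    % Slice the date by character position, so it does not matter whether
    % the fields are separated by spaces or run together.
    ymd = [str2double(s(1:4)), str2double(s(5:6)), str2double(s(7:8))];
    % Assumption: the remaining 10 columns are separated by whitespace.
    rest = sscanf(s(9:end), '%f', [1, 10]);
    data(k, :) = [ymd, rest];
end

disp(lines{2}(1:8));   % the packed date field of the second line
disp(data(:, 1:3));
```

The position-based slicing is the whole point: the reader never has to guess where one date field ends and the next begins. If the later columns could also fill their full width, they would need the same treatment.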



#12
Jun10-11, 01:41 PM

P: 28

Thank you very much MATLABdude. Your code really did the trick.
Have a good weekend
madtraveller


#13
Jun10-11, 01:50 PM

P: 28

Forgot to add the R code. R really does this very easily with read.fwf :)


