Reading all .csv files with different names

This thread discusses reading multiple .csv files with different names from a single folder. The suggested approach is to use the dir() function to list the matching files and then load them one by one in a loop. The thread also stresses the value of understanding the underlying OS, such as Linux, when working through an intermediate application like Python.
  • #1
member 428835
Hi PF!

I am trying to read all the .csv files in a folder. When I execute
Code:
OFdata  = readtable('/home/josh/Documents/NASA/PSI_DATA/Double_Drain/ICF1-9/OpenFOAM_DATA/*.csv');
I get a general error. However, if I replace the asterisk with a file name, it loads with no issue.

I've seen online how to read several .csv files with only a numeric difference in name (for loops), but is there a way to read all .csv files with different names?
 
  • #2
Code:
list  = dir('/home/josh/Documents/NASA/PSI_DATA/Double_Drain/ICF1-9/OpenFOAM_DATA/*.csv');
This should give you a list of all the csv files. You can then use a loop to load them one by one.
 
  • #3
dir() calls the Linux OS, which globs the filenames it can find with the pattern "*.csv". Then dir() executes stat() on each file it finds to get attributes such as file times and file ownership. You can ask it to do lots of tricks.

You can see the one call Python asks Linux to make for multiple objects when you read the man page for glob. It works on filesystems and files, not on what are categorized as Python __dict__ objects.

From bash (dir() is a Windows-ism for ls):
Code:
man glob
Code:
man ls

Most of the Linux OS is written in the C language, so part of the glob page shows how to call it in your own C program and explains lots of details.

Python acts as an intermediary for you.

This is why I mentioned learning UNIX or Linux to you earlier. Any time an intermediate (usually interpreted) app does something useful with the OS for you, you can control it better when you understand how it works underneath.

... followup from our earlier conversation.
 
  • #4
Thank you both! Very helpful, and Jim, I'm working on getting better with Linux. Starting with some simple shell scripts, baby steps.
 

1. How can I read multiple .csv files with different names in one go?

To read all .csv files regardless of their names, list the files in the directory and loop over them. Inside the loop, call the read_csv() function from the pandas library on each file, collect the resulting dataframes in a list, and then combine them into one dataframe with pandas' concat() function.
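
The loop described above might look like the following minimal sketch. The function name read_all_csv and the extra source_file column are illustrative assumptions, not part of pandas. The sketch works most cleanly when every file has the same columns.

```python
import glob
import os

import pandas as pd


def read_all_csv(folder):
    """Read every .csv file in `folder` into one DataFrame (illustrative helper)."""
    # sorted() makes the load order reproducible; glob gives no ordering guarantee.
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not paths:
        raise FileNotFoundError(f"no .csv files found in {folder!r}")

    frames = []
    for path in paths:
        frame = pd.read_csv(path)
        # Record which file each row came from, since the names differ.
        frame["source_file"] = os.path.basename(path)
        frames.append(frame)

    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pd.concat(frames, ignore_index=True)
```

Calling read_all_csv on the data folder returns a single dataframe, and the source_file column lets you split it back out per file later.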

2. Can I specify a specific file name pattern to read only certain .csv files?

Yes, you can use the glob library to specify a file name pattern. For example, the pattern "data*.csv" matches every .csv file whose name starts with "data". To match only names that start with "data" and end with a digit, use "data*[0-9].csv".
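
As a rough illustration of pattern matching with glob: "*" matches any run of characters and "[0-9]" matches a single digit, so "data*.csv" accepts any name starting with "data", while "data*[0-9].csv" also requires a digit just before the extension. The helper name matching_csv is an assumption made for this sketch.

```python
import glob
import os


def matching_csv(folder, pattern="data*[0-9].csv"):
    """Return the sorted base names in `folder` that match a shell-style pattern."""
    paths = glob.glob(os.path.join(folder, pattern))
    return sorted(os.path.basename(p) for p in paths)
```

Passing pattern="data*.csv" instead drops the digit requirement and returns every .csv file whose name starts with "data".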

3. Is it possible to skip certain rows or columns while reading the .csv files?

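Yes, you can use the "skiprows" and "usecols" parameters of the read_csv() function. "skiprows" says which rows to skip: it takes either an integer (the number of lines to skip at the start of the file) or a list of row indices. "usecols" works the other way around: it takes a list of the column names or positions to keep, and every other column is dropped.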
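
A small sketch of both parameters together. The sample text and its column names are made up for illustration: two comment lines precede the real header row, and only two of the three columns are wanted.

```python
import io

import pandas as pd

# Made-up sample: two comment lines come before the real header row.
SAMPLE = """# exported by a hypothetical solver
# units: s, Pa, m/s
time,pressure,velocity
0.0,101.3,0.00
0.1,101.1,0.25
"""


def read_time_and_pressure(source):
    """Skip the two leading comment lines and keep only two named columns."""
    return pd.read_csv(source, skiprows=2, usecols=["time", "pressure"])


df = read_time_and_pressure(io.StringIO(SAMPLE))
```

Here skiprows=2 discards the first two lines, so the third line is read as the header, and usecols keeps "time" and "pressure" while dropping "velocity".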

4. What if my .csv files have different delimiters or encodings?

You can use the "delimiter" parameter (an alias for "sep") and the "encoding" parameter of the read_csv() function to specify the delimiter and encoding used in your .csv files. Setting them explicitly ensures that columns are split in the right places and that non-ASCII characters are decoded correctly.
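
For instance, a semicolon-separated file saved in Latin-1 could be read as sketched below. The sample bytes and the helper name read_semicolon_latin1 are assumptions for illustration only.

```python
import io

import pandas as pd


def read_semicolon_latin1(source):
    """Read a semicolon-separated, Latin-1 encoded CSV (illustrative helper)."""
    # `sep` is the primary parameter name; `delimiter` is accepted as an alias.
    return pd.read_csv(source, sep=";", encoding="latin-1")


# Made-up sample: "é" is stored as the single byte 0xE9 in Latin-1.
raw = "name;value\ncafé;1.5\n".encode("latin-1")
df = read_semicolon_latin1(io.BytesIO(raw))
```

If the files in one folder do not all share a delimiter or encoding, these values would need to be chosen per file inside the loop.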

5. How can I handle missing or inconsistent data while reading multiple .csv files?

You can use the "na_values" and "keep_default_na" parameters of the read_csv() function to handle missing data. "na_values" lets you list additional values that should be treated as missing, while "keep_default_na" controls whether pandas' built-in list of missing-value markers (such as "NA" or an empty field) is still applied alongside them. With keep_default_na=False, only the values given in "na_values" count as missing.
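
The interaction of the two parameters can be sketched as follows. The sample data and the sentinel values "-999" and "missing" are invented for this example.

```python
import io

import pandas as pd

# Made-up sample with three different ways of marking a missing reading.
SAMPLE = "id,reading\n1,3.2\n2,-999\n3,NA\n4,missing\n"


def read_with_na(text, keep_default_na=True):
    """Treat "-999" and "missing" as NaN; optionally keep pandas' default markers."""
    return pd.read_csv(
        io.StringIO(text),
        na_values=["-999", "missing"],
        keep_default_na=keep_default_na,
    )
```

With the default keep_default_na=True, the "NA" entry is treated as missing along with the two custom markers. With keep_default_na=False, "NA" is kept as ordinary text, so the column is no longer purely numeric.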
