How to import a .CSV table in C++

  • Context: C/C++ 
  • Thread starter: member 428835
  • Tags: C++, Table

Discussion Overview

The discussion revolves around methods for importing and manipulating data from .csv files in C++. Participants explore various approaches, share code snippets, and discuss the challenges associated with parsing CSV data, particularly when it comes to handling commas within strings.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Homework-related

Main Points Raised

  • Some participants suggest using C's scanf and its variations for reading CSV files, while others recommend C++ stream-based features from the iostream header.
  • One participant shares a code snippet for reading a CSV file but notes that it reads data row-wise rather than column-wise, seeking advice on how to store data in a matrix format.
  • Another participant mentions the potential complexity of CSV files due to commas within strings, advising caution and suggesting the use of existing libraries or tools for parsing.
  • Some participants discuss the importance of understanding string parsing functions to convert strings into arrays based on delimiters.
  • There is a mention of the C++ Cookbook as a resource for reading comma-separated files.
  • Several participants express a preference for using Python for CSV manipulation, citing its simplicity and effectiveness compared to C++. One participant also mentions C# as an alternative with built-in functions for CSV handling.
  • Concerns are raised about the safety of using scanf due to potential buffer overflow issues, with references to safer alternatives in Microsoft Visual Studio.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the best method for reading CSV files in C++. There are multiple competing views regarding the use of C versus C++ features, as well as differing opinions on the suitability of various programming languages for this task.

Contextual Notes

Some participants highlight limitations in the provided code snippets, such as uninformative comments and the broad "using namespace std;" directive. How the program should determine the number of rows and columns, and how to store the data column-wise, is left partly open.

member 428835
Hi PF!

Given a .csv file with the first row a name (string) and successive rows doubles, how can I read and manipulate this data in C++? Typically I just google these sorts of questions, but I'm seeing so many links. Do you have one that you recommend, or do you have a function you've written that does the job?

Thanks so much.
 
You can use C's scanf and its variations. std::format is also something you could look up.
 
joshmccraney said:
Hi PF!

Given a .csv file with the first row a name (string) and successive rows doubles, how can I read and manipulate this data in C++? Typically I just google these sorts of questions, but I'm seeing so many links. Do you have one that you recommend, or do you have a function you've written that does the job?

Thanks so much.
Um, what I/O have you done in C++ programs so far? Reading in a CSV file is about as basic as you can get in programming. Please show links to your reading and learning of basic C++ I/O so far.

And have you at least done this in C?
 
CSV files can be surprisingly difficult because there can be commas within strings that do not separate data. If you are sure that your data does not have that, then you can try reading it and parsing it yourself. Otherwise, I would recommend finding a program that already has dealt with that and using it. It might require preprocessing your data with another program or tool.
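For data that may contain commas inside quoted strings, a hand-written splitter has to track whether it is currently inside quotes. The sketch below is one minimal way to do that, assuming the common convention that such fields are wrapped in double quotes and that a literal quote is written as two quotes. The function name split_csv_line is made up for illustration, and records that span several lines are not handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split one CSV record into fields, treating commas inside double quotes
// as data. A doubled quote ("") inside a quoted field yields one quote.
// Assumption: the record does not span multiple lines.
std::vector<std::string> split_csv_line(const std::string& line)
{
    std::vector<std::string> fields;
    std::string field;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    field += '"';       // escaped quote inside a quoted field
                    ++i;
                } else {
                    in_quotes = false;  // closing quote
                }
            } else {
                field += c;             // commas land here and stay in the field
            }
        } else if (c == '"') {
            in_quotes = true;           // opening quote
        } else if (c == ',') {
            fields.push_back(field);    // field separator
            field.clear();
        } else {
            field += c;
        }
    }
    fields.push_back(field);            // last field (possibly empty)
    return fields;
}
```

With this, a line such as name,"Smith, John",3.5 splits into three fields rather than four. For anything beyond simple files, an existing CSV library is probably still the safer choice.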
 
berkeman said:
Um, what I/O have you done in C++ programs so far? Reading in a CSV file is about as basic as you can get in programming. Please show links to your reading and learning of basic C++ I/O so far.

And have you at least done this in C?
I have done zero I/O. Just started learning it last week. I have not done anything in C, but after googling lots (too many links to post) I managed to put together something that does the basics, which is reading the sample.csv file.

C++:
// READ .CSV DATA
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
    ifstream datas; // A CLASS OF THE ifstream LIB ALLOWING FILES TO BE READ
    datas.open("sample.csv"); // OPEN THE CSV FILE 
    while (datas.good()) { // WHILE SOMETHING IS IN THE FILE KEEP READING IT
        string line; // DEFINE A STRING
        getline(datas, line, ','); // GRAB THE ELEMENT UNTIL THE TEXT SEES A COMMA
        cout << line << endl; // PRINT THE ELEMENT
    }
}

However, the .csv file has data categorized by columns. But, the above code reads row-wise. Is there a way to save each column of the CSV file as a vector of numbers, say doubles, creating a matrix of data?

And I'll check out the book, but I've kinda been given several to read, so right now it's a lot.
 
Vanadium 50 said:
You can use C's scanf and its variations. std::format is also something you could look up.
IMO it's better to use fgetc() or fgets() but don't know if it's a part of C++.
 
Mayhem said:
IMO it's better to use fgetc() or fgets() but don't know if it's a part of C++.
You can do I/O in C++ in either of two ways: using stream-based features declared in the iostream header or using the C standard library features in the stdio.h header. fgetc() and fgets() are part of the stdio.h header.
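As a rough illustration of the C-library route, the sketch below reads lines with fgets() and converts the numbers with strtod(). It assumes rows that contain only numbers and lines shorter than the buffer; the function names parse_doubles and read_rows are hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>
#include <vector>

// Parse a line of comma-separated numbers with the C library's strtod().
// Parsing stops at the first field that is not a number, so a header line
// of names yields an empty vector.
std::vector<double> parse_doubles(const char* line)
{
    std::vector<double> values;
    const char* p = line;
    while (*p != '\0') {
        char* end = nullptr;
        const double v = std::strtod(p, &end);  // skips leading whitespace
        if (end == p) break;                    // no number found here
        values.push_back(v);
        p = end;
        while (*p == ' ' || *p == '\t') ++p;    // allow spaces before the comma
        if (*p != ',') break;                   // end of line or unexpected text
        ++p;                                    // step over the comma
    }
    return values;
}

// Read every line of an already-open file with fgets(), which never writes
// more than the buffer size. Assumption: lines are shorter than the buffer;
// longer lines would be split into several rows.
std::vector<std::vector<double>> read_rows(std::FILE* fp)
{
    std::vector<std::vector<double>> rows;
    char buffer[1024];
    while (std::fgets(buffer, sizeof buffer, fp) != nullptr) {
        rows.push_back(parse_doubles(buffer));
    }
    return rows;
}
```

A caller would open the file with std::fopen("sample.csv", "r"), check the result for a null pointer, pass it to read_rows(), and close it with std::fclose().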
 
joshmccraney said:
C++:
// READ .CSV DATA
#include <iostream>
#include <fstream>
using namespace std;                     

int main()
{
    ifstream datas; // A CLASS OF THE ifstream LIB ALLOWING FILES TO BE READ
    datas.open("sample.csv"); // OPEN THE CSV FILE
    while (datas.good()) { // WHILE SOMETHING IS IN THE FILE KEEP READING IT
        string line; // DEFINE A STRING
        getline(datas, line, ','); // GRAB THE ELEMENT UNTIL THE TEXT SEES A COMMA
        cout << line << endl; // PRINT THE ELEMENT
    }
}
A few comments on your code.
1. It's very unusual to have comments in all caps.
2. Many experts warn against "using namespace std;" because it brings in everything in this very large namespace. A better choice is to limit the identifiers that you import, such as by prefacing these identifiers with the namespace name (std::cout and std::endl). Alternatively, instead of "using namespace std;" you can write "using std::cout;" and "using std::endl;" and so on.
3. Your variable datas is of type ifstream, so a better comment would be "input file stream".
4. Your comment // OPEN THE CSV FILE is not very helpful, as it restates what the code is doing. Because ifstream is an input file stream, calling open() opens this stream for reading.
5. The line string line; doesn't really need a comment. It should be obvious to nearly all readers that you are declaring a C++ style string object whose identifier is line.
6. The line cout << line << endl; is also obvious to most readers with at least a smattering of knowledge about C++. It inserts the contents of the string variable plus a newline character into the standard output stream, cout (endl also flushes the stream). What is less obvious to new C++ programmers is that the actual work is being performed by the stream insertion operator <<.
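Putting these comments together, the program might be reshaped roughly as follows. This is only a sketch: wrapping the loop in a function named print_elements (a made-up name) and using the result of getline() as the loop condition are choices made here, not part of the comments above.

```cpp
#include <fstream>
#include <iostream>
#include <string>

using std::cout;     // import only the names that are used,
using std::endl;     // rather than all of namespace std
using std::ifstream;
using std::string;

// Print every comma-separated element of the named file, one per line.
// Returns the number of elements printed, or -1 if the file cannot be opened.
int print_elements(const string& filename)
{
    ifstream datas(filename);   // input file stream
    if (!datas) {
        return -1;
    }
    int printed = 0;
    string line;
    while (getline(datas, line, ',')) {  // stops once nothing more can be read
        cout << line << endl;
        ++printed;
    }
    return printed;
}
```

Calling print_elements("sample.csv") from main() would reproduce the behaviour of the original program.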
 
  • #11
Personally I'd do it in
Python:
import csv

columns = {}
# Read from a CSV file with column headings in the first line.
with open('./path/to/my/file.csv') as csvfile:
    file = csv.DictReader(csvfile)
    for name in file.fieldnames:
        columns[name] = []
    for row in file:
        for field in row:
            columns[field].append(float(row[field]))

for name in columns:
    print(name, columns[name], '\n')
 
  • #12
pbuk said:
Personally I'd do it in
Python:
import csv

columns = {}
# Read from a CSV file with column headings in the first line.
with open('./path/to/my/file.csv') as csvfile:
    file = csv.DictReader(csvfile)
    for name in file.fieldnames:
        columns[name] = []
    for row in file:
        for field in row:
            columns[field].append(float(row[field]))

for name in columns:
    print(name, columns[name], '\n')
Python is very well suited to dealing with things like this, and I would need a very strong reason to want to do it in C++.
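For comparison, a rough C++ counterpart of the Python snippet could look like the sketch below. It assumes a header line of column names, numeric data everywhere else, no quoted fields, and Unix line endings; read_columns is a hypothetical name. Unlike the Python dictionary, std::map iterates in alphabetical order of the names rather than file order.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Read a CSV stream whose first line holds column names and whose remaining
// lines hold numbers, returning a map from column name to column values.
// Assumptions: no quoted fields; every field after the header is numeric.
std::map<std::string, std::vector<double>> read_columns(std::istream& in)
{
    std::map<std::string, std::vector<double>> columns;
    std::vector<std::string> names;
    std::string line, field;

    if (std::getline(in, line)) {               // header line
        std::istringstream header(line);
        while (std::getline(header, field, ',')) {
            names.push_back(field);
            columns[field];                     // creates an empty column
        }
    }
    while (std::getline(in, line)) {            // data lines
        std::istringstream row(line);
        for (std::size_t i = 0;
             i < names.size() && std::getline(row, field, ',');
             ++i) {
            // std::stod throws std::invalid_argument on non-numeric text
            columns[names[i]].push_back(std::stod(field));
        }
    }
    return columns;
}
```

Passing an std::ifstream opened on the file gives one vector of doubles per column name. The extra bookkeeping compared with the Python version is part of why several posters prefer Python for this job.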
 
  • #13
Mark44 said:
You can do I/O in C++ in either of two ways: using stream-based features declared in the iostream header or using the C standard library features in the stdio.h header. fgetc() and fgets() are part of the stdio.h header.
IIRC scanf can cause some overflow issues, but it's been a while since I reviewed this.
 
  • #14
Mayhem said:
IIRC scanf can cause some overflow issues, but it's been a while since I reviewed this.
Yes, scanf() and other C standard library routines can cause buffer overrun problems. For this reason Microsoft provided more secure versions in Visual Studio with an additional parameter that specifies the maximum size of the buffer that will receive the input characters. These MSFT-specific versions include scanf_s(), fscanf_s(), gets_s(), and others. In each of these, the appended _s indicates that these are secure versions. By default, Visual Studio issues a deprecation warning (C4996) if you use scanf(), fscanf(), gets(), and so on, and treats it as an error when SDL checks are enabled.
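Outside Visual Studio, one portable way to bound what the scanf family writes is to put a maximum field width in the conversion. The sketch below shows this with sscanf(); the function name first_token and the 16-byte buffer are arbitrary choices for illustration.

```cpp
#include <cstdio>
#include <string>

// Extract the first comma-delimited token of `line` into a fixed buffer.
// The width in "%15[^,\n]" caps the write at 15 characters plus the
// terminating '\0', so a long token is truncated instead of overrunning
// the 16-byte buffer.
std::string first_token(const char* line)
{
    char token[16] = "";
    if (std::sscanf(line, "%15[^,\n]", token) != 1) {
        return "";   // empty first field or empty line
    }
    return token;
}
```

Without the width, a token longer than the buffer would be written past its end, which is the overrun described above.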

Regarding @pbuk's comment that Python would be a better choice, I agree, but would also add that C# also includes functions that work with .CSV files.
 
  • #15
joshmccraney said:
C++:
    while (datas.good()) { // WHILE SOMETHING IS IN THE FILE KEEP READING IT
        string line; // DEFINE A STRING
        getline(datas, line, ','); // GRAB THE ELEMENT UNTIL THE TEXT SEES A COMMA
        cout << line << endl; // PRINT THE ELEMENT
There is a very nice idiom for detecting the end of a file which can simplify and shorten your input code a bit. All C++ input operations (AFAIK) can be used as the condition in an if- or while-statement, giving "true" if the input operation succeeded, and "false" if it failed (because of end of file, or some input error).

C++:
string line;
while (getline (datas, line, ','))
{
    cout << line << endl;
}
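
The same idiom nests. With ',' as the only delimiter, the last field of one line and the first field of the next are read as a single string with a newline in the middle, so it can help to read a whole line first and then split it at the commas. A minimal sketch, assuming no quoted commas (read_fields is a made-up name):

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read all fields of a CSV stream, row by row. Each getline() call doubles
// as the loop condition: it converts to false at end of input or on error.
// Assumption: fields contain no quoted commas.
std::vector<std::vector<std::string>> read_fields(std::istream& in)
{
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {             // one row per line
        std::vector<std::string> fields;
        std::istringstream row(line);
        std::string field;
        while (std::getline(row, field, ',')) {  // one field per comma
            fields.push_back(field);
        }
        rows.push_back(fields);
    }
    return rows;
}
```

Note that an empty field at the very end of a line is dropped by the inner loop, because getline() reports failure when nothing is left to read.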

Both C-style arrays and C++-style vectors can be used to construct matrices. I prefer vectors myself, in which a matrix is a "vector of vectors". Google "c++ vector of vectors" and you'll get examples.

How will your program "know" how many rows and columns your data comes in? Is it simply going to assume a fixed size such as 5 rows and 10 columns, or will it have to "size" the matrix automatically according to the number of lines of data and how many items are in each line?
joshmccraney said:
However, the .csv file has data categorized by columns. But, the above code reads row-wise. Is there a way to save each column of the CSV file as a vector of numbers, say doubles, creating a matrix of data?
If the program has to "size" the matrix automatically, having rows correspond to vectors (with the matrix being a vector of rows) is much more natural than having columns correspond to vectors (with the matrix being a vector of columns), because the file is read one row at a time. If you want to end up with a vector of columns, I think the simplest way would be to read it initially into a vector of rows, then copy the data into a new matrix which is intended to be interpreted as a vector of columns, in effect transposing rows and columns.

To see how to make a vector "grow" so it accommodates your data, look up vector::push_back() (the push_back() member function of the standard vector class).
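
The approach described above (read into a vector of rows with push_back(), then transpose) might be sketched as follows. It assumes exactly one header line, purely numeric data after it, and no quoted fields; the names Matrix, read_numeric_rows and to_columns are made up for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Read the numeric rows of a CSV stream, discarding the first (header) line.
// Neither the number of rows nor the number of columns is fixed in advance.
Matrix read_numeric_rows(std::istream& in)
{
    Matrix rows;
    std::string line;
    std::getline(in, line);                      // discard the header line
    while (std::getline(in, line)) {
        std::vector<double> row;
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ',')) {
            row.push_back(std::stod(field));     // the row grows as needed
        }
        if (!row.empty()) {
            rows.push_back(row);                 // the matrix grows; blank lines are skipped
        }
    }
    return rows;
}

// Copy a vector of rows into a vector of columns (a transpose).
// The column count is taken from the first row; shorter rows contribute
// only the values they have.
Matrix to_columns(const Matrix& rows)
{
    Matrix columns;
    if (rows.empty()) {
        return columns;
    }
    columns.resize(rows[0].size());
    for (const auto& row : rows) {
        for (std::size_t j = 0; j < columns.size() && j < row.size(); ++j) {
            columns[j].push_back(row[j]);
        }
    }
    return columns;
}
```

After Matrix cols = to_columns(read_numeric_rows(file)); the values of the first CSV column are in cols[0], the second in cols[1], and so on.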

I'll let others comment on doing this with C-style arrays.
 