How can I read CSV format in C/C++?

  • Context: C/C++
  • Thread starter: leon1127
  • Tags: CSV format
SUMMARY

This discussion focuses on reading CSV-formatted data in C/C++. The original poster tried to read a CSV file with C-style fgets() and strtok() and reported what looked like memory leaks. Replies pointed out that the file was never closed (an ifstream closes automatically when it goes out of scope) and suggested parsing with istringstream and getline(), whose optional third argument sets a custom delimiter. A sample program shows how to read a file line by line and split each line into comma-separated fields.

PREREQUISITES
  • Understanding of C++ file I/O using ifstream and ofstream
  • Familiarity with C++ Standard Template Library (STL) containers, specifically vector
  • Knowledge of string manipulation in C++ using istringstream and getline()
  • Basic understanding of memory management and common pitfalls in C++
NEXT STEPS
  • Learn advanced C++ file handling techniques with fstream
  • Explore the use of Boost libraries for enhanced string processing capabilities
  • Investigate memory leak detection tools such as Valgrind for C++ applications
  • Study error handling in file I/O operations to improve robustness
USEFUL FOR

C++ developers, data analysts, and software engineers looking to efficiently read and process CSV files in their applications.

leon1127
Hi Guys,

I have been trying to read CSV-formatted data with C/C++ with no luck...

The format is as follows:
8/29/2008,19.54,19.6,19.28,19.38,11204900,19.38
8/28/2008,19.48,19.76,19.38,19.65,11729500,19.65
8/27/2008,19.08,19.45,18.93,19.37,9300100,19.37
8/26/2008,19.12,19.2,19,19.09,8770500,19.09
8/25/2008,19.34,19.4,19.05,19.09,13779300,19.09
8/22/2008,19.11,19.68,19.1,19.53,11087500,19.53
8/21/2008,19.06,19.18,18.87,19.11,16995100,19.11
8/20/2008,19.57,19.65,19.1,19.17,16336900,19.17
8/19/2008,19.78,19.91,19.41,19.42,12837300,19.42

and my code is
// csv_read.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <cstring>
#include<stdio.h>
#include<stdlib.h>
#include <vector>
#include <iostream>
using namespace std;

struct yahoo_data
{
    char date[9];
    double open;
    double high;
    double low;
    double close;
    float volume;
    double adj_close;
};

vector<yahoo_data> yhoo;

typedef struct yahoo_data data_format;

int _tmain(int argc, _TCHAR* argv[])
{

    FILE *fp;
    fp = fopen("table.csv", "r");
    char bufferCustNum[9];
    char bufferCost[40];
    int i = 0;

    data_format temp;

    cout << "before while" << endl;

    while (fgets(bufferCustNum, sizeof(bufferCustNum), fp) != NULL)
    {
        cout << "i = " << i << endl;
        if (i > 0) {

            strcpy(temp.date, strtok(bufferCustNum, ","));
            cout << temp.date << endl;
            /*
            strcpy(temp.open, (double)strtok(NULL,","));
            strcpy(temp.high, strtok(NULL,","));
            strcpy(temp.low, strtok(NULL,","));
            strcpy(temp.close, strtok(NULL,","));
            strcpy(temp.volume, strtok(NULL,","));
            strcpy(temp.adj_close, strtok(NULL,","));
            */

            temp.open = (double)atof(strtok(NULL, ","));
            cout << "after open" << endl;
            temp.high = atof(strtok(NULL, ","));
            cout << "after high" << endl;
            temp.low = atof(strtok(NULL, ","));
            cout << "after low" << endl;
            temp.close = atof(strtok(NULL, ","));
            cout << "after close" << " " << temp.close << endl;
            // cout << atof(strtok(NULL, ",")) << endl;
            temp.volume = (int)atof(strtok(NULL, ","));
            cout << "after adjust close" << endl;
            temp.adj_close = atof(strtok(NULL, ","));
            cout << "before push" << endl;
            yhoo.push_back(temp);
            cout << "before push" << endl;
        }
        ++i;
    }

    for (int j = 0; j < 10; j++)
    {

    }
    return 0;
}


There are some unnecessary parts in the code, but it should compile under g++. I get some memory leaks... Does anyone have any idea why?
 
How are you identifying memory leaks? Some implementations of the STL will cache small blocks of memory. And upon program termination, if those blocks are not used, they are not explicitly deallocated -- which is fine behavior, but will confuse some memory leak detectors.

That said, I notice you forgot to close the FILE* you created... (an fstream would automatically close upon going out of scope)

As an aside... strtok always scares me. Have you considered using a stringstream to help with parsing? Or a string processing library, such as Spirit from the Boost libraries?
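
An observation on the posted code that the thread itself does not spell out: bufferCustNum holds only 9 bytes, so each fgets() call returns at most 8 characters rather than a whole line. strtok() then runs out of tokens and returns NULL, which must not be passed to atof(). The date[9] member is also one byte short for a string like "8/29/2008" plus its terminator. Below is a minimal sketch of the same fgets/strtok approach with those points addressed. The names Row, parse_row and read_table and the buffer sizes are assumptions made for illustration, and skipping the first line as a header is inferred from the i > 0 check in the question.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical record type mirroring the struct in the question.
struct Row {
    char date[16];                 // room for "MM/DD/YYYY" plus the terminator
    double open, high, low, close;
    double volume;
    double adj_close;
};

// Parse one CSV line in place (strtok modifies the buffer).
// Returns false if the line has fewer than seven fields.
bool parse_row(char *line, Row &out)
{
    line[std::strcspn(line, "\r\n")] = '\0';    // drop the trailing newline

    char *tok = std::strtok(line, ",");
    if (tok == NULL) return false;
    std::strncpy(out.date, tok, sizeof(out.date) - 1);
    out.date[sizeof(out.date) - 1] = '\0';

    double *fields[] = { &out.open, &out.high, &out.low,
                         &out.close, &out.volume, &out.adj_close };
    for (double *f : fields) {
        tok = std::strtok(NULL, ",");
        if (tok == NULL) return false;          // never hand NULL to atof()
        *f = std::atof(tok);
    }
    return true;
}

// Read every data row of a file, skipping the first line (assumed header).
std::vector<Row> read_table(const char *path)
{
    std::vector<Row> rows;
    FILE *fp = std::fopen(path, "r");
    if (fp == NULL) return rows;

    char line[256];                             // big enough for a whole line
    bool first = true;
    while (std::fgets(line, sizeof(line), fp) != NULL) {
        if (first) { first = false; continue; }
        Row r;
        if (parse_row(line, r)) rows.push_back(r);
    }
    std::fclose(fp);                            // the posted code never closed the file
    return rows;
}
```

Calling read_table("table.csv") would return one Row per well-formed data line. Lines with too few fields are silently dropped in this sketch, and a real program might prefer to report them.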
 
Yes, a stringstream and getline() will do the job. Remember, getline() isn't just for reading lines. You can specify a different "line terminator" instead of the default '\n'. getline(mystream, mystring, ',') reads from mystream into mystring, stopping at the next ',' (and discarding the ','). Here's a sample program based on your data file:

Code:
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <string>

using namespace std;

int main ()
{
    ifstream inFile ("csv.dat");
    string line;
    int linenum = 0;
    while (getline (inFile, line))
    {
        linenum++;
        cout << "\nLine #" << linenum << ":" << endl;
        istringstream linestream(line);
        string item;
        int itemnum = 0;
        while (getline (linestream, item, ','))
        {
            itemnum++;
            cout << "Item #" << itemnum << ": " << item << endl;
        }
    }

    return 0;
}

It produces the following output:

Code:
Line #1:
Item #1: 8/29/2008
Item #2: 19.54
Item #3: 19.6
Item #4: 19.28
Item #5: 19.38
Item #6: 11204900
Item #7: 19.38

Line #2:
Item #1: 8/28/2008
Item #2: 19.48
Item #3: 19.76
Item #4: 19.38
Item #5: 19.65
Item #6: 11729500
Item #7: 19.65
(etc.)
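
The sample above prints each field as text. To get back to the struct from the original question, the numeric fields still have to be converted. Here is a minimal sketch of that step, assuming seven fields per line in the order shown in the question. The names Quote, parse_quote and read_quotes are hypothetical, and std::stod requires C++11.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type; the date is kept as a string.
struct Quote {
    std::string date;
    double open, high, low, close;
    double volume;
    double adj_close;
};

// Split one line on ',' and convert the six numeric fields.
// Returns false if the field count is wrong or a field is not a number.
bool parse_quote(const std::string &line, Quote &q)
{
    std::istringstream linestream(line);
    std::string item;
    std::vector<std::string> items;
    while (std::getline(linestream, item, ','))
        items.push_back(item);
    if (items.size() != 7) return false;

    try {
        q.date      = items[0];
        q.open      = std::stod(items[1]);
        q.high      = std::stod(items[2]);
        q.low       = std::stod(items[3]);
        q.close     = std::stod(items[4]);
        q.volume    = std::stod(items[5]);
        q.adj_close = std::stod(items[6]);
    } catch (const std::exception &) {   // stod throws on non-numeric text
        return false;
    }
    return true;
}

// Read all quotes from any input stream (an ifstream or an istringstream).
std::vector<Quote> read_quotes(std::istream &in)
{
    std::vector<Quote> quotes;
    std::string line;
    while (std::getline(in, line)) {
        Quote q;
        if (parse_quote(line, q)) quotes.push_back(q);
    }
    return quotes;
}
```

Passing an ifstream opened on the data file to read_quotes would fill the vector. A header line of column names, if present, fails the numeric conversion and is skipped.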
 
Thank you very much. That solved my problem!
 
I've tried the above code and it doesn't seem to work quite the same. I've learned that my .csv file ignores the white space and has no line returns. Therefore, it seems that the last number on the first line and the date on the next line run together, e.g.

19.38
8/28/2008

Is there any way to adjust the file type to add a line return after each line? Any other suggestions to solve this problem?
 
What character does your file use to separate lines? Tell the getline() that controls the while-loop to use that as the separator. For example, if it's a '|', the beginning of the while-loop would look like this:

Code:
while (getline (inFile, line, '|'))
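
Putting the two getline() calls together, the record separator and the field separator can both be parameters. The following is a sketch only: the function name split_records is hypothetical, and which separator a given file actually uses has to be checked (a bare carriage return, '\r', is one possibility when a file appears to have no line breaks).

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split a stream into records on record_sep, then split each record
// into fields on field_sep. Both separators are discarded.
std::vector<std::vector<std::string>>
split_records(std::istream &in, char record_sep, char field_sep = ',')
{
    std::vector<std::vector<std::string>> records;
    std::string record;
    while (std::getline(in, record, record_sep)) {
        std::istringstream recordstream(record);
        std::vector<std::string> fields;
        std::string item;
        while (std::getline(recordstream, item, field_sep))
            fields.push_back(item);
        records.push_back(fields);
    }
    return records;
}
```

With '|' as the record separator, the input "8/29/2008,19.54|8/28/2008,19.48" yields two records of two fields each.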
 
Thanks! The code does exactly what I'm looking for too.
 
Hi all,

I'm trying to read a text file which has the following format. What I would like to do is:
- Put "25" into a number_data variable, "6" into a number_place variable, and "1" into a timestep variable.
- Separate the year (e.g. 1990), month (e.g. 01), and day (e.g. 01) into 3 variables
- Handle the mix of numbers and NA, especially in the Place3 data

I tried jtbell's code and it worked for the first three lines, but I have no idea how to write out just what I need or how to handle the rest of the data file.
Could anyone tell me how to do this? Thank you very much.

Number of data points 25
Number of places 6
Timestep 1 hour
Date Hour Place1 Place2 Place3 Place4 Place5 Place6
1990-01-01 1 25.002 NA NA 16.265 6.231 9.680
1990-01-01 2 24.449 NA NA 16.265 6.231 9.551
1990-01-01 3 24.449 NA NA 16.265 6.231 9.551
1990-01-01 4 24.550 NA NA 16.265 6.231 9.551
1990-01-01 5 24.851 NA NA 16.265 6.130 9.551
1990-01-01 6 25.002 NA NA 16.099 6.130 9.421
1990-01-01 7 25.306 NA 29.540 15.933 6.130 9.421
1990-01-01 8 25.357 NA 29.197 15.933 6.130 9.421
1990-01-01 9 25.357 NA 28.856 15.933 6.029 9.389
1990-01-01 10 25.306 NA 28.477 15.769 6.029 9.260
1990-01-01 11 25.306 NA 28.176 15.769 6.029 9.260
1990-01-01 12 25.103 NA 27.913 15.605 6.029 9.132
1990-01-01 13 24.952 NA 27.651 15.605 6.029 9.132
1990-01-01 14 24.901 NA 27.464 15.605 5.905 9.132
1990-01-01 15 24.851 NA 27.315 15.442 6.004 9.132
1990-01-01 16 24.801 NA 27.240 15.442 5.905 9.003
1990-01-01 17 24.700 NA 27.240 15.442 5.905 9.003
1990-01-01 18 24.550 NA 27.278 15.442 5.905 9.003
1990-01-01 19 24.350 NA NA 15.442 5.905 9.003
1990-01-01 20 24.150 NA NA 15.279 5.905 9.003
1990-01-01 21 23.952 NA NA 15.239 5.806 8.875
1990-01-01 22 23.803 NA NA 15.078 5.806 8.875
1990-01-01 23 23.704 NA NA 14.758 5.806 8.875
1990-01-01 24 23.557 NA NA 14.758 5.806 8.875
1990-01-02 1 23.458 NA NA 14.758 5.684 8.875

Regards,

MT
 
