C/C++ How can I read CSV format in C/C++?

  • Thread starter: leon1127
  • Tags: Csv Format
AI Thread Summary
The discussion centers on reading CSV formatted data in C/C++. A user shares their code, which attempts to read stock data from a CSV file but encounters issues, including memory leaks and improper parsing of data. Suggestions are made to improve the code, such as using `stringstream` and `getline()` for parsing instead of `strtok`, which can be problematic. A sample program is provided that demonstrates how to read and parse CSV data correctly. The user also faces a specific issue with their CSV file format, where lines do not end with newline characters, causing data to run together. Solutions are proposed to handle this by adjusting the line separator in the `getline()` function. Additionally, another user seeks help with a different data format, asking for guidance on extracting specific variables and handling mixed data types, indicating a broader interest in parsing and data manipulation in C/C++.
leon1127
Hi Guys,

I have been trying to read CSV-formatted data with C/C++, with no luck...

The format is as follows:
8/29/2008,19.54,19.6,19.28,19.38,11204900,19.38
8/28/2008,19.48,19.76,19.38,19.65,11729500,19.65
8/27/2008,19.08,19.45,18.93,19.37,9300100,19.37
8/26/2008,19.12,19.2,19,19.09,8770500,19.09
8/25/2008,19.34,19.4,19.05,19.09,13779300,19.09
8/22/2008,19.11,19.68,19.1,19.53,11087500,19.53
8/21/2008,19.06,19.18,18.87,19.11,16995100,19.11
8/20/2008,19.57,19.65,19.1,19.17,16336900,19.17
8/19/2008,19.78,19.91,19.41,19.42,12837300,19.42

and my code is
// csv_read.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <cstring>
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <iostream>
using namespace std;

struct yahoo_data
{
    char date[9];
    double open;
    double high;
    double low;
    double close;
    float volume;
    double adj_close;
};

vector<yahoo_data> yhoo;

typedef struct yahoo_data data_format;

int _tmain(int argc, _TCHAR* argv[])
{

    FILE *fp;
    fp=fopen("table.csv","r");
    char bufferCustNum[9];
    char bufferCost[40];
    int i = 0;

    data_format temp;

    cout<< "before while"<<endl;

    while( fgets(bufferCustNum,sizeof(bufferCustNum),fp) != NULL)
    {
        cout<< "i = "<< i << endl;
        if(i>0){

            strcpy(temp.date, strtok(bufferCustNum,","));
            cout<< temp.date<<endl;
            /*
            strcpy(temp.open, (double)strtok(NULL,","));
            strcpy(temp.high, strtok(NULL,","));
            strcpy(temp.low, strtok(NULL,","));
            strcpy(temp.close, strtok(NULL,","));
            strcpy(temp.volume, strtok(NULL,","));
            strcpy(temp.adj_close, strtok(NULL,","));
            */

            temp.open = (double)atof(strtok(NULL,","));
            cout<< "after open"<<endl;
            temp.high = atof(strtok(NULL,","));
            cout<< "after high"<<endl;
            temp.low = atof(strtok(NULL,","));
            cout<< "after low"<<endl;
            temp.close = atof(strtok(NULL,","));
            cout<< "after close"<<" "<<temp.close<<endl;
            // cout<< atof(strtok(NULL,","))<<endl;
            temp.volume = (int)atof(strtok(NULL,","));
            cout<< "after adjust close"<<endl;
            temp.adj_close = atof(strtok(NULL,","));
            cout<< "before push"<<endl;
            yhoo.push_back(temp);
            cout<< "before push"<<endl;
        }
        ++i;
    }

    for (int j=0; j<10; j++)
    {

    }
    return 0;
}


There are some unnecessary parts in the code, but it should compile under g++. I get some memory leaks... Does anyone have an idea?
 
How are you identifying memory leaks? Some implementations of the STL will cache small blocks of memory. And upon program termination, if those blocks are not used, they are not explicitly deallocated -- which is fine behavior, but will confuse some memory leak detectors.

That said, I notice you forgot to close the FILE* you created... (an fstream would automatically close upon going out of scope)

As an aside... strtok always scares me. Have you considered using stringstream to help with parsing? Or a string processing library, such as Spirit from the Boost libraries?
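
To make the strtok concern concrete: in the posted code the line buffer is `char bufferCustNum[9]`, so `fgets` can return at most 8 characters of a row at a time. The later `strtok(NULL, ",")` calls then run out of tokens and return NULL, which must never be passed to `atof`. The `date[9]` field is also one byte too small for "8/29/2008" plus its terminator. Below is a minimal sketch of a checked C-style parse. The `Row` struct, the `parse_row` helper, and the buffer size are assumptions for illustration, not something from the original post.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical record type; the field names mirror the posted struct.
struct Row {
    char date[16];      // "8/29/2008" needs 10 bytes including the terminator
    double open, high, low, close;
    long volume;
    double adj_close;
};

// Parses one CSV row in place (strtok modifies the buffer it is given).
// Returns true only if all seven fields were present.
bool parse_row(char *line, Row &out) {
    const char *delims = ",\r\n";
    char *tok = std::strtok(line, delims);
    if (tok == NULL || std::strlen(tok) >= sizeof(out.date)) return false;
    std::strcpy(out.date, tok);

    double *fields[] = { &out.open, &out.high, &out.low, &out.close };
    for (double *f : fields) {
        tok = std::strtok(NULL, delims);
        if (tok == NULL) return false;   // no more tokens: stop instead of converting NULL
        *f = std::strtod(tok, NULL);
    }

    tok = std::strtok(NULL, delims);
    if (tok == NULL) return false;
    out.volume = std::strtol(tok, NULL, 10);

    tok = std::strtok(NULL, delims);
    if (tok == NULL) return false;
    out.adj_close = std::strtod(tok, NULL);
    return true;
}
```

A caller would read each row with `fgets` into a buffer large enough for a whole line (256 bytes would be plenty for this data), pass it to `parse_row`, and call `fclose` on the file when done.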
 
Yes, a stringstream and getline() will do the job. Remember, getline() isn't just for reading lines. You can specify a different "line terminator" instead of the default '\n'. getline(mystream, mystring, ',') reads from mystream into mystring, stopping at the next ',' (and discarding the ','). Here's a sample program based on your data file:

Code:
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <string>

using namespace std;

int main ()
{
    ifstream inFile ("csv.dat");
    string line;
    int linenum = 0;
    while (getline (inFile, line))
    {
        linenum++;
        cout << "\nLine #" << linenum << ":" << endl;
        istringstream linestream(line);
        string item;
        int itemnum = 0;
        while (getline (linestream, item, ','))
        {
            itemnum++;
            cout << "Item #" << itemnum << ": " << item << endl;
        }
    }

    return 0;
}

It produces the following output:

Code:
Line #1:
Item #1: 8/29/2008
Item #2: 19.54
Item #3: 19.6
Item #4: 19.28
Item #5: 19.38
Item #6: 11204900
Item #7: 19.38

Line #2:
Item #1: 8/28/2008
Item #2: 19.48
Item #3: 19.76
Item #4: 19.38
Item #5: 19.65
Item #6: 11729500
Item #7: 19.65
(etc.)
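
The sample program above only prints each item as text. To store the values as numbers, each item can be converted after it is extracted. Here is a minimal sketch, assuming C++11 (`std::stod`, `std::stol`); the `Quote` struct and the `parse_quote` helper are hypothetical names, not from the thread.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record; the field names follow the original poster's struct.
struct Quote {
    std::string date;
    double open, high, low, close;
    long volume;
    double adj_close;
};

// Splits one CSV line on ',' and converts the fields.
// Returns false if the line does not have exactly seven fields
// or a numeric field fails to convert (q may be partly filled in that case).
bool parse_quote(const std::string &line, Quote &q) {
    std::istringstream linestream(line);
    std::vector<std::string> items;
    std::string item;
    while (std::getline(linestream, item, ','))
        items.push_back(item);
    if (items.size() != 7) return false;

    try {
        q.date      = items[0];
        q.open      = std::stod(items[1]);
        q.high      = std::stod(items[2]);
        q.low       = std::stod(items[3]);
        q.close     = std::stod(items[4]);
        q.volume    = std::stol(items[5]);
        q.adj_close = std::stod(items[6]);
    } catch (const std::exception &) {
        return false;   // stod/stol throw on non-numeric or out-of-range text
    }
    return true;
}
```

Inside the outer `while (getline (inFile, line))` loop, each successfully parsed `Quote` could then be pushed onto a `std::vector<Quote>`.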
 
Thank you very much. That solved my problem!
 
I've tried the above code and it doesn't seem to work quite the same. I've learned that my .csv file ignores white space and has no line returns. As a result, the last number on the first line and the date on the next line run together, e.g.

19.38
8/28/2008

Is there any way to adjust the file so that it has a line return after each line? Any other suggestions to solve this problem?
 
What character does your file use to separate lines? Tell the getline() that controls the while-loop to use that as the separator. For example, if it's a '|', the beginning of the while-loop would look like this:

Code:
while (getline (inFile, line, '|'))
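
The thread never says which character the file actually used. One common case (an assumption here) is a file whose lines end with a bare carriage return '\r' instead of '\n'. The sketch below wraps the same idea in a hypothetical `read_records` helper that takes the separator as a parameter, so it can be tried with '\r', '|', or anything else.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads records from `in`, using `sep` instead of '\n' as the record terminator.
// The separator itself is discarded, as with the default getline().
std::vector<std::string> read_records(std::istream &in, char sep) {
    std::vector<std::string> records;
    std::string record;
    while (std::getline(in, record, sep))
        records.push_back(record);
    return records;
}
```

Each returned record can then be handed to an `istringstream` and split on ',' exactly as in the earlier sample program.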
 
Thanks! The code does exactly what I'm looking for too.
 
Hi all,

I'm trying to read a text file which has the following format. What I would like to do is:
- Put "25" into a number_data variable, "6" into a number_place variable, and "1" into a timestep variable.
- Separate the year (e.g. 1990), month (e.g. 01), and day (e.g. 01) into three variables
- Handle the mix of numbers and NA, especially in the Place3 data

I tried jtbell's code and it worked for the first three lines. But I have no idea how to write out just what I need, or how to handle the rest of the data file.
Could anyone tell me how to do this? Thank you very much.

Number of data points 25
Number of places 6
Timestep 1 hour
Date Hour Place1 Place2 Place3 Place4 Place5 Place6
1990-01-01 1 25.002 NA NA 16.265 6.231 9.680
1990-01-01 2 24.449 NA NA 16.265 6.231 9.551
1990-01-01 3 24.449 NA NA 16.265 6.231 9.551
1990-01-01 4 24.550 NA NA 16.265 6.231 9.551
1990-01-01 5 24.851 NA NA 16.265 6.130 9.551
1990-01-01 6 25.002 NA NA 16.099 6.130 9.421
1990-01-01 7 25.306 NA 29.540 15.933 6.130 9.421
1990-01-01 8 25.357 NA 29.197 15.933 6.130 9.421
1990-01-01 9 25.357 NA 28.856 15.933 6.029 9.389
1990-01-01 10 25.306 NA 28.477 15.769 6.029 9.260
1990-01-01 11 25.306 NA 28.176 15.769 6.029 9.260
1990-01-01 12 25.103 NA 27.913 15.605 6.029 9.132
1990-01-01 13 24.952 NA 27.651 15.605 6.029 9.132
1990-01-01 14 24.901 NA 27.464 15.605 5.905 9.132
1990-01-01 15 24.851 NA 27.315 15.442 6.004 9.132
1990-01-01 16 24.801 NA 27.240 15.442 5.905 9.003
1990-01-01 17 24.700 NA 27.240 15.442 5.905 9.003
1990-01-01 18 24.550 NA 27.278 15.442 5.905 9.003
1990-01-01 19 24.350 NA NA 15.442 5.905 9.003
1990-01-01 20 24.150 NA NA 15.279 5.905 9.003
1990-01-01 21 23.952 NA NA 15.239 5.806 8.875
1990-01-01 22 23.803 NA NA 15.078 5.806 8.875
1990-01-01 23 23.704 NA NA 14.758 5.806 8.875
1990-01-01 24 23.557 NA NA 14.758 5.806 8.875
1990-01-02 1 23.458 NA NA 14.758 5.684 8.875

Regards,

MT
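
One possible way to approach this format, since the fields are separated by spaces rather than commas, is to let `operator>>` do the splitting. The sketch below is only an illustration under assumptions: the helper names `first_int` and `parse_sample` and the `Sample` struct are hypothetical, and "NA" is represented as NaN, which is a design choice rather than anything the thread prescribes.

```cpp
#include <cmath>     // std::isnan, for callers that check for NA entries
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Returns the first whitespace-separated word in `line` that is a whole
// integer, e.g. 25 for "Number of data points 25"; returns -1 if there is none.
int first_int(const std::string &line) {
    std::istringstream words(line);
    std::string word;
    while (words >> word) {
        std::istringstream conv(word);
        int value;
        char extra;
        if (conv >> value && !(conv >> extra)) return value;
    }
    return -1;
}

// Hypothetical row type for illustration.
struct Sample {
    int year, month, day, hour;
    std::vector<double> places;   // NaN marks an "NA" entry
};

// Parses a data row such as "1990-01-01 7 25.306 NA 29.540 ...".
// Returns false for rows that do not start with a YYYY-MM-DD date and an hour,
// which also rejects the "Date Hour Place1 ..." column-header row.
bool parse_sample(const std::string &line, Sample &s) {
    std::istringstream in(line);
    char dash1, dash2;
    if (!(in >> s.year >> dash1 >> s.month >> dash2 >> s.day >> s.hour))
        return false;
    if (dash1 != '-' || dash2 != '-') return false;

    s.places.clear();
    std::string token;
    while (in >> token) {
        if (token == "NA") {
            s.places.push_back(std::numeric_limits<double>::quiet_NaN());
        } else {
            std::istringstream conv(token);
            double v;
            if (!(conv >> v)) return false;
            s.places.push_back(v);
        }
    }
    return true;
}
```

The first three lines of the file would go through `first_int` to fill number_data, number_place, and timestep; every later line would go through `parse_sample`, and `std::isnan(s.places[i])` tells whether a given place was NA.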
 
