C++ help: my code reads from small files but not large ones

In summary, the poster's code reads a CSV file containing date, time, and wind speed data. It works with smaller files, but with a file of about 6,000 KB it misses every second date and time entry. The poster bisected the file (half of it failed, a quarter worked), and repliers suggested checking the data for anomalies and using a debugger. An attempt to restructure the per-line processing did not solve the issue.
  • #1
davidsmith315
Hello, I'm trying to write part of a program that reads a CSV file containing the date and time, and the wind speed at that time, then produces various outputs.
When I copy and paste 1000 entries into a CSV file and run it, it works great, but if I use the file that's about 6,000 KB, it misses out every second date and time part. Could anyone help me with this?
I'm a complete novice to programming; this is part of my Electrical Engineering course.

Here's my code:
Code:
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <string>
#include <ctime>
#include <cstdlib>   // atof, rand, RAND_MAX
#include <FileHC.h>

using namespace std;

int main ()
{
    ofstream outFile("C:\\WindSpeedAnswer.csv");
    if (!outFile) {
        cout << "error opening file" << endl;
        return -1;
    }

    ifstream inFile("C:\\Copy of Kirkwall_23_hm_proc15.csv");
    string line;
    int linenum = 0;
    if (inFile.is_open())
    {
        double randomMax;
        double randomMin;
        cout << "would you like to add a random variance, enter 'yes' if you do " << endl;
        string Question;
        cin >> Question;
        string answerYes = "yes";
        if (Question == answerYes) {
            cout << "enter the max value of random variance " << endl;
            cin >> randomMax;
            randomMax = randomMax - 1;

            cout << "enter the min value of random variance " << endl;
            cin >> randomMin;
            randomMin = randomMin - 1;
        }

        // Read the input file one line at a time
        while (getline(inFile, line))
        {
            linenum++;
            // cout << "\nLine #" << linenum << ":" << endl;
            istringstream linestream(line);
            string item;
            int itemnum = 0;

            // Split the line into comma-separated items
            while (getline(linestream, item, ','))
            {
                itemnum++;
                double WindSpeed = 0;
                string DateAndTime;

                // cout << "Item #" << itemnum << ": " << item << endl;
                // Odd-numbered items are treated as the date and time
                if (itemnum % 2 == 1) {
                    DateAndTime = item;

                    string replace = "-99999";
                    while (DateAndTime == replace) {
                        DateAndTime = "Data Missing";
                    }

                    cout << "at " << DateAndTime << endl;
                    outFile << DateAndTime << ",";
                }

                // Even-numbered items are treated as the wind speed
                if (itemnum % 2 == 0) {
                    WindSpeed = atof(item.c_str());
                    if (WindSpeed == -99999) {
                        WindSpeed = 9;
                    }
                    if (Question == answerYes) {
                        for (int index = 0; index < itemnum; index++) {
                            double X = randomMax + rand() * (randomMax - randomMin) / RAND_MAX;
                            if (randomMin < X < randomMax) {
                                WindSpeed = X;
                            }
                        }
                    }
                    cout << "the wind speed is " << WindSpeed << "," << endl;

                    outFile << WindSpeed << "," << endl;
                    // outFile.write(WindSpeed);
                }
            }
        }

        inFile.close();
    }
    else cout << "Unable to open file";

    outFile.close();

    char stopchar;
    cin >> stopchar;
    return 0;
}
 
  • #2
Here's something to try: See if it still doesn't work when you only take the first 100 lines or so of the 6,000kb file. If it still doesn't work, check the file for errors. It's much easier to check a 100-line file, and it's quite possible that there are a few thingies that make your program go weird.
 
  • #3
Re Hobin:
Thanks for the reply.
It works great if I take the first 100 lines, or 1000 lines. It's only when I run the 6,000 KB file that it doesn't work.
 
  • #4
davidsmith315 said:
It works great if I take the first 100 lines, or 1000 lines. It's only when I run the 6,000 KB file that it doesn't work.

That's odd. To be perfectly honest, I can't find anything in your code that would make it stop working at some arbitrary amount of data - have you tried running your code on a different machine? Try splitting the file in the middle (if that's possible), if it works, take 3/4 of the file, if it doesn't, take 1/4 - repeat until you find the boundary where it stops working.
 
  • #5
Sounds like you have a memory leak somewhere, for example objects are being created inside the loop, but never deleted.

For better readability, I would put the treatment of each complete line (everything inside the while(getline) loop) into a subroutine.
 
  • #6
Thanks for the replies. I tried using half the input file and it failed, but using 1/4 of it worked perfectly.

Re M Quck: "For better readability, I would put the treatment of each complete line (everything inside the while(getline) loop) into a subroutine"
Sorry to ask but how would I go about doing this?
I tried this:
if (itemnum%2 == 1) {
    DateAndTime = item;
    string replace = "-99999";
    while (DateAndTime == replace) {
        DateAndTime = "Data Missing";
    }
    cout << "at " << DateAndTime << endl;
}
if (itemnum%2 == 0) {
    WindSpeed = atof(item.c_str());
    if (WindSpeed == -99999) {
        WindSpeed = 9;
    }
    if (Question == answerYes) {
        for (int index = 0; index < itemnum; index++) {
            double X = randomMax + rand() * (randomMax - randomMin) / RAND_MAX;
            if (randomMin < X < randomMax) {
                WindSpeed = X;
            }
        }
    }

    cout << "the wind speed is " << WindSpeed << "," << endl;

    outFile << DateAndTime << "," << WindSpeed << "," << endl;
}

It nearly worked, except it left out the date and time part: the date and time column was blank, but the WindSpeed column was output perfectly.
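A possible explanation for the blank date column, assuming the snippet above still sits inside the inner item loop of the first post: DateAndTime is declared inside that loop, so it starts out as a new empty string on every pass, and the date stored on the odd-numbered pass is gone by the time the wind speed is written on the even-numbered pass. The sketch below shows one way to move the per-line work into a subroutine so that both values stay in scope together. The name processLine and its signature are hypothetical, the random variance step is left out, and it assumes each line has the form date-time,wind-speed.

```cpp
#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): handles one CSV line
// of the form "date-time,wind-speed" and writes the cleaned pair to `out`.
// Returns false when the line does not contain both fields.
bool processLine(const std::string& line, std::ostream& out)
{
    std::istringstream linestream(line);
    std::string dateAndTime;
    std::string speedText;

    // Both variables are declared once per line, outside any per-item
    // loop, so the date is still available when the wind speed is written.
    if (!std::getline(linestream, dateAndTime, ',') ||
        !std::getline(linestream, speedText, ',')) {
        return false;
    }

    if (dateAndTime == "-99999") {
        dateAndTime = "Data Missing";
    }

    double windSpeed = std::atof(speedText.c_str());
    if (windSpeed == -99999) {
        windSpeed = 9;   // same placeholder value as the posted code
    }

    // Same output layout as the posted code: date, speed, trailing comma.
    out << dateAndTime << "," << windSpeed << "," << '\n';
    return true;
}
```

In the main loop this would be called as processLine(line, outFile) for each line returned by getline(inFile, line). A false return value marks a line that did not have both fields, which is worth printing together with linenum.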
 
  • #7
davidsmith315 said:
Thanks for the replies. I tried using half the input file and it failed, but using 1/4 of it worked perfectly.

Very good, now try (3/4)/2 part of the file. :wink:

I hope you see where I'm going with this. By cutting what's left in half every time, you could theoretically very quickly go to the point where you come to the specific line where your file fails to work properly. Going to that extreme is probably unnecessary, but as you 'zoom in' on the part where it fails to work, take a very good look at possible anomalies in the file.
 
  • #8
Couldn't you use a source-level debugger to see where the failure occurs? If you have Windows, you can get Microsoft Visual C++ Express for free, and it includes a source-level debugger.
 
  • #9
^That, too. Alternatively, if you use GCC, you can download and use GDB.
 
  • #10
Whenever this happens for me, I just use GDB and run the code with the proper arguments and print the stack trace to see where the fault occurred.

Doing what the previous user suggested above in terms of essentially binary searching for the line where the code stops at is helpful but won't solve your problem if the issue is a memory leak. That style of debugging will only be helpful if that's the line of input error.

GDB will tell you if the input is faulty or if you have a memory leak. Also consider running your executable under Valgrind's memcheck tool to see if you have memory leaks.
 

1. Why is my C++ code only reading small files but not large ones?

There could be several reasons for this issue. One possibility is that your code has a memory limitation and can only process a certain amount of data at a time. Another possibility is that your code is encountering errors or bugs when trying to read larger files. It is important to check for any error messages or debug your code to find the root cause of the issue.

2. How can I increase the memory limit for my C++ code?

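Standard C++ does not impose a fixed memory limit on a program; the available memory is set by the operating system and the build (for example, 32-bit versus 64-bit). Containers such as std::string and std::vector already allocate dynamically and grow as needed. For large files, the more reliable approach is to process the data one line at a time instead of holding the whole file in memory. If you do allocate memory manually with new, release it with delete to prevent memory leaks.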

3. Are there any specific libraries or functions I should use for reading large files in C++?

The standard library already covers this. std::ifstream, from the <fstream> header, reads a file as a stream, and std::getline lets you process it one line at a time. The C-style fopen() and fread() functions are an alternative. Choose whichever fits your needs; for a line-oriented CSV file, std::ifstream with std::getline is usually sufficient.

4. How can I optimize my code for reading large files in C++?

One way to optimize your code for reading large files in C++ is to use efficient algorithms and data structures. For example, using a hash table or binary search tree can help with faster data retrieval. Additionally, minimizing the number of file reads and writes can also improve performance.

5. Could the issue be with the data in my large file rather than my C++ code?

Yes, it is possible that the issue could be with the data in your large file. It is important to thoroughly check the formatting and structure of the file to ensure it is compatible with your code. You may also need to handle any potential errors or exceptions that could arise from the data in the file.
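One way to check the structure of a large file without reading it by eye is to count the items on every line and list the lines that do not match. The sketch below does that. findMalformedLines is a hypothetical name, and the expected count of two items per line (date and time, then wind speed) is an assumption based on the file described in this thread.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical diagnostic (not part of the original program): returns the
// 1-based numbers of all lines whose item count differs from expectedItems.
// Items are split with getline on ',' the same way the posted loop does,
// so a trailing comma does not count as an extra item.
std::vector<std::size_t> findMalformedLines(std::istream& in,
                                            std::size_t expectedItems)
{
    std::vector<std::size_t> badLines;
    std::string line;
    std::size_t lineNumber = 0;

    while (std::getline(in, line)) {
        ++lineNumber;
        std::istringstream linestream(line);
        std::string item;
        std::size_t items = 0;
        while (std::getline(linestream, item, ',')) {
            ++items;
        }
        if (items != expectedItems) {
            badLines.push_back(lineNumber);
        }
    }
    return badLines;
}
```

Calling findMalformedLines(inFile, 2) on the large file and printing the result shows where to start looking; an empty result suggests the item counts are not the problem.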
