C Program Help: Count & Sort Word Occurrences in File

In summary, the thread concerns a C program that is supposed to count how many times each word occurs in an input file and write the results to a separate output file, ordered from most to least frequent. The program stores each word and its count in a struct, prompts the user for the input and output file names, reads the input with a while loop, and counts occurrences with nested for loops. The poster has not yet sorted the results with qsort, doubts that the compare function is correct, and later reports a segmentation fault.
  • #1
KV305
C program help!

Hey guys, I am having trouble completing this C program. It's supposed to ask the user for a file to read from, read the file, count how many times each word appears, and print the results to a different file (also from user input), arranged from highest to lowest occurrence.


Here's what I have so far:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef struct AWORD
{
char wordname[50];
int count;
}a_word;

a_word nword(char* words)
{
a_word new;
strcpy (new.wordname, words);
return new;
}

int compare (words[i].wordname, words[j].wordname)
{
if (words[i].count < words[j].count)
{
return -1;
}
else if (words[i].count > words[j].count)
{
return 1;
}
else
{
return 0;
}
}

int main(int argc, char *argv[])
{
int i = 0,j = 0,k = 0, m = 0, count = 0,count2 = 0, count3 = 0, wrdcnt = 1,$
char cur, cur2[50], file1[25], file2[25], list[1000][k], *buffer;
FILE *pFile, *pFile2;

printf("Please type the name of the input file:\n ");
scanf("%s",file1);
printf("%s",file1);
pFile = fopen(file1, "r");

printf("Please type the name of the output file:\n ");
scanf("%s",file2);
pFile2 = fopen(file2, "r");

while (cur != EOF)
{
cur = fgetc(pFile);
if (cur != ' ')
{
list[m][k] = cur;
fprintf(pFile2, "%c", list[m][k]);
k++;
if (cur != '\n' && cur != '\0')
{
count2++;
}
if (cur == '\n' || cur == '\0')
{
wrdcnt++;
}
}
else
{
if (cur == ' ')
{
wrdcnt++;
}
m++;
k = 0;
if (cur == '\0')
{
fprintf(pFile2,"%s", &cur);
}
else
{
fprintf(pFile2, "\n");
}
}
}
fclose(pFile);
fclose(pFile2);

pFile2 = fopen(file2, "r");

a_word words[wrdcnt];

for (i = 0; i < wrdcnt; i++)
{
fgets (cur2, 50, pFile2);
words[i] = nword(cur2);
printf("words[%d] is %s", i, words[i].wordname);
}
printf("\n");
count3 = wrdcnt;
for (j = 0; j < count3; j++)
{
buffer = words[j].wordname;
for (i = 0; i < wrdcnt; i++)
{
if (strcmp(buffer, words[i].wordname) == 0)
{
count++;
}
}

words[j].count = count;
count = 0;
}
printf ("\nWord Counter Program Starting...");
printf("\nWord Counter Program Finished. Check %s file\n", file2);

for (j = 0; j < count3; j++)
{
for (i = 0; i < j; i++)
{
if (strcmp(words[j].wordname, words[i].wordname) == 0)
{
same = 0;
break;
}
else
{
same = 1;
}
}
if (same == 1)
{
printf ("%d\t%s", words[j].count, words[j].wordname);
fprintf(pFile2, "%d %s\n", words[j].count, words[j].wordname);

}

}
fclose(pFile2);
return 0;
}




I haven't sorted yet. We are supposed to use qsort, but I don't think my compare function is right. Any help would be greatly appreciated. Thank you.
 
  • #2


You have to create a list of words and a number for each as you have done (AWORD).

You then need to do a bubble sort - going through your list of words swapping over any pair that are the wrong way round - you do that multiple times until there are no swaps - then you know it's done.
So you need to compare a pair of integers - the number of times each word occurs in each pair.

The compare function passed to qsort must take two const void * arguments, because C has no overloading and qsort has to work with any element type.
qsort calls it with pointers to two elements of your array, so inside the function you cast each const void * back to a pointer to the real type and dereference it.
For an array of plain int counts, the declaration would look like:

int compare (const void *count1, const void *count2)
{
//cast each pointer back to the type you want and dereference it - you can do that on-the-fly
// I think you can see where it goes from here
if (*(const int *)count1 == *(const int *)count2) return 0; //etc.
}
 
  • #3


OK, so far I have altered the program and it gives no errors, but when I try to run it, it says
"Segmentation fault"... any ideas?

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef struct AWORD
{
char wordname[50];
int count;
}a_word;

a_word nword(char* words)
{
a_word new;
strcpy (new.wordname, words);
return new;
}

int compare (const void *count1, const void *count2)
{
if ((int)count1==(int)count2) return 0;
if ((int)count1<(int)count2) return -1;
if ((int)count1>(int)count2) return 1;
}

int main(int argc, char *argv[])
{
int i = 0,j = 0,k = 0, m = 0, count = 0,count2 = 0, count3 = 0, wrdcnt = 1,same;
char cur, cur2[50], file1[25], file2[25], list[1000][30], *buffer, sorted[1000];
FILE *pFile, *pFile2;

printf("Please type the name of the input file:\n ");
scanf("%s",file1);
printf("%s",file1);
pFile = fopen(file1, "r");

printf("Please type the name of the output file:\n ");
scanf("%s",file2);
pFile2 = fopen(file2, "r");

while (cur != EOF)
{
cur = fgetc(pFile);
if (cur != ' ')
{
list[m][k] = cur;
fprintf(pFile2, "%c", list[m][k]);
k++;
if (cur != '\n' && cur != '\0')
{
count2++;
}
if (cur == '\n' || cur == '\0')
{
wrdcnt++;
}
}
else
{
if (cur == ' ')
{
wrdcnt++;
}
m++;
k = 0;
if (cur == '\0')
{
fprintf(pFile2,"%s", &cur);
}
else
{
fprintf(pFile2, "\n");
}
}
}
fclose(pFile);
fclose(pFile2);

pFile2 = fopen(file2, "r");

a_word words[wrdcnt];

for (i = 0; i < wrdcnt; i++)
{
fgets (cur2, 50, pFile2);
words[i] = nword(cur2);
printf("words[%d] is %s", i, words[i].wordname);
}
printf("\n");
count3 = wrdcnt;
for (j = 0; j < count3; j++)
{
buffer = words[j].wordname;
for (i = 0; i < wrdcnt; i++)
{
if (strcmp(buffer, words[i].wordname) == 0)
{
count++;
}
}

words[j].count = count;
count = 0;
}
printf ("\nWord Counter Program Starting...");
printf("\nWord Counter Program Finished. Check %s file\n", file2);

for (j = 0; j < count3; j++)
{
for (i = 0; i < j; i++)
{
if (strcmp(words[j].wordname, words[i].wordname) == 0)
{
same = 0;
break;
}
else
{
same = 1;
}
}
if (same == 1)
{
printf ("%d\t%s", words[j].count, words[j].wordname);
fprintf(pFile2, "%d %s\n", words[j].count, words[j].wordname);
}

}
fclose(pFile2);
pFile2 = fopen(file2, "r");
m=0;
while (pFile != NULL)
{
fscanf (pFile2 , "%s" , &sorted[m]);
m++;
}
qsort(sorted,m,sizeof(int),compare);
fclose(pFile2);
pFile2 = fopen(file2, "w");
fprintf(pFile2,"%s", sorted);
return 0;
}
 
  • #4


Are you in the same class as the person who started this thread?

If you're getting a segmentation fault, you are probably attempting to access memory (via a pointer) that your program is not allowed to access. It's time to learn how to use a debugger.
 
  • #5


KV305 said:
when i try to run it it says
"Segmentation fault"... any ideas?

Put extra printf() statements in the program so you can pin down exactly where the segmentation fault is occurring:

"I got to checkpoint #1.
I got to checkpoint #2.
I got to checkpoint #3.
ERROR: Segmentation fault."
 
  • #6


It means you are attempting to read or write to a part of the memory outside your own program. Usually a pointer error (accessing memory location zero because a pointer is null for example).

Try temporarily commenting out sections of your program till you locate the line giving the error - then look carefully at any pointers involved.

Doesn't your debugger automatically stop at error points?
 

1. How does the program count words in a file?

The program reads the input file one character at a time, splits the text at spaces and newlines, and stores each word in an array of structs. It then compares each word with every other word in the array and records the number of matches as that word's count.

2. How does the program handle punctuation and special characters?

As posted, the program does not handle punctuation or special characters. It splits the text only at spaces and newlines, so "word" and "word," are stored and counted as different words. Stripping punctuation and normalizing case before comparing would have to be added.

3. Can the program handle large files?

Not as posted. The code relies on fixed-size buffers such as list[1000][30] and wordname[50], plus a variable-length array sized by the word count, so long words or files with many words will overrun them. Handling large files would require allocating the word array dynamically and growing it as needed.

4. How does the program sort the words in alphabetical order?

As posted, it does not sort at all yet. The assignment calls for ordering the words by occurrence count, highest first, using qsort from <stdlib.h> with a compare function that receives pointers to two array elements. Alphabetical order would use the same qsort call with a compare function based on strcmp.

5. Can the program handle different file formats?

The program reads its input as plain text, so the file extension does not matter as long as the contents are text (for example .txt or .csv). Binary formats such as .doc, or files containing images or other non-text data, will not be processed meaningfully.
