How to Sort Word Frequencies in a C Program?

In summary: The conversation discusses the development of a C program that counts the frequency of words in a large text file. The program must count the words, ask for the names of the input and output files, and order the output file by word frequency. The attempt at a solution includes code for reading and counting the words; assistance is requested for sorting the array of structs with qsort() and for writing the sorted data to the specified file. Remember to close each file when you are done using it.
  • #1
lynk26

Homework Statement



Develop a C program that counts how many times each word appears in a large text file. Your program must read words from a file and output the number of times each word of the file appears.
1. Implement an algorithm to count how many times each word appears in a large text file.
2. Ask the name of the file to count the words
3. The output of the program is a file. The name of the file must be asked by the program
4. The output file should be ordered by frequency of the words

Homework Equations



N/A

The Attempt at a Solution



Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>   /* bool, true, false */

typedef struct WORD
{
    char wordname[50];
    int count;
}word_t;

word_t new_word(char* words)
{
    word_t new_w;
    strcpy (new_w.wordname, words);
    return new_w;
}

int main(int argc, char *argv[])
{
    FILE *fp;
    FILE* pInput;
    FILE* pOutput;
    char buffer[50];
    int buffer1 = 0;   /* int, not char, so it can hold EOF; initialized before the first loop test */
    char filename[50];
    char output[50];
    char array1[1000][30];
    char* tempword;
    int i = 0;
    int j = 0;
    int numcount = 0;
    int i1 = 0;
    int j1 = 0;
    int tempcount;
    int char_count = 0;
    int word_count = 1;
    bool check = false;
    bool inlist = false;   /* stays false for j == 0, where the inner loop never runs */

    printf("Enter the name of the file to open (include the .txt): ");
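    fgets(filename, sizeof filename, stdin);    /* gets() cannot limit input length */
    filename[strcspn(filename, "\n")] = '\0';   /* strip the trailing newline */
    printf("Enter the name of the output file (include the .txt): ");
    fgets(output, sizeof output, stdin);
    output[strcspn(output, "\n")] = '\0';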

    pInput = fopen(filename, "r");
    pOutput = fopen("temporary.txt","w");
    fp = fopen(output, "w");
    system("cls");

    while (buffer1 != EOF)
    {
        buffer1 = fgetc(pInput);
        if (buffer1 != ' ')
        {
            array1[j1][i1] = buffer1;
            fprintf(pOutput, "%c", array1[j1][i1]);
            i1++;
            if (buffer1 != '\n' && buffer1 != '\0')
            {
                char_count++;
            }
            if (buffer1 == '\n' || buffer1 == '\0')
            {
                word_count++;
            }
        }
        else
        {
            if (buffer1 == ' ')
            {
                word_count++;
            }
            j1++;
            i1 = 0;
            if (buffer1 == '\0')
                fputc('\0', pOutput);   /* a char is not a valid format string */
            else
                fprintf(pOutput, "\n");
        }
    }
    fclose(pInput);
    fclose(pOutput);

    fp = fopen("temporary.txt", "r");

    word_t words[word_count];

    for (i = 0; i < word_count; i++)
    {
        fgets (buffer, 50, fp);
        words[i] = new_word(buffer);
        printf("words[%d] is %s", i, words[i].wordname);
    }
    fclose (fp);
    printf("\n");
    tempcount = word_count;
    for (j = 0; j < tempcount; j++)
    {
        tempword = words[j].wordname;
        for (i = 0; i < word_count; i++)
        {
            if (strcmp(tempword, words[i].wordname) == 0)
            {
                numcount++;
            }
        }

        words[j].count = numcount;
        numcount = 0;
    }
    printf ("\nPrinting results...\n\n");
    printf("\n#\tWord\n");
    printf("-\t----\t\n");

    for (j = 0; j < tempcount; j++)
    {
        for (i = 0; i < j; i++)
        {
            if (strcmp(words[j].wordname, words[i].wordname) == 0)
            {
                inlist = true;
                break;
            }
            else
            {
                inlist = false;
            }
        }
        if (inlist == false)
        {
            printf ("%d\t%s", words[j].count, words[j].wordname);
            fprintf(fp, "%d\t%s", words[j].count, words[j].wordname);
        }
    }
    printf("\n");
    return 0;
}

I need help with the sorting of the structs, and outputting the sorted structs to a file.

Here's the sample text file that I've been working with:

in the beginning god created the heaven and the earth
and the Earth was without form and void and darkness was upon the face
of the deep and the spirit of god moved upon the face of the waters
and god said let there be light and there was light
and god saw the light that it was good and god divided the light from
the darkness
and god called the light day and the darkness he called night and the
evening and the

I can read the file, can count the number of times each word appears, but I'm not quite sure how to use the qsort function to sort an array of structs, nor does the output want to go into the specified file. I've looked all over the internet but still can't seem to understand how it works. Can anyone show me how to at least get the sorting done?
 
  • #2
For the sorting part, qsort() is set up to sort an arbitrary array. One of the parameters is a pointer to a comparison function that you supply. The comparison function will need to decide whether the frequency of the word in words[i].wordname is "less than" the frequency of the word in words[j].wordname. At this level, all you need to do is compare words[i].count with words[j].count. The word with the higher frequency should go before the word with the lower frequency, I would think.

Once you have the array sorted by frequency, print (using fprintf) the words and their frequencies to the output file.
 

1. How do I sort an array of structs in C?

To sort an array of structs in C, you can use the qsort() function from the standard library. It takes four arguments: a pointer to the first element of the array, the number of elements, the size of each element in bytes, and a pointer to a comparison function. The comparison function receives two const void * pointers to elements and returns a negative value if the first element should come before the second, a positive value if it should come after, and 0 if the two are equivalent. With a comparison function that returns a negative value when the first element is smaller, the array ends up in ascending order; reversing the sign gives descending order.

2. Can I sort an array of structs based on a specific field?

Yes, you can specify which field to use for sorting in the comparison function. For example, if you have a struct Person with fields name and age, you can sort the array of Person structs based on the age field by comparing the age values in the comparison function.

3. How do I handle sorting if my struct contains pointers?

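If your struct contains pointers, the comparison function must dereference them and compare the values they point to, for example with strcmp() for a char * member; comparing the pointers themselves only compares addresses. qsort() moves each struct as a block of bytes, so pointer members are copied along with it and the data they point to stays where it is. If you instead sort an array of pointers to structs, qsort() rearranges only the pointers, and the comparison function receives a pointer to each pointer. In either case, take care with memory management: the pointed-to data must remain valid for as long as the sorted array is in use.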

4. Can I use a custom sorting algorithm instead of qsort()?

Yes, you can use a custom sorting algorithm to sort an array of structs. However, you will need to write the algorithm yourself and make sure it can handle structs as elements. This may be more time-consuming and error-prone compared to using the built-in qsort() function.

5. Do I need to include any additional libraries for sorting array structs?

No, you do not need any additional libraries to sort an array of structs in C. The qsort() function is part of the C standard library and is declared in the header <stdlib.h>, so you only need to include that header in your code.
