Assistance needed writing a C program

diredragon · May 21, 2017

Homework Statement

My task is to write a program that reads a .txt file into a linked list. Each list element has two parts: a piece of text from the file and a pointer to the next element. Here are the specifications of the problem:
When reading from the file, you should separate the letters from the numbers in your list (a space is the default separator). For example:
If your .txt file contains the following: I am 39 years old and my username is 7yghh67f[]\.
Your code should store the following (the commas here separate the list elements):
I, am, 39, years, old, and, my, username, is, 7, yghh, 67, f, []\

2. Homework Equations
3. The Attempt at a Solution
I hope you understand what I need help with. So far I have written an algorithm that splits the text of the .txt file into blocks of words, not including the spaces.

C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct words {
    char *text;
    struct words *nextp;
} Words;
Words* read(FILE *document, Words *p) {
    int i = 0, j;
    char c;
    Words *new = p;

    new = calloc(1, sizeof(Words));
    while ((c = fgetc(document)) != EOF) {
        if (c == ' ') {
            new->text[i] = '\0';
            i = 0;
            new = calloc(1, sizeof(Words));
            new = new->nextp;
        }
        else {
            new->text[i] = c;
            new->nextp = NULL;
            i++;
        }
    }
    return p;
}

void print(Words *p) {
    while (p) {
        puts(p->text);
        p = p->nextp;
    }
}

int main(int barg, const char *varg[]) {

    FILE *document = NULL;
    Words *headt = NULL;

    document = fopen(varg[1], "r");

    headt = read(document, headt);
    print(headt);

    fclose(document);

}

I named the .txt document . I do not know why I had to use varg[1] instead of the document name, but they said I should. When I run this in Visual Studio it gives me some kind of unresolved external symbol error. What do you think of it so far? Is it a good idea now to create a new structure in which I will write the second part of the problem (the separation of symbols and numbers from letters)? Or could I have done this immediately?

Mark44 · May 21, 2017

diredragon said:

I do not know why I had to use varg[1] instead of the document name, but they said I should.

Because the intent is for you to give the filename on the command line when you run your program, something like this:

Code:

c:\myprog document.txt
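
A minimal sketch of how the filename reaches the program, assuming the conventional parameter names argc and argv (the original post calls them barg and varg). The helper name open_input is hypothetical and only serves the illustration. Element 0 of the argument array is the program name, so the first real argument is element 1.

```c
#include <stdio.h>

/* Hypothetical helper: opens the file named by the first command-line
   argument for reading. Returns NULL if no name was given or if the
   file could not be opened. */
FILE *open_input(int argc, char *argv[]) {
    if (argc < 2) {
        /* argv[0] is the program name; argv[1] would be the filename */
        fprintf(stderr, "usage: %s <file.txt>\n", argc > 0 ? argv[0] : "myprog");
        return NULL;
    }
    FILE *f = fopen(argv[1], "r");
    if (f == NULL)
        perror(argv[1]);   /* for example, the file does not exist */
    return f;
}
```

Calling this at the top of main and returning early on NULL avoids passing a NULL FILE pointer to fgetc when the program is started without an argument.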

diredragon said:

When I run this in Visual Studio it gives me some kind of unresolved external symbol error.

What's the exact error message, including the line number where it occurs? This sounds like a linker error, so probably won't give a line number, but it should indicate what symbol it's having problems with.

SlowThinker · May 21, 2017

diredragon said:

What do you think of it so far? Is it a good idea now to create a new structure in which I will write the second part of the problem (the separation of symbols and numbers from letters)? Or could I have done this immediately?

There are a few issues...
1. It's unfortunate to call a variable "new" because it's a reserved word in C++. Try to find some other name.

2. You allocate a Words, but its text member is still a NULL pointer. It might be best to change its definition to

C:

typedef struct words {
    char text[20];
    struct words *nextp;
} Words;

or, during parsing, make a temporary variable that is much longer, and use

C:

new->text=strdup(temporary);

once a word is finished. Of course you should prevent writing more characters into the variable than it can hold, in both cases.

3. You should set new->nextp=NULL only once, right after the node is allocated (possibly together with new->text=calloc(1, 20) ). Or don't do it at all, since calloc already returns zeroed memory.

4. The exchange where you go from old to new Words is backwards.

C:

            new->text[i] = '\0';
            i = 0;
            //wrong new = calloc(1, sizeof(Words));
            new->nextp = calloc(1, sizeof(Words)); //correct but again, check initialization
            new = new->nextp;

5. Yes I would improve the parser to split the words from numbers. I like to write it along these lines:

C:

typedef enum {PS_START, PS_LETTERS, PS_NUMBERS, PS_SYMBOLS} ParserState;
ParserState classify(char c) {
  if (c==' ') return PS_START;
  if ((c>='0') && (c<='9')) return PS_NUMBERS;
  /* ...and so on for PS_LETTERS; anything left over is a symbol */
  return PS_SYMBOLS;
 }

...
ParserState state=PS_START;
while ((c=fgetc(document))!=EOF)
 {
  ParserState next=classify(c);
  if (state!=next)
    if (state!=PS_START)
     { /* finish the previous word */ }
  state=next;
 }

It will need some work but the general structure should be OK.
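
Points 2, 4 and 5 above can be combined into one sketch. This is only one possible arrangement, under several assumptions that are not from the thread: the names read_words, finish_word and MAX_WORD are hypothetical, the token length is capped at MAX_WORD - 1 characters, any whitespace (not just a space) is treated as a separator via the <ctype.h> functions, and the character is read into an int so that EOF can be told apart from every valid char value.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 64  /* assumed upper bound on one token, not from the thread */

typedef enum { PS_START, PS_LETTERS, PS_NUMBERS, PS_SYMBOLS } ParserState;

typedef struct words {
    char *text;
    struct words *nextp;
} Words;

static ParserState classify(int c) {
    if (isspace(c)) return PS_START;
    if (isdigit(c)) return PS_NUMBERS;
    if (isalpha(c)) return PS_LETTERS;
    return PS_SYMBOLS;
}

/* Copy buf[0..len) into a new node and link it after *tail
   (or make it the head if the list is still empty). */
static void finish_word(Words **head, Words **tail, const char *buf, size_t len) {
    Words *node = calloc(1, sizeof *node);
    if (node == NULL) return;
    node->text = malloc(len + 1);          /* memory for the characters */
    if (node->text == NULL) { free(node); return; }
    memcpy(node->text, buf, len);
    node->text[len] = '\0';
    node->nextp = NULL;
    if (*tail) (*tail)->nextp = node;      /* old node points to the new one */
    else *head = node;
    *tail = node;
}

Words *read_words(FILE *document) {
    Words *head = NULL, *tail = NULL;
    ParserState state = PS_START;
    char buf[MAX_WORD];                    /* temporary buffer for one token */
    size_t len = 0;
    int c;                                 /* int, not char, so EOF is distinct */

    while ((c = fgetc(document)) != EOF) {
        ParserState next = classify(c);
        if (next != state && state != PS_START) {
            finish_word(&head, &tail, buf, len);   /* character class changed */
            len = 0;
        }
        if (next != PS_START && len + 1 < MAX_WORD)
            buf[len++] = (char)c;          /* characters past the cap are dropped */
        state = next;
    }
    if (state != PS_START)                 /* the file may end inside a token */
        finish_word(&head, &tail, buf, len);
    return head;
}
```

With this arrangement an input of `7yghh67f[]\` yields the elements 7, yghh, 67, f and []\, because a node is closed every time the character class changes.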

diredragon · May 22, 2017

I implemented the things you said and more or less finished the code, which builds successfully in Visual Studio but can't run. Here is the code:

C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { PS_START, PS_LETTERS, PS_NUMBERS, PS_SYMBOLS } ParserState;

typedef struct words {
    char *text;
    struct words *nextp;
} Words;

ParserState classify(char c) {
    if (c == ' ')
        return PS_START;
    if ((c >= 'a') && (c <= 'z'))
        return PS_LETTERS;
    if ((c >= '0') && (c <= '9'))
        return PS_NUMBERS;
    else
        return PS_SYMBOLS;
}

Words* read(FILE *document, Words *p) {
    ParserState state = PS_START, next;
    char c; int i = 1, j = 0;
    Words *newp, *oldp;

    while ((c = fgetc(document)) != EOF) {
        next = classify(c);
        if (state != next) {
            if (state != PS_START) {
                newp->text[i] = '\0';
                newp->nextp = NULL;                         //This part says to me that i have a different char but
                if (p == NULL) {                            //that I am not at the beginning of the new list so i need to finish with it
                    p = newp;
                }
                else
                    oldp->nextp = newp;
                oldp = newp;
            }                                       
            if (c != ' ') {
                newp = calloc(1, sizeof(Words));    //Here since my char is of different type and i'm
                newp->text[0] = c;                  //at the beggining of the new text block i just allocate and add that char
                i = 1;                              //to the first place.
                state = next;
                continue;
            }
            else {
                state = next;
                continue;
            }
        }                                         
        //at this point i know my char is of the same type and i just add it to the rest of them char's
        newp->text[i] = c;
        i++;
    }
    return p;
}

void print(Words *p) {
    while (p) {
        puts(p->text);
        p = p->nextp;
    }
}

int main(int barg, const char *varg[]) {

    FILE *document = NULL;
    Words *head = NULL;
    int t;

    document = fopen(varg[1], "r");

    head = read(document, head);
    printf("Enter something\n");
    scanf("%d\n", t);
    print(head);

    fclose(document);

}

I annotated the important parts and I don't get why it doesn't run. It seems to work fine when I go through it by example with pen and paper. Maybe there's something wrong when I move pointers to the next element? Or something with the reasoning I used?

SlowThinker · May 22, 2017

Does it compile? What does it do when you run it? Can you step through the program? In Visual Studio, go to the menu, choose Debug, then Start Debugging, and use the F10 and F11 keys to step through the code and see if it works as expected. When you hover the mouse over a variable, it will show its current value.

C:

            if (c != ' ') {
                newp = calloc(1, sizeof(Words));    //Here since my char is of different type and i'm
                newp->text[0] = c;                  //at the beggining of the new text block i just allocate and add that char
                i = 1;                              //to the first place.
                state = next;
                continue;

I would feel better if you could change the first line of this a bit, but the important thing is that you still need to allocate memory for the text:

C:

            if (next != PS_START) {
                newp = calloc(1, sizeof(Words));
                newp->text=calloc(1, 30); //This line is super important. 30=maximum length of a word
                newp->text[0] = c;
                i = 1;
                state = next;
                continue;

As I said, there are various ways to deal with the hard-coded limit of 30, but you must allocate the memory somewhere.

The juggling you perform with newp, oldp and p seems a bit more complicated than it needs to be, but it might work fine.
It's best to use the debugger to step through this and see if it works the way you planned.

diredragon · May 22, 2017

Ooh, how foolish of me. I allocated the space for the pointer and the text block but not the memory for the chars inside the text block. It's all fixed now and I can continue solving the problem. Many thanks!

diredragon · May 23, 2017

C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { PS_START, PS_LETTERS, PS_NUMBERSS } ParserState;

typedef struct raw {
    char *text;
    struct raw *nextp, *befp;
} Raw;

ParserState classify(char c) {
    if (c == ' ')
        return PS_START;
    if ((c >= 'A') && (c <= 'Z'))
        return PS_LETTERS;
    else
        return PS_NUMBERSS;
}

Raw* readraw(FILE *document, Raw *p) {
    ParserState state = PS_START, next;
    char c; int i = 1, j = 0;
    Raw *newp, *oldp;

    while ((c = fgetc(document)) != EOF) {
        next = classify(c);
        if (state != next) {
            if (state != PS_START) {
                newp->text = realloc(newp->text, (2 * strlen(newp->text)) * sizeof(char));
                newp->text[i] = '\0';
                if (p == NULL) {
                    p = newp;
                }
                else {
                    oldp->nextp = newp;
                    oldp->befp = oldp;   //This part is added for double linked lists
                }
                oldp = newp;
            }
            if (c != ' ') {
                newp = calloc(1, sizeof(Raw));
                newp->text = realloc(newp->text, 1 * sizeof(char));
                newp->text[0] = c;
                i = 1;
                state = next;
                continue;
            }
            else {
                state = next;
                continue;
            }
        }
        newp->text = realloc(newp->text, (2 * strlen(newp->text)) * sizeof(char));
        newp->text[i] = c;
        i++;
    }
    return p;
}

Could you check whether this part of the code, which I wrote for the singly linked list, is OK for a doubly linked list once the commented part is added? I'm supposed to use doubly linked lists and I tried adding a pointer *befp to the structure Raw (which was Words in our previous code). It runs fine but could be wrong. Does it use double links?

SlowThinker · May 23, 2017

diredragon said:

Could you check whether this part of the code, which I wrote for the singly linked list, is OK for a doubly linked list once the commented part is added? I'm supposed to use doubly linked lists and I tried adding a pointer *befp to the structure Raw (which was Words in our previous code). It runs fine but could be wrong. Does it use double links?

Obviously it should be

C:

                   newp->befp = oldp;   //This part is added for double linked lists
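
As a sketch of what "using double links" means here: every new node should get its befp set to the previous last node, and that previous node should get its nextp set to the new one. The helper name append_raw and the head/tail pair are hypothetical and not part of the code in the thread; the Raw struct is the one from the post above.

```c
#include <stdlib.h>
#include <string.h>

typedef struct raw {
    char *text;
    struct raw *nextp, *befp;
} Raw;

/* Hypothetical helper: append a copy of text to the list that currently
   ends at *tail. Returns the new node, or NULL if allocation failed. */
Raw *append_raw(Raw **head, Raw **tail, const char *text) {
    Raw *node = calloc(1, sizeof *node);
    if (node == NULL) return NULL;
    node->text = malloc(strlen(text) + 1);
    if (node->text == NULL) { free(node); return NULL; }
    strcpy(node->text, text);
    node->nextp = NULL;                    /* the new node is the last one */
    node->befp = *tail;                    /* back link: new node -> old tail */
    if (*tail) (*tail)->nextp = node;      /* forward link: old tail -> new node */
    else *head = node;                     /* first node: it is also the head */
    *tail = node;
    return node;
}
```

A quick check that both directions are linked is to walk the list forward from the head with nextp and then backward from the tail with befp; both walks should visit the same nodes.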

Also you may want to classify both small and capital letters as letters.

Last, the realloc at the end would work just fine with 2+strlen instead of 2*strlen.
The doubling is used when you keep track of the actual size, and only realloc when needed. You'd have to remember how much you allocated, though.

C:

        int newp_size=1; //also set to 1 when allocating new 'newp->text'
        ...
        if (i+2>newp_size) { //reserve space for at least 1 character and \0 so if i==10 we need at least size==12
          newp_size*=2;
          newp->text = realloc(newp->text, newp_size * sizeof(char));
         }

This functionality is starting to obscure the logic of the program, so you may want to keep it the way you have.

This is what libraries are for. You could have e.g.

C:

typedef struct
 {
  char *text;
  int i;
  int text_size;
 } BetterString;

and various functions to work with it. However, most libraries are written for C++, where the code can be better organized.
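
One possible shape for such functions, as a sketch only: the field names follow the struct above (with size_t instead of int, which is an assumption), while the function names bs_push and bs_free and the starting capacity of 8 bytes are hypothetical. The buffer doubles only when the next character plus the terminating '\0' would no longer fit.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *text;        /* NUL-terminated once non-NULL */
    size_t i;           /* characters stored, not counting '\0' */
    size_t text_size;   /* bytes currently allocated for text */
} BetterString;

/* Append one character, doubling the buffer when it is full.
   Returns 1 on success, 0 if memory could not be obtained. */
int bs_push(BetterString *s, char c) {
    if (s->i + 2 > s->text_size) {             /* need room for c and '\0' */
        size_t new_size = s->text_size ? s->text_size * 2 : 8;
        char *grown = realloc(s->text, new_size);
        if (grown == NULL) return 0;           /* the old buffer is still valid */
        s->text = grown;
        s->text_size = new_size;
    }
    s->text[s->i++] = c;
    s->text[s->i] = '\0';
    return 1;
}

/* Release the buffer and reset the string to its empty state. */
void bs_free(BetterString *s) {
    free(s->text);
    s->text = NULL;
    s->i = 0;
    s->text_size = 0;
}
```

A string starts out as `BetterString s = { NULL, 0, 0 };` and is then filled with repeated calls to bs_push, which keeps the realloc bookkeeping out of the parsing loop.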
