
Assistance needed writing a C program

  1. May 21, 2017 #1

    diredragon

    Gold Member

    1. The problem statement, all variables and given/known data
    My task is to write a program that reads a .txt file into a linked list. Each list element has two parts: a piece of the text from the file and a pointer to the next element. Here are the specifications of the problem:
    When reading from the file, you should separate the letters from the numbers in your list (space is the default separator). For example:
    If your .txt file contains the following: I am 39 years old and my username is 7yghh67f[]\.
    Your code should store (commas here separate the list elements):
    I, am, 39, years, old, and, my, username, is, 7, yghh, 67, f, []\

    2. Relevant equations
    3. The attempt at a solution

    I hope you understand what I need help with. So far I have written an algorithm that separates the text of the .txt file into blocks of words, not including spaces.
    Code (C):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>


    typedef struct words {
        char *text;
        struct words *nextp;
    } Words;



    Words* read(FILE *document, Words *p) {
        int i = 0, j;
        char c;
        Words *new = p;

        new = calloc(1, sizeof(Words));
        while ((c = fgetc(document)) != EOF) {
            if (c == ' ') {
                new->text[i] = '\0';
                i = 0;
                new = calloc(1, sizeof(Words));
                new = new->nextp;
            }
            else {
                new->text[i] = c;
                new->nextp = NULL;
                i++;
            }
        }
        return p;
    }

    void print(Words *p) {
        while (p) {
            puts(p->text);
            p = p->nextp;
        }
    }

    int main(int barg, const char *varg[]) {

        FILE *document = NULL;
        Words *headt = NULL;

        document = fopen(varg[1], "r");

        headt = read(document, headt);
        print(headt);

        fclose(document);

    }
     
    I named the .txt document . I do not know why I had to use varg[1] instead of the document name, but they said I should. When I run this in Visual Studio, it gives me some kind of unresolved external symbol error. What do you think of it so far? Is it a good idea now to create a new structure in which I will write the second part of the problem (the separation of symbols and numbers from letters)? Or could I have done this immediately?
     
  3. May 21, 2017 #2

    Mark44

    Staff: Mentor

    Because the intent is for you to give the filename on the command line when you run your program, something like this:
    Code (Text):
    c:\myprog document.txt
    What's the exact error message, including the line number where it occurs? This sounds like a linker error, so probably won't give a line number, but it should indicate what symbol it's having problems with.
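    A minimal sketch of reading the filename from the command line, assuming the program expects exactly one argument. The helper name open_input is hypothetical and not part of the original program; it only shows the checks that are worth doing before using varg[1].

```c
#include <stdio.h>

/* Hypothetical helper: opens the file named by the first command-line
   argument for reading. Returns NULL if no argument was given or the
   file could not be opened. */
FILE *open_input(int argc, const char *argv[]) {
    if (argc < 2) {
        /* argv[1] does not exist in this case, so it must not be read */
        fprintf(stderr, "usage: %s <file.txt>\n", argc > 0 ? argv[0] : "myprog");
        return NULL;
    }
    FILE *f = fopen(argv[1], "r");
    if (f == NULL)
        perror(argv[1]);   /* reports why the open failed */
    return f;
}
```

    In main this would be called as open_input(barg, varg), and the program should stop if it returns NULL instead of passing the pointer on to read or fclose.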
     
  4. May 21, 2017 #3
    There are a few issues...
    1. It's unfortunate to call a variable "new" because it's a reserved word in C++. Try to find some other name.

    2. You allocate Words, but its text member is still a NULL pointer. It might be best to change its definition to
    Code (C):

    typedef struct words {
        char text[20];
        struct words *nextp;
    } Words;
     
    or, during parsing, make a temporary variable that is much longer, and use
    Code (C):
    new->text=strdup(temporary);
    once a word is finished. Of course you should prevent writing more characters into the variable than it can hold, in both cases.

    3. You should set new->nextp=NULL only once, right after it's allocated (possibly together with new->text=calloc(1, 20) ). Or don't do it at all, since calloc already returns zeroed memory.

    4. The exchange where you go from old to new Words is backwards.
    Code (C):

                new->text[i] = '\0';
                i = 0;
                //wrong new = calloc(1, sizeof(Words));
                new->nextp = calloc(1, sizeof(Words)); //correct but again, check initialization
                new = new->nextp;
     
    5. Yes I would improve the parser to split the words from numbers. I like to write it along these lines:
    Code (C):

    typedef enum {PS_START, PS_LETTERS, PS_NUMBERS, PS_SYMBOLS} ParserState;
    ParserState classify(char c) {
      if (c==' ') return PS_START;
      if ((c>='0') && (c<='9')) return PS_NUMBERS;
      if (((c>='a') && (c<='z')) || ((c>='A') && (c<='Z'))) return PS_LETTERS;
      return PS_SYMBOLS;
     }

    ...
    ParserState state=PS_START;
    while ((c=fgetc(document))!=EOF)
     {
      ParserState next=classify(c);
      if (state!=next)
        if (state!=PS_START)
         { /* finish the previous word */ }
      state=next;
     }
     
    It will need some work but the general structure should be OK.
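    To make the structure above concrete, here is one possible sketch of the whole state machine. It is only an illustration under a few assumptions: it scans a string instead of calling fgetc so it is easy to try out, it uses the <ctype.h> functions in place of the range comparisons, and the names tokenize and append_run are made up for this example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef enum { PS_START, PS_LETTERS, PS_NUMBERS, PS_SYMBOLS } ParserState;

typedef struct words {
    char *text;
    struct words *nextp;
} Words;

ParserState classify(char c) {
    unsigned char u = (unsigned char)c;   /* ctype functions need a non-negative value */
    if (isspace(u)) return PS_START;
    if (isalpha(u)) return PS_LETTERS;
    if (isdigit(u)) return PS_NUMBERS;
    return PS_SYMBOLS;
}

/* Hypothetical helper: copy the run s[0..len) into a new node and link it
   at the end of the list. Returns 1 on success, 0 if allocation failed. */
int append_run(Words **headp, Words **tailp, const char *s, size_t len) {
    Words *node = calloc(1, sizeof *node);
    if (node == NULL) return 0;
    node->text = malloc(len + 1);         /* the text needs its own memory */
    if (node->text == NULL) { free(node); return 0; }
    memcpy(node->text, s, len);
    node->text[len] = '\0';
    if (*tailp != NULL) (*tailp)->nextp = node;
    else *headp = node;
    *tailp = node;
    return 1;
}

/* Split s into runs of letters, digits and symbols; spaces only separate. */
Words *tokenize(const char *s) {
    Words *head = NULL, *tail = NULL;
    ParserState state = PS_START;
    size_t start = 0, i;
    for (i = 0; ; i++) {
        /* the end of the string is treated like a separator */
        ParserState next = s[i] ? classify(s[i]) : PS_START;
        if (next != state) {
            if (state != PS_START &&      /* finish the previous word */
                !append_run(&head, &tail, s + start, i - start))
                break;
            start = i;
            state = next;
        }
        if (s[i] == '\0') break;
    }
    return head;
}
```

    With the input "is 7yghh67f[]\" this produces the elements is, 7, yghh, 67, f and []\. When the characters come from fgetc instead, the result should be stored in an int rather than a char, so that EOF can be told apart from a valid character.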
     
  5. May 22, 2017 #4

    diredragon

    Gold Member

    I implemented the things you said and more or less finished the code, which builds successfully in Visual Studio but won't run. Here is the code:
    Code (C):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef enum { PS_START, PS_LETTERS, PS_NUMBERS, PS_SYMBOLS } ParserState;

    typedef struct words {
        char *text;
        struct words *nextp;
    } Words;

    ParserState classify(char c) {
        if (c == ' ')
            return PS_START;
        if ((c >= 'a') && (c <= 'z'))
            return PS_LETTERS;
        if ((c >= '0') && (c <= '9'))
            return PS_NUMBERS;
        else
            return PS_SYMBOLS;
    }


    Words* read(FILE *document, Words *p) {
        ParserState state = PS_START, next;
        char c; int i = 1, j = 0;
        Words *newp, *oldp;

        while ((c = fgetc(document)) != EOF) {
            next = classify(c);
            if (state != next) {
                if (state != PS_START) {
                    newp->text[i] = '\0';
                    newp->nextp = NULL;                         //The char type changed and I'm not at the
                    if (p == NULL) {                            //start of a new block, so I need to finish the current one
                        p = newp;
                    }
                    else
                        oldp->nextp = newp;
                    oldp = newp;
                }                                      
                if (c != ' ') {
                    newp = calloc(1, sizeof(Words));    //Since my char is of a different type and I'm
                    newp->text[0] = c;                  //at the beginning of a new text block, I just allocate
                    i = 1;                              //and put that char in the first place.
                    state = next;
                    continue;
                }
                else {
                    state = next;
                    continue;
                }
            }                                        
            //at this point I know my char is of the same type, so I just add it to the rest of the chars
            newp->text[i] = c;
            i++;
        }
        return p;
    }

    void print(Words *p) {
        while (p) {
            puts(p->text);
            p = p->nextp;
        }
    }

    int main(int barg, const char *varg[]) {

        FILE *document = NULL;
        Words *head = NULL;
        int t;

        document = fopen(varg[1], "r");

        head = read(document, head);
        printf("Enter something\n");
        scanf("%d\n", t);
        print(head);

        fclose(document);

    }
     
    I annotated the important parts and I don't get why it doesn't run. It seems to work fine when I go through it by example with pen and paper. Maybe there's something wrong when I move pointers to the next element? Or something with the reasoning I used?
     
  6. May 22, 2017 #5
    Does it compile? What does it do when you run it? Can you step through the program? In Visual Studio, go to the menu, choose Debug, then Start Debugging, and use the F10 and F11 keys to step through the code and see if it works as expected. When you hover the mouse over a variable, it will show its current value.
    Code (C):
                if (c != ' ') {
                    newp = calloc(1, sizeof(Words));    //Since my char is of a different type and I'm
                    newp->text[0] = c;                  //at the beginning of a new text block, I just allocate
                    i = 1;                              //and put that char in the first place.
                    state = next;
                    continue;
    I would feel better if you could change the first line of this a bit, but the important thing is that you still need to allocate memory for the text:
    Code (C):
                if (next != PS_START) {
                    newp = calloc(1, sizeof(Words));
                    newp->text=calloc(1, 30); //This line is super important. 30=maximum length of a word
                    newp->text[0] = c;
                    i = 1;
                    state = next;
                    continue;
    As I said, there are various ways to deal with the hard-coded limit of 30, but you must allocate the memory somewhere.

    The juggling you perform with newp, oldp and p seems a bit more complicated than it needs to be, but it might work fine.
    It's best to use the debugger to step through this and see if it works the way you planned.
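    The allocation step can also be wrapped in two small functions so that the limit is checked in one place. This is only a sketch: the names new_word and add_char and the constant MAX_WORD are assumptions made for illustration, with MAX_WORD set to the same 30 used above.

```c
#include <stdlib.h>

#define MAX_WORD 30   /* assumed maximum word length, including the '\0' */

typedef struct words {
    char *text;
    struct words *nextp;
} Words;

/* Allocate a node together with its text buffer and store the first
   character. Returns NULL if either allocation fails. */
Words *new_word(char first) {
    Words *w = calloc(1, sizeof *w);
    if (w == NULL) return NULL;
    w->text = calloc(MAX_WORD, 1);    /* zeroed, so the text is always terminated */
    if (w->text == NULL) { free(w); return NULL; }
    w->text[0] = first;
    return w;
}

/* Store c at index *i only if that still leaves room for the final '\0'.
   Returns 1 if the character was stored, 0 if the word is already full. */
int add_char(Words *w, int *i, char c) {
    if (*i + 1 >= MAX_WORD) return 0;
    w->text[(*i)++] = c;
    return 1;
}
```

    Because the buffer is zeroed by calloc and the last byte is never overwritten, the explicit newp->text[i] = '\0' when a word ends becomes a harmless extra rather than a requirement.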
     
  7. May 22, 2017 #6

    diredragon

    Gold Member

    Ooh, what a foolish mistake of mine. I allocated the space for the pointer and the text block but not the memory for the chars inside the text block. It's all fixed now, and I can continue with the rest of the problem. Many thanks!
     
  8. May 23, 2017 #7

    diredragon

    Gold Member

    Code (C):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef enum { PS_START, PS_LETTERS, PS_NUMBERSS } ParserState;

    typedef struct raw {
        char *text;
        struct raw *nextp, *befp;
    } Raw;

    ParserState classify(char c) {
        if (c == ' ')
            return PS_START;
        if ((c >= 'A') && (c <= 'Z'))
            return PS_LETTERS;
        else
            return PS_NUMBERSS;
    }


    Raw* readraw(FILE *document, Raw *p) {
        ParserState state = PS_START, next;
        char c; int i = 1, j = 0;
        Raw *newp, *oldp;

        while ((c = fgetc(document)) != EOF) {
            next = classify(c);
            if (state != next) {
                if (state != PS_START) {
                    newp->text = realloc(newp->text, (2 * strlen(newp->text)) * sizeof(char));
                    newp->text[i] = '\0';
                    if (p == NULL) {
                        p = newp;
                    }
                    else {
                        oldp->nextp = newp;
                        oldp->befp = oldp;   //This part is added for double linked lists
                    }
                    oldp = newp;
                }
                if (c != ' ') {
                    newp = calloc(1, sizeof(Raw));
                    newp->text = realloc(newp->text, 1 * sizeof(char));
                    newp->text[0] = c;
                    i = 1;
                    state = next;
                    continue;
                }
                else {
                    state = next;
                    continue;
                }
            }
            newp->text = realloc(newp->text, (2 * strlen(newp->text)) * sizeof(char));
            newp->text[i] = c;
            i++;
        }
        return p;
    }
     
    Could you check whether this part of the code, which I wrote for the singly linked list, is OK for the doubly linked list once the commented part is added? I'm supposed to use doubly linked lists, and I tried adding a pointer *befp to the structure Raw (which was Words in our previous code). It runs fine but could be wrong. Does it use double links?
     
  9. May 23, 2017 #8
    Obviously it should be
    Code (C):
                       newp->befp = oldp;   //This part is added for double linked lists
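    For comparison, here is a small sketch of appending to a doubly linked list with both links set in one place. The function name append_raw and the use of separate head and tail pointers are assumptions for illustration; the Raw structure is the one from the post above.

```c
#include <stddef.h>

typedef struct raw {
    char *text;
    struct raw *nextp, *befp;
} Raw;

/* Link node at the end of the list. *headp and *tailp are both NULL
   while the list is empty. */
void append_raw(Raw **headp, Raw **tailp, Raw *node) {
    node->nextp = NULL;
    node->befp = *tailp;            /* back link to the previous last element */
    if (*tailp != NULL)
        (*tailp)->nextp = node;     /* forward link from the previous last element */
    else
        *headp = node;              /* first element of the list */
    *tailp = node;
}
```

    A quick way to check that the list really is doubly linked is to walk it backwards from the last element through befp and confirm that the words come out in reverse order.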
    Also you may want to classify both small and capital letters as letters.

    Last, the realloc at the end would work just fine with 2+strlen instead of 2*strlen.
    The doubling is used when you keep track of the actual size, and only realloc when needed. You'd have to remember how much you allocated, though.
    Code (C):

            int newp_size=1; //also set to 1 when allocating new 'newp->text'
            ...
            if (i+2>newp_size) { //reserve space for at least 1 character and \0 so if i==10 we need at least size==12
              newp_size*=2;
              newp->text = realloc(newp->text, newp_size * sizeof(char));
             }
     
    This functionality is starting to obscure the logic of the program, so you may want to keep it the way you have.

    This is what libraries are for. You could have e.g.
    Code (C):

    typedef struct
     {
      char *text;
      int i;
      int text_size;
     } BetterString;
     
    and various functions to work with it. However, most libraries are written for C++, where the code can be better organized.
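    As a rough idea of what such functions could look like in plain C, here is a sketch that keeps the doubling logic out of the parser. The function names bs_push and bs_free and the starting size of 8 are made up for this example and do not come from any particular library.

```c
#include <stdlib.h>

typedef struct {
    char *text;      /* NUL-terminated contents, NULL before the first push */
    int i;           /* number of characters stored */
    int text_size;   /* number of bytes currently allocated for text */
} BetterString;

/* Append one character, doubling the allocation when it is full.
   Returns 1 on success, 0 if memory could not be obtained. */
int bs_push(BetterString *s, char c) {
    if (s->i + 2 > s->text_size) {            /* need room for c and '\0' */
        int new_size = s->text_size ? 2 * s->text_size : 8;
        char *p = realloc(s->text, (size_t)new_size);
        if (p == NULL) return 0;              /* the old buffer is still valid */
        s->text = p;
        s->text_size = new_size;
    }
    s->text[s->i++] = c;
    s->text[s->i] = '\0';
    return 1;
}

/* Release the buffer and reset the string to its empty state. */
void bs_free(BetterString *s) {
    free(s->text);
    s->text = NULL;
    s->i = 0;
    s->text_size = 0;
}
```

    A string starts out as BetterString s = {0}; and every character of a word goes through bs_push(&s, c). Assigning the result of realloc to a temporary first means the original buffer is not lost if the call fails.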
     