
I need to write a function that reads words from a text file and

  1. Apr 15, 2007 #1
    I need to write a function that reads words from a text file and writes them to an output text file, removing all duplicate words.



    void remove_dup(string filename)

    {
    vector <string> a;
    string word;
    ifstream infile(filename.c_str());
    if(infile.fail())
    exit(0);
    while(!infile.eof(0))
    {
    infile >> word;

    I'm stuck.

    Should I use an if statement?

    if( a.at(i) == a.at(i) )
    a.erase(i,i); ?
    else

    {
     
  3. Apr 19, 2007 #2

    mezarashi

    Homework Helper

    Dealing with text is always a complicated task in programming, especially for beginners. How are the words formatted in the text file? Are they separated by a delimiter of some sort (e.g. a comma or semicolon), or is there one word per line?

    So first make sure that your words are actually getting into the vector.

    A straightforward algorithm would be, using my own rough pseudo-code:

    Code (Text):

    WhileNot (EndOfFile)
      TempWord = GetNextWordFromTextFile()
      DuplicateWordTest = false
      ForEach (Word in Vector)
        If (TempWord == Word)
          DuplicateWordTest = true
          BreakForLoop
        EndIf
      EndFor

      If (DuplicateWordTest == false)
        Vector.AddWord(TempWord)
      EndIf
    EndWhile
     
    Now you will have a vector that is free from any duplicates, which you can then write to the output file.
     
  4. Apr 19, 2007 #3

    -Job-

    Science Advisor

    Instead of a vector you should use a hash table or dictionary. This gives better performance because, for each word, you don't have to check every previous word. For example, in C#:
    Code (Text):

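    // Assumes `text` already holds the contents of the input file.
    string[] words = text.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    Dictionary<string, bool> index = new Dictionary<string, bool>(words.Length);

    foreach (string word in words) {
        // Print a word only the first time it is seen.
        if (!index.ContainsKey(word)) {
            index[word] = true;
            Console.Write(" " + word);
        }
    }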
     
     