Dynamic Programming - Restoring white space between words in a file

Click For Summary
The discussion focuses on developing a C++ program using dynamic programming to restore white spaces in a text file where they have been lost. The program should generate all possible original documents from a given input while ranking them based on the popularity of the words used, with a dictionary provided that categorizes words by popularity. Key considerations include using an efficient data structure for the dictionary, handling punctuation correctly, and excluding non-English letters and most single-letter words to reduce potential solutions. The logic involves recursively exploring word combinations from the text, backtracking when necessary to find valid configurations. The goal is to output multiple text files representing possible original documents and a ranking file based on likelihood.
MattS134
Messages
2
Reaction score
0
Poster warned about not using the homework template
All the white space among words in a text file was lost. Write a C++ program which using dynamic programming to get all of the possible original text files (i.e. with white spaces between words) and rank them in order of likelihood with the best possible runtime.

You have a text file of dictionary words and the popularity class of the word (words are listed from popularity 1-100 (being most popular words), 101-200, etc)

- Input is a text file

- Output should be a series of files, each containing one the possible original documents and one document containing the ranking of likelihood of the outputs

- Use the right data structure for storing the data structure in memory, as this will impact the runtime

- Only dictionary words will be in the input text file

- Punctuation such as: . ? ! ; should be handled by program

- Remove words with non-English letters from dictionary

- Remove all the single letter words except for 'I' , 'A', and 'a' from dictionary or handle them in another way. (Important so there aren't infinite possible solutions)

To Test:

- Create text files using words in the dictionary. The ranking of outputs is in part based on the popularity class of the words used

I know how to use filestreams to store information from the file, but I'm not sure what data structure to use to store the dictionary for max efficiency
Also not sure of the logic of generating multiple output text files and how to create all possible original documents
 
Physics news on Phys.org
You have a tree-walk. In each case, you will look at the start of the text and look for all possible first words. The skip past that word and repeat the process. Whenever you get to the end of the file or punctuation, you have success. So back up to the previous word and look for another. Whenever you get to a character string that cannot form a word, that string has failed to generate a solution. So, once again, back up to the previous word and look for another.
 

Similar threads

  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 5 ·
Replies
5
Views
3K
  • · Replies 6 ·
Replies
6
Views
4K
Replies
1
Views
2K
  • · Replies 14 ·
Replies
14
Views
4K
  • · Replies 3 ·
Replies
3
Views
13K
  • · Replies 3 ·
Replies
3
Views
5K
  • · Replies 5 ·
Replies
5
Views
2K
  • · Replies 1 ·
Replies
1
Views
6K
  • · Replies 3 ·
Replies
3
Views
2K