Java Text Analysis Script: Count Word Frequency & Sentences

  • Thread starter: Franco
  • Tags: Java
The discussion centers on creating a Java application that reads sentences and words from a text file, counts the frequency of each word, and outputs this information into a new text file. The user has provided an example input and expected output format, illustrating the need for word frequency and sentence count. Initial code has been shared, but the user is unsure how to process sentences into ArrayLists and handle multiple lines in the input file.

Participants suggest using the StringTokenizer class for parsing sentences and words, although it is noted that this class is considered legacy and its use is discouraged in favor of the split method or the java.util.regex package. Recommendations include using ". " as a sentence separator and " " for words, along with advice to remove punctuation marks from the end of words to ensure accurate counting. Overall, the conversation emphasizes practical coding solutions while highlighting best practices in Java programming.
Franco
Hello everyone, thanks for taking the time to read this post.
I need help with a Java problem: designing an application that reads sentences and words from a text file and creates a new text file listing every distinct word from the original file. Next to each word it should show how often the word appeared in the file and the number of sentences it appeared in. Assume each sentence ends with a full stop; commas, question marks, etc. can be ignored.

Example:
Input file:
This is a simple simple example test. Another test.

Output file:
this 1 1
is 1 1
a 1 1
simple 2 1
test 2 2
example 1 1
another 1 1


So far I have only written this much:


import java.io.*;
import java.util.*;

public class Analysis {

    public static void main(String[] args) throws IOException {

        // The source text to analyse and the file the results are written to.
        File inputFile = new File("Analysis_source.txt");
        File outputFile = new File("Analysis_output.txt");

        FileReader in = new FileReader(inputFile);
        FileWriter out = new FileWriter(outputFile);

        int c;

        // (processing not written yet)

        in.close();
        out.close();
    }
}



I'm not sure how to convert the sentences into ArrayLists (use charAt, looking for full stops?)
or how to convert the words of each sentence into a sub-ArrayList (use charAt, looking for the spaces in between?).

Also, the original text file contains more than one line;
the sentences are not all on a single line.



Thanks for reading.
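
On the multi-line question, one possible approach is to read the file line by line and join the lines into a single string before looking for full stops, so that a sentence wrapped across two lines stays in one piece. The following is only a minimal sketch: the class name `TextLoader` and the method `readAll` are hypothetical, and a `StringReader` stands in for the real `FileReader` so the example runs without any file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper, not part of the original post.
class TextLoader {

    // Reads every line from the source and joins the non-blank lines
    // with a single space, returning the whole text as one string.
    static String readAll(Reader source) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(trimmed);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // In the real program this would be new FileReader("Analysis_source.txt").
        Reader source = new StringReader("This is a simple\nsimple example test.\nAnother test.");
        System.out.println(readAll(source));
    }
}
```

Once the text is a single string, splitting it into sentences and words no longer depends on where the line breaks fall.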
 
Use the StringTokenizer class.
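
A minimal sketch of what this could look like, assuming the whole text is already in one string. The class name `TokenizerDemo` and the method `parse` are made up for illustration. One detail worth knowing: `StringTokenizer` treats every character of its delimiter string as a separate delimiter, so the sketch passes "." for sentences and a set of space and punctuation characters for words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical example class, not from the thread.
class TokenizerDemo {

    // Splits the text into sentences, and each sentence into lower-cased words.
    static List<List<String>> parse(String text) {
        List<List<String>> sentences = new ArrayList<>();
        // Each character in the delimiter string is a delimiter on its own.
        StringTokenizer sentenceTokens = new StringTokenizer(text, ".");
        while (sentenceTokens.hasMoreTokens()) {
            StringTokenizer wordTokens =
                    new StringTokenizer(sentenceTokens.nextToken(), " ,?!;:");
            List<String> words = new ArrayList<>();
            while (wordTokens.hasMoreTokens()) {
                words.add(wordTokens.nextToken().toLowerCase());
            }
            if (!words.isEmpty()) {
                sentences.add(words);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Prints [[this, is, a, simple, simple, example, test], [another, test]]
        System.out.println(parse("This is a simple simple example test. Another test."));
    }
}
```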
 
so-crates said:
Use the StringTokenizer class.
From the StringTokenizer documentation (http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html):
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
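
For comparison, the `split` method mentioned in that passage takes a regular expression, so the full stop has to be escaped. The sketch below is only an illustration: the class `SplitDemo` and its two methods are hypothetical names, and trailing punctuation on words is deliberately left alone here.

```java
import java.util.Arrays;

// Hypothetical example class, not from the thread.
class SplitDemo {

    // Splits on a full stop followed by any amount of whitespace.
    // Trailing empty strings are dropped by split, so a final "." is harmless.
    static String[] sentences(String text) {
        return text.trim().split("\\.\\s*");
    }

    // Splits a sentence on runs of whitespace; punctuation stays attached.
    static String[] words(String sentence) {
        return sentence.trim().split("\\s+");
    }

    public static void main(String[] args) {
        for (String sentence : sentences("This is a simple simple example test. Another test.")) {
            System.out.println(Arrays.toString(words(sentence)));
        }
    }
}
```

Because the argument is a regex, the same call can absorb extra spaces or line breaks after the full stop, which a fixed separator string cannot.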

:smile:
 
abhishek said:
From the StringTokenizer documentation (http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html):


:smile:

I've been using StringTokenizer myself with JDK 1.5 and it works fine. You just initialize it with the string and the separator, and the class does the job for you.
As a suggestion, use ". " as the separator for sentences and " " for words. Also, to improve the word separation, if a word ends with a punctuation mark such as ".", ",", "?", etc., remove it.
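
Putting these suggestions together, a rough sketch of the counting step might look like the following. Everything named here (`WordStats`, `count`) is hypothetical, and the sketch assumes that a full stop always ends a sentence, so abbreviations and decimal numbers would be miscounted. It uses `split` with "\\." for sentences rather than passing ". " to `StringTokenizer`, because `StringTokenizer` would treat the period and the space as two independent delimiters.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical example class, not from the thread.
class WordStats {

    // Maps each lower-cased word to a pair of counters:
    // [0] = total occurrences, [1] = number of sentences containing the word.
    static Map<String, int[]> count(String text) {
        Map<String, int[]> stats = new LinkedHashMap<>();
        for (String sentence : text.split("\\.")) {
            Set<String> seenInSentence = new HashSet<>();
            for (String token : sentence.trim().split("\\s+")) {
                // Remove punctuation marks from the end of the word.
                String word = token.replaceAll("[.,?!;:]+$", "").toLowerCase();
                if (word.isEmpty()) {
                    continue;
                }
                int[] counters = stats.computeIfAbsent(word, k -> new int[2]);
                counters[0]++;
                // Count the sentence only the first time the word is seen in it.
                if (seenInSentence.add(word)) {
                    counters[1]++;
                }
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        count("This is a simple simple example test. Another test.")
                .forEach((word, c) -> System.out.println(word + " " + c[0] + " " + c[1]));
    }
}
```

For the example input this gives "simple 2 1" and "test 2 2", matching the expected counts; the words come out in order of first appearance, which may differ from the order shown in the sample output file.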
 
ramollari said:
I've been using StringTokenizer myself with JDK 1.5 and it works fine. You just initialize it with the string and the separator, and the class does the job for you.


Yes, it will work fine. The text I quoted implies that clearly. I'm only pointing out that Sun would rather have you use split or regex. :smile:
 