Need help with basic Java

  1. May 7, 2005 #1
    Hello everyone, thanks for taking the time to read this post.
    I need help with a Java assignment: designing an application that reads sentences and words from a text file and creates a new text file listing every distinct word from the original once. Next to each word it should show how often the word appeared in the file and the number of sentences it appeared in. Assume each sentence ends with a full stop; commas, question marks, etc. can be ignored.

    Input file:
    This is a simple simple example test. Another test.

    Output file:
    this 1 1
    is 1 1
    a 1 1
    simple 2 1
    test 2 2
    example 1 1
    another 1 1

    So far I have only written this much:

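    import java.io.*;
    import java.util.*;

    public class Analysis {

        public static void main(String[] args) throws IOException {

            // Source text to analyse and the file that receives the word counts.
            // Opening a FileWriter on the source file would erase it.
            File inputFile = new File("Analysis_source.txt");
            File outputFile = new File("Analysis_output.txt");

            FileReader in = new FileReader(inputFile);
            FileWriter out = new FileWriter(outputFile);

            int c;
            // TODO: read the text, split it into sentences and words, count them

            in.close();
            out.close();
        }
    }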


    I'm not sure how to split the text into sentences stored in an ArrayList (use charAt and look for full stops?), or how to split each sentence into a sub-ArrayList of words (use charAt and look for the spaces in between?).

    Also, the original text file contains more than one line; the sentences are not all on a single line.

  2. jcsd
  3. May 7, 2005 #2
    Use the StringTokenizer class.
  4. May 10, 2005 #3
    From the StringTokenizer API documentation (http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html):
    Last edited by a moderator: May 2, 2017
  5. May 10, 2005 #4
    I've been using StringTokenizer myself with JDK 1.5 and it works fine. You just initialize it with the String and the delimiters, and the class does the job for you.
    As a suggestion, use "." as the delimiter for sentences and " " for words. Note that StringTokenizer treats every character in the delimiter string as a separate delimiter, so ". " would split on spaces as well as full stops. To improve the word separation further, if a word ends with a punctuation mark such as ",", "?" etc., remove it.
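
    The suggestion above could be sketched roughly as follows. This is only an illustration, not a definitive solution: the class name TokenizerSketch and the method analyse are made up for this example, it assumes Java 8 or later (for computeIfAbsent and lambdas), and it reads the second number in the expected output as "number of sentences that contain the word". Instead of trimming punctuation afterwards, it simply lists the punctuation characters as extra word delimiters.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.StringTokenizer;

// Hypothetical helper class for illustration; not part of the original post.
public class TokenizerSketch {

    // Returns word -> {total occurrences, number of sentences containing the word},
    // in order of first appearance.
    public static Map<String, int[]> analyse(String text) {
        Map<String, int[]> counts = new LinkedHashMap<>();
        // Each character of the delimiter string is a delimiter on its own,
        // so "." alone is enough to cut the text into sentences.
        StringTokenizer sentences = new StringTokenizer(text, ".");
        while (sentences.hasMoreTokens()) {
            Set<String> seenInSentence = new HashSet<>();
            // Whitespace (including line breaks) and ignorable punctuation
            // all act as word delimiters.
            StringTokenizer words =
                    new StringTokenizer(sentences.nextToken(), " \t\r\n,;:?!\"");
            while (words.hasMoreTokens()) {
                String word = words.nextToken().toLowerCase();
                int[] c = counts.computeIfAbsent(word, k -> new int[2]);
                c[0]++;                        // every occurrence
                if (seenInSentence.add(word)) {
                    c[1]++;                    // first occurrence in this sentence
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "This is a simple simple example test. Another test.";
        for (Map.Entry<String, int[]> e : analyse(text).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue()[0] + " " + e.getValue()[1]);
        }
    }
}
```

    Because line breaks are treated as word delimiters, a sentence that continues on the next line of the file is still counted as one sentence, provided the whole file is passed in as a single String.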
    Last edited by a moderator: May 2, 2017
  6. May 11, 2005 #5
    Yes, it will work fine. The text I quoted implies that clearly. I'm only pointing out that Sun would rather have you use split or regex. :smile:
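
    For comparison, here is a rough sketch of the same task using String.split with regular expressions, which is what the StringTokenizer Javadoc recommends for new code. The class name SplitSketch, the method analyse and the command-line handling are assumptions made for this example; the file names are taken from the first post. It assumes Java 8 or later and a UTF-8 text file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical class for illustration; not part of the original thread.
public class SplitSketch {

    // Returns word -> {total occurrences, number of sentences containing the word}.
    public static Map<String, int[]> analyse(String text) {
        Map<String, int[]> counts = new LinkedHashMap<>();
        // The dot is a regex metacharacter, so it has to be escaped.
        for (String sentence : text.split("\\.")) {
            Set<String> seenInSentence = new HashSet<>();
            // Split on runs of anything that is not a letter, digit or apostrophe;
            // this drops commas, question marks and line breaks in one step.
            for (String word : sentence.toLowerCase().split("[^\\p{L}\\p{Nd}']+")) {
                if (word.isEmpty()) {
                    continue; // split may return a leading empty string
                }
                int[] c = counts.computeIfAbsent(word, k -> new int[2]);
                c[0]++;
                if (seenInSentence.add(word)) {
                    c[1]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Path source = Paths.get(args.length > 0 ? args[0] : "Analysis_source.txt");
        Path target = Paths.get(args.length > 1 ? args[1] : "Analysis_output.txt");
        // Read the whole file as one String so that sentences spanning
        // several lines stay in one piece.
        String text = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, int[]> e : analyse(text).entrySet()) {
            lines.add(e.getKey() + " " + e.getValue()[0] + " " + e.getValue()[1]);
        }
        Files.write(target, lines, StandardCharsets.UTF_8);
    }
}
```

    Reading the file in one go keeps the sketch short; for very large files a line-by-line reader that carries an unfinished sentence over to the next line would use less memory.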