- #1
dominique_
Hi everyone,
I am a mathematics undergraduate and I'm currently doing an internship at the informatics department of a university. I am well and truly out of my depth. My supervisor has assigned me tasks which include Java (a language I'm having to quickly pick up, having only used Python/R).
The first task is to write a program that takes as input a set of words and outputs a set of random vector embeddings for them, so that they can be fed into wordvectors.org.
These are the steps I think I have to follow, but I don't know how to do them. I'm working in NetBeans for Java, but I'm wondering if I should use Eclipse.
1. Opening a text file
2. Reading in a text file
3. Storing the words (i.e. strings) of the text file in an array.
4. Open another text file
5. Loop over the array of stored words.
6. For each word, output the word to the text file, along with a few random numbers.
7. Close the text files.
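For reference, steps 1 to 7 above could be sketched in Java roughly as follows. This is only a minimal sketch under some assumptions: the file names `words.txt` and `vectors.txt`, the class and method names, the 50 dimensions, and the value range are all made up for illustration, and the "word followed by space-separated numbers" line layout is an assumption that should be checked against what wordvectors.org actually asks for.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;

final class RandomEmbeddings {

    // Steps 1-3: open the input file, read it, and store every non-empty word.
    static List<String> readWords(Path input) throws IOException {
        List<String> words = new ArrayList<>();
        for (String line : Files.readAllLines(input, StandardCharsets.UTF_8)) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    words.add(token);
                }
            }
        }
        return words;
    }

    // Step 6 (formatting part): the word followed by `dimensions` random
    // numbers drawn uniformly from [-1, 1), separated by single spaces.
    // The output layout is an assumption, not a documented format.
    static String formatLine(String word, int dimensions, Random rng) {
        StringBuilder sb = new StringBuilder(word);
        for (int i = 0; i < dimensions; i++) {
            double value = rng.nextDouble() * 2.0 - 1.0;
            sb.append(' ').append(String.format(Locale.ROOT, "%.6f", value));
        }
        return sb.toString();
    }

    // Steps 4-7: open the output file, loop over the words, write one line
    // per word. try-with-resources closes the file even if writing fails.
    static void writeEmbeddings(List<String> words, Path output,
                                int dimensions, Random rng) throws IOException {
        try (BufferedWriter writer =
                 Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (String word : words) {
                writer.write(formatLine(word, dimensions, rng));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file names; pass real paths as arguments.
        Path input = Paths.get(args.length > 0 ? args[0] : "words.txt");
        Path output = Paths.get(args.length > 1 ? args[1] : "vectors.txt");
        // A fixed seed makes the "random" vectors reproducible between runs.
        writeEmbeddings(readWords(input), output, 50, new Random(42));
    }
}
```

Nothing here depends on the IDE, so the same file should behave the same way in NetBeans or Eclipse.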
Additionally, in the upcoming weeks I am to create a first word embedding prototype, which he has detailed as:
1) Download WordNet
2) Create a matrix graph from it
3) Do singular value decomposition (SVD) on it.
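To make steps 2) and 3) a little more concrete, here is a toy sketch in plain Java. It assumes that "matrix graph" means an adjacency matrix (1 where two words are linked, 0 otherwise), and the three-word chain in `main` is invented for illustration rather than taken from WordNet. It also only estimates the largest singular value, using power iteration on AᵀA; a full SVD on real data would normally come from a linear algebra library. All class and method names are hypothetical.

```java
import java.util.Arrays;

final class TopSingularValue {

    // Computes y = A x for a dense matrix stored as rows.
    static double[] multiply(double[][] a, double[] x) {
        double[] y = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < x.length; j++) {
                y[i] += a[i][j] * x[j];
            }
        }
        return y;
    }

    // Computes x = A^T y without building the transpose explicitly.
    static double[] multiplyTransposed(double[][] a, double[] y) {
        double[] x = new double[a[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < x.length; j++) {
                x[j] += a[i][j] * y[i];
            }
        }
        return x;
    }

    // Euclidean length of a vector.
    static double norm(double[] v) {
        double sum = 0.0;
        for (double e : v) {
            sum += e * e;
        }
        return Math.sqrt(sum);
    }

    // Power iteration on A^T A: repeatedly apply A^T A and renormalise.
    // The vector drifts towards the top right singular vector v, and
    // ||A v|| is then the largest singular value. Starting from the
    // all-ones vector is a simplification; it fails if that vector is
    // exactly orthogonal to the top singular vector.
    static double topSingularValue(double[][] a, int iterations) {
        double[] v = new double[a[0].length];
        Arrays.fill(v, 1.0 / Math.sqrt(v.length)); // unit-length start
        for (int k = 0; k < iterations; k++) {
            double[] w = multiplyTransposed(a, multiply(a, v));
            double n = norm(w);
            if (n == 0.0) {
                return 0.0; // A maps the start vector to zero
            }
            for (int j = 0; j < v.length; j++) {
                v[j] = w[j] / n;
            }
        }
        return norm(multiply(a, v));
    }

    public static void main(String[] args) {
        // Invented toy graph: word0 - word1 - word2 (a chain of three words).
        double[][] adjacency = {
            {0, 1, 0},
            {1, 0, 1},
            {0, 1, 0}
        };
        System.out.println(topSingularValue(adjacency, 100));
    }
}
```

The link to word embeddings is that the leading singular vectors, scaled by their singular values, give each word a short dense vector, so the linear algebra background applies directly.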
I am also to understand SWELL code for Java.
---> SWELL https://github.com/paramveerdhillon/swell
---> Paper on which this project is based http://www.pdhillon.com/dhillon15a.pdf
---> http://wordvectors.org/ (I'm required to download data and experiment with this...?)
I am so sorry if this is vague or lazy, but I've been researching academic papers and trying to absorb so much, and I don't want to screw this opportunity up.
I'd be grateful for absolutely anything. I obtained this internship based on my linear algebra knowledge. Concepts such as linear transformations, SVD, and eigenvectors are obviously at play here and have made my research more digestible, but I can't see how I can implement any of that myself, or how I am of use to this research team, gah!
Thank you in advance.