
Creating a Word Index (matlab)

  • Thread starter gfd43tg
  • #1
gfd43tg
Gold Member

Homework Statement


I am working on Problem #2 in the attached PDF


Homework Equations





The Attempt at a Solution


When I run this, I get an error in my first conditional, the one that checks whether the word is not yet in the index. How do I test that a word does not exist in the index? Are there any other problems?

Code:
function Index = InsertDoc(Index, newDoc, DocNum)
IndexWords = {Index.Word};
for i = 1:numel(newDoc)
   % If word is not in the Index
    if isempty(Index) || Index(i).Words ~= IndexWords
       Index(numel(IndexWords)+1).Word = newDoc{i};
       Index(numel(IndexWords)+1).Documents = DocNum;
       Index(numel(IndexWords)+1).Locations{end+1} = i;
    else
    end
    % If the word does exist in the Index, but the occurrence is unknown
    % 1st occurrence
    if Index(i).Documents == DocNum
        WordInIndex = strcmpi(newDoc{i},{Index.Word});
        Index(WordInIndex).Documents = [Index(WordInIndex).Documents,DocNum];
        Index(WordInIndex).Locations{end}(end+1)=i
      % 2nd occurrence or later
    else
        Index(WordInIndex.Locations{numel(Index(WordInIndex).Locations)+1}) = i;
    end
end
 


Answers and Replies

  • #2
What is the error that you get?
What does that error tell you about what is going on in the code?
If you were tasked with doing this by hand, how would you go about it? And how would you then translate that into code?
 
  • #3
gfd43tg
Gold Member
Yes, here is the error I get

Code:
Error in InsertDoc (line 4)
    if isempty(Index) || Index(i).Words ~= DocNum
I believe something is wrong with the second part in particular. How do I say that the word does not exist in the Index?
 
  • #4
I'm not super familiar with Matlab code, so take this with a grain of salt.
It looks like you're exceeding the bounds of Index, which would cause the error.

As for how to say it: you first need to work out the check properly, which I don't think you're doing at the moment.
What your check appears to be saying is:
If the index is empty OR the current spot I'm checking isn't the word, then go ahead and add it.

Think about this. Say you have a set of trading cards and your friend gives you a new one. How would you go through your cards to see whether you already have that one?
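
The trading-card check amounts to comparing the new word against every word already stored and adding it only if none of them match. A minimal MATLAB sketch of that idea, assuming the stored words sit in a plain cell array (the names knownWords and newWord are hypothetical and not part of the assignment):

```matlab
% Hypothetical data: words already in the collection and a candidate word.
knownWords = {'Matlab', 'is'};
newWord    = 'awesome';

% strcmpi compares newWord against every stored word (ignoring case)
% and returns one logical value per element of knownWords.
matches = strcmpi(knownWords, newWord);

% The word is new only if no element matched. any of an empty array is
% false, so an empty collection is covered by the same test.
if ~any(matches)
    knownWords{end+1} = newWord;
end
```

The point of the sketch is that the test looks at the whole collection at once, rather than at a single position such as element i.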
 
  • #5
gfd43tg
Gold Member
I'm starting this problem completely from scratch, so for now I am trying to test and debug only the first case.

Here is where I am now.
Below is my code, which initializes Index as an empty 1x0 struct array. I want the loop in InsertDoc to look at every element of Doc1 and, if Index is empty (which it should be for i = 1) or the word Doc1{i} is not already in Index, append that word. I am having trouble writing a conditional statement that does exactly that.

In my test, Index should contain 'Matlab', 'is', and 'awesome'.
Code:
function Index = InitializeIndex()
c10 = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);
Here is my input
Code:
Doc1 = {'Matlab','is','awesome'};
E7 = InitializeIndex;
E7 = InsertDoc(E7,Doc1,1);
Code:
function Index = InsertDoc(Index, newDoc, DocNum)
% This function will be a struct array where each element corresponds to a
% unique word in a group of documents. In each element of the struct array
% the word is stored in the Word field, the document numbers that the word
% is contained is in the documents field, and the locations of the word in
% each document is in the Location field.
Index = {Index.Word};
for i = 1:numel(newDoc)
    % IndexWord is either empty or the word is not present in IndexWord
    if isempty(Index) || strcmpi(Index{i},newDoc(i))
        Index{end+1} = newDoc(i); 
        Index(i).Documents = DocNum(i);
    end
end
 
  • #6
AlephZero
Science Advisor
Homework Helper
Code:
function Index = InsertDoc(Index, newDoc, DocNum)
Using the name (Index) for an input argument and the output argument doesn't seem like a good plan. I can't find anything in the Matlab documentation that says whether it is legal or not, but even if it is legal, it's confusing.
 
  • #7
gfd43tg
Gold Member
Hey AZ. These are the directions given in the problem. The purpose of the problem is to modify the input Index. We are supposed to modify Index by adding words until all unique words are contained in the output.
 
  • #8
AlephZero
Science Advisor
Homework Helper
There is no problem calling a function with something like Index = InsertDoc(Index, newDoc, DocNum). The function takes the value of Index, computes something, and then overwrites Index with it.

But if you look at all the simple tutorial examples of how to write a function, they don't use the same variable name inside the definition of the function.

If changing one of the names doesn't help, sorry, I don't use Matlab much these days so I'm not an expert!
 
  • #9
gfd43tg
Gold Member
Actually, the bigger issue was that I initialized Index with the first function, then overwrote it with Index = {Index.Word}. I deleted that part, but I still can't get my conditional statement right.

Edit: I realize what you are telling me. This is a problem, because the output (which has the same name as the input) has to appear in the definition of the function every time. So it seems contradictory to what you were saying: the input name shouldn't appear in the definition of the function, yet the output has to, and the output name is the same as the input name.

I don't have a choice in the name, because I have to follow the template given.
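
The overwrite described at the start of this post can be avoided by putting the list of words into a separate variable, so that Index keeps its struct fields. A small sketch, assuming the same initialization as InitializeIndex (the variable name IndexWords is only illustrative):

```matlab
% Empty 1x0 struct array with the three fields, as in InitializeIndex.
c10   = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);

% Gather the words into their own cell array instead of assigning the
% result back to Index, which would replace the struct array.
IndexWords = {Index.Word};

% Index is still a struct array; IndexWords is a (currently empty) cell.
indexIsStruct = isstruct(Index);
wordsAreCell  = iscell(IndexWords);
```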
 
  • #10
gfd43tg
Gold Member
Here is where I am at now

Code:
function Index = InsertDoc(Index, newDoc, DocNum)
% This function will be a struct array where each element corresponds to a
% unique word in a group of documents. In each element of the struct array
% the word is stored in the Word field, the document numbers that the word
% is contained is in the documents field, and the locations of the word in
% each document is in the Location field.
for i = 1:numel(newDoc)
    % IndexWord is either empty or the word is not present in IndexWord
    if isempty(Index)|| strcmpi({Index.Word},newDoc{i})
        Index(end + 1).Word = newDoc{i}; 
    end
end
This is the input I am running
Code:
Doc1 = {'Matlab','is','awesome'};
E7 = InitializeIndex;
E7 = InsertDoc(E7,Doc1,1);
Here is my output, which is not what I expected. I was expecting E7(2) to contain 'is', not a matrix dimensions error.

Code:
E7(1)

ans = 

         Word: 'Matlab'
    Documents: []
    Locations: []

EDU>> E7(2)
Index exceeds matrix dimensions.
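
A possible explanation for the output above, offered as a sketch rather than the assignment's intended solution: once Index is non-empty, strcmpi({Index.Word}, newDoc{i}) is true when the word is already present, so the condition only ever adds a word while Index is empty. With more than one stored word, the comparison would also stop being a scalar, which || does not accept. Negating the test with ~any(...) checks for absence instead. The way Documents and Locations are filled in here is an assumption, since the problem statement is not quoted in the thread, and handling of words that are already in the index is left out.

```matlab
% Same setup as in the post, written as a script for easy testing.
c10    = cell(1,0);
Index  = struct('Word', c10, 'Documents', c10, 'Locations', c10);
newDoc = {'Matlab','is','awesome'};
DocNum = 1;

for i = 1:numel(newDoc)
    % One logical value per word already stored in Index
    % (an empty array while Index is still empty).
    found = strcmpi({Index.Word}, newDoc{i});
    if ~any(found)
        % Word is not in Index yet: append a new element.
        Index(end+1).Word    = newDoc{i};
        Index(end).Documents = DocNum;   % assumed layout: vector of document numbers
        Index(end).Locations = {i};      % assumed layout: one cell of positions per document
    end
end
```

Because any of an empty array is false, the separate isempty(Index) test is no longer needed in this version.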

 
