# Creating a Word Index (matlab)

Gold Member

## Homework Statement

I am working on Problem #2 in the attached PDF

## The Attempt at a Solution

When I run this, I get an error in my first conditional, which is meant to check whether the word is not yet in the index. How do I test that a word does not exist in the index? Are there any other problems?

Code:
function Index = InsertDoc(Index, newDoc, DocNum)
IndexWords = {Index.Word};
for i = 1:numel(newDoc)
    % If word is not in the Index
    if isempty(Index) || Index(i).Words ~= IndexWords
        Index(numel(IndexWords)+1).Word = newDoc{i};
        Index(numel(IndexWords)+1).Documents = DocNum;
        Index(numel(IndexWords)+1).Locations{end+1} = i;
    else
    end
    % If the word does exist in the Index, but the occurrence is unknown
    % 1st occurrence
    if Index(i).Documents == DocNum
        WordInIndex = strcmpi(newDoc{i},{Index.Word});
        Index(WordInIndex).Documents = [Index(WordInIndex).Documents,DocNum];
        Index(WordInIndex).Locations{end}(end+1)=i
    % 2nd occurrence or later
    else
        Index(WordInIndex.Locations{numel(Index(WordInIndex).Locations)+1}) = i;
    end
end

#### Attachments

• Word Index.pdf

What is the error that you get?
What does that error tell you about what is going on in the code?
Think about how you would accomplish this task by hand, and then how you would translate those steps into code.

Gold Member
Yes, here is the error I get:

Code:
Error in InsertDoc (line 4)
if isempty(Index) || Index(i).Words ~= DocNum

I believe something is wrong with the second part in particular. How do I say that the word does not exist in the Index?

I'm not super familiar with Matlab code so take this with a grain of salt.
It looks like you're exceeding the bounds of Index which would cause the error.
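
To illustrate the out-of-bounds idea, here is a minimal sketch. It assumes a simplified index with only a `Word` field (not the full struct from the assignment) and shows that reading an element past the end of a struct array raises an error, while a `numel` check avoids it.

```matlab
Index = struct('Word', {});      % empty struct array, numel(Index) is 0

outOfBounds = false;
try
    w = Index(2).Word;           % there is no second element to read
catch
    outOfBounds = true;          % MATLAB raises an index-out-of-bounds error
end

% Guarding with numel avoids the error.
i = 2;
canRead = i <= numel(Index);     % false here, so Index(i) should not be read
```

The loop variable in `InsertDoc` counts positions in `newDoc`, so it is not necessarily a valid position in `Index`.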

As for how to say it: you need to work out the check properly, which I don't think you're doing at the moment.
What it looks like your check is doing is saying:
If the index is empty OR the current spot I'm checking isn't the word then go ahead and add it.
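
One possible way to express "the word is not anywhere in the index" is to compare the word against all stored words at once and then reduce the result with `any`. The sketch below uses made-up sample data and a simplified struct with only a `Word` field; it is not the assignment's required solution.

```matlab
% Illustrative data: a 1x2 struct array holding two words.
Index = struct('Word', {'Matlab', 'is'});
word = 'awesome';

% strcmpi returns one logical per stored word; any() collapses them
% to a single true/false, which is what if and || expect.
isInIndex = any(strcmpi(word, {Index.Word}));

% The same expression also works on an empty index, because any()
% of an empty array is false. No separate isempty test is needed.
emptyIndex = struct('Word', {});
isInEmpty = any(strcmpi(word, {emptyIndex.Word}));

if ~isInIndex
    disp('word is not in the index yet');
end
```

Because `strcmpi` ignores case, `'matlab'` would count as already present here.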

Gold Member
I'm trying this problem completely from scratch, so I am trying to test and debug only the first case.

Here is where I am now.
This is my code to create an empty 1x0 struct array with the fields Word, Documents, and Locations (each built from an empty 1x0 cell array). I want the loop to look at every element of Doc1 and, if Index is empty (as it should be for i=1) or the word Doc1{i} is not in Index, append it to the array. I am having trouble writing a conditional statement that does exactly that.

In my test, Index should contain 'I' 'love' 'Matlab'.
Code:
function Index = InitializeIndex()
c10 = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);
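
As a quick check of what this initialization produces, the following sketch repeats the same `struct` call at the command line and inspects the result. Passing 1x0 cell arrays as field values makes `struct` return a 1x0 struct array that already has the three field names.

```matlab
c10 = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);

sz = size(Index);            % [1 0]: a struct array with no elements
noElements = isempty(Index); % true
names = fieldnames(Index);   % {'Word'; 'Documents'; 'Locations'}
words = {Index.Word};        % empty cell array, since there are no elements
```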

Here is my input
Code:
Doc1 = {'Matlab','is','awesome'};
E7 = InitializeIndex;
E7 = InsertDoc(E7,Doc1,1);

Code:
function Index = InsertDoc(Index, newDoc, DocNum)
% This function will be a struct array where each element corresponds to a
% unique word in a group of documents. In each element of the struct array
% the word is stored in the Word field, the document numbers that the word
% is contained is in the documents field, and the locations of the word in
% each document is in the Location field.
Index = {Index.Word};
for i = 1:numel(newDoc)
    % IndexWord is either empty or the word is not present in IndexWord
    if isempty(Index) || strcmpi(Index{i},newDoc(i))
        Index{end+1} = newDoc(i);
        Index(i).Documents = DocNum(i);
    end
end

AlephZero
Homework Helper
Code:
function Index = InsertDoc(Index, newDoc, DocNum)

Using the name (Index) for an input argument and the output argument doesn't seem like a good plan. I can't find anything in the Matlab documentation that says whether it is legal or not, but even if it is legal, it's confusing.

Gold Member
Hey AZ. These are the directions given in the problem. The purpose of the problem is to modify the input Index. We are supposed to modify Index by adding words until all unique words are contained in the output.

AlephZero
Homework Helper
There is no problem calling a function with something like Index = InsertDoc(Index, newDoc, DocNum). The function takes the value of Index, computes something, and then overwrites Index with it.
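
A small sketch of that behavior, using a hypothetical anonymous helper `appendWord` (not part of the assignment) and a simplified struct with only a `Word` field: the caller's variable changes only when the returned value is assigned back to it.

```matlab
% Hypothetical helper: returns a copy of idx with one more element.
appendWord = @(idx, w) [idx, struct('Word', w)];

Index = struct('Word', {});             % start with an empty struct array

appendWord(Index, 'Matlab');            % result is discarded
countBefore = numel(Index);             % still 0, Index was not changed

Index = appendWord(Index, 'Matlab');    % result overwrites Index
countAfter = numel(Index);              % now 1
```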

But if you look at all the simple tutorial examples of how to write a function, they don't use the same variable name inside the definition of the function.

If changing one of the names doesn't help, sorry, I don't use Matlab much these days so I'm not an expert!

Gold Member
Actually, the bigger issue was that I initialized Index with the first function, then overwrote it with Index = {Index.Word}. I deleted that part, but I still can't get my conditional statement right.
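
For reference, here is a small sketch of why that overwrite matters, using made-up sample data. Collecting the words into a separate variable (here called `IndexWords`, as in the first attempt) leaves the struct array intact, whereas assigning the same expression to `Index` would replace the struct array with a plain cell array that has no fields.

```matlab
% Illustrative 1x2 struct array with two fields.
Index = struct('Word', {'Matlab', 'is'}, 'Documents', {1, 1});

IndexWords = {Index.Word};        % separate variable, Index is untouched
stillStruct = isstruct(Index);    % true

Overwritten = {Index.Word};       % what Index = {Index.Word} would store
isCellNow = iscell(Overwritten);  % true: .Word, .Documents no longer exist
```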

Edit: I see what you are telling me, but it is a problem, because the output (which has the same name as the input) has to appear in the function definition every time. That seems to contradict your advice: the input name shouldn't be reused inside the function, yet the output must be assigned there, and the output name is the same as the input name.

I don't have a choice in the name, because I have to follow the template given.

Gold Member
Here is where I am now:

Code:
function Index = InsertDoc(Index, newDoc, DocNum)
% This function will be a struct array where each element corresponds to a
% unique word in a group of documents. In each element of the struct array
% the word is stored in the Word field, the document numbers that the word
% is contained is in the documents field, and the locations of the word in
% each document is in the Location field.
for i = 1:numel(newDoc)
    % IndexWord is either empty or the word is not present in IndexWord
    if isempty(Index)|| strcmpi({Index.Word},newDoc{i})
        Index(end + 1).Word = newDoc{i};
    end
end

This is the input I am running
Code:
Doc1 = {'Matlab','is','awesome'};
E7 = InitializeIndex;
E7 = InsertDoc(E7,Doc1,1);

and here is my output, which was not expected. I was expecting E7(2) to be 'is', not a matrix dimension error.

Code:
E7(1)

ans =

Word: 'Matlab'
Documents: []
Locations: []

EDU>> E7(2)
Index exceeds matrix dimensions.
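
A likely explanation for this output: `strcmpi({Index.Word},newDoc{i})` is true when the word is found, so the condition is the opposite of "not in the index". For `i = 1` the index is empty and 'Matlab' is appended; for 'is' and 'awesome' the comparison is false, so nothing more is added. Once the index held more than one word, `strcmpi` would also return a vector, which `||` does not accept. The sketch below shows one way the word-only part of the loop could look, written as a plain script rather than inside `InsertDoc`. The extra 'matlab' in the sample document is an assumption added to show duplicate handling, and the Documents and Locations fields are left out.

```matlab
% Same initialization as InitializeIndex.
c10 = cell(1,0);
Index = struct('Word', c10, 'Documents', c10, 'Locations', c10);

% Sample document; the repeated 'matlab' is only for illustration.
newDoc = {'Matlab', 'is', 'awesome', 'matlab'};

for i = 1:numel(newDoc)
    % Scalar logical: true if the word matches any stored word.
    isKnown = any(strcmpi(newDoc{i}, {Index.Word}));
    if ~isKnown
        Index(end+1).Word = newDoc{i};   % append a new element
    end
end

words = {Index.Word};   % unique words in order of first appearance
```

With this input, the index ends up with three elements, and `Index(2).Word` is 'is'.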
