Question about Application 2 in Leon

  • Context: Undergrad 
  • Thread starter: schaefera
  • Tags: Applications

Discussion Overview

The discussion revolves around a potential error in an example from Leon's "Linear Algebra with Applications," specifically regarding the representation of documents and search words in a matrix format. Participants explore the implications of matrix dimensions and vector representations in the context of searching databases.

Discussion Character

  • Debate/contested
  • Technical explanation

Main Points Raised

  • One participant questions whether Leon has mixed up the dimensions of the matrix, suggesting that with m documents and n search words, the matrix should be nxm rather than mxn.
  • Another participant argues that the original description is correct, stating that there are m records (documents) and n attributes (search words), leading to an mxn matrix.
  • A participant explains that to find the number of successful search terms, one can use matrix multiplication with a column vector that indicates which search words are being queried.
  • There is a suggestion that individual searches can be conducted using separate search vectors, each with a single non-zero entry, or by using a combined vector to flag all terms of interest.

Areas of Agreement / Disagreement

Participants express differing views on the correct matrix representation and the appropriate vector dimensions for searching. No consensus is reached regarding the potential error in Leon's example.

Contextual Notes

Participants discuss the implications of matrix dimensions and vector representations without resolving the underlying assumptions about the definitions of documents and search words.

schaefera
Hi all. I'm studying from Leon's Linear Algebra with Applications. I'm wondering if one of his examples has an error.

In Application 2, he's talking about searching databases. He says we should imagine a database of m documents and n possible search words. Then he says that this can be put into an mxn matrix in which each column represents a book: the jth entry of the column is 1 if that book contains the jth word, 0 if it doesn't.

He then says a "search vector" lives in R^m, and it's a column vector whose jth entry is 1 if you are searching for the jth word.

Does he have his m's and n's mixed up? It seems to me like if you have m documents and each column of the matrix represents one, you need an nxm matrix. Similarly, you can't search through n words by using a vector in R^m, correct? Does he mean you take an nxm matrix and multiply its transpose by the vector in R^n?
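
The dimension check behind this question can be written out explicitly. This is only a sketch: the symbols A, x and y are labels introduced here for illustration and are not taken from Leon's text.

```latex
A \in \mathbb{R}^{n \times m} \ (\text{one column per document}), \quad
x \in \mathbb{R}^{n} \ (\text{one entry per search word})
\;\Longrightarrow\;
y = A^{T} x \in \mathbb{R}^{m}, \qquad
y_i = \sum_{j=1}^{n} a_{ji}\, x_j .
```

Under these assumptions, y_i counts how many of the flagged search words appear in document i, and a vector in R^m could not be multiplied by A^T at all unless m = n.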
 
Any ideas?!
 
I think he has it right: remember you have m records with n attributes, so you have m rows (one row for each record) and n columns (one column for each attribute).

Now let's say you have a column vector with some entries set. If you want to find out how many of the search words were found, you simply add up the positive hits, and this is just a matrix multiplication with a column vector: to search for k words, set the k corresponding entries of the column vector to 1 and the rest to zero. Each entry of the product is then the number of hits for that record, so a value > 0 means at least one of the words was found.

He is basically checking one word at a time: he sets one entry to 1 and the rest to zero, and a 1 in the result for a record means the word was found and a 0 means it wasn't.
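
A minimal sketch of the single-word search described here, assuming the convention from this reply (rows are records, columns are search words). The matrix `A`, the helper `matvec` and the data are all made up for illustration.

```python
# Hypothetical database: 4 documents (rows) and 3 search words (columns).
# A[i][j] is 1 if document i contains word j, and 0 otherwise.
A = [
    [1, 0, 1],  # document 0 contains words 0 and 2
    [0, 1, 0],  # document 1 contains word 1
    [1, 1, 0],  # document 2 contains words 0 and 1
    [0, 0, 0],  # document 3 contains none of the words
]


def matvec(matrix, vector):
    """Multiply an m x n matrix (a list of rows) by a length-n vector."""
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("vector length must equal the number of columns")
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


x = [0, 1, 0]        # search for word 1 only; x has n = 3 entries
hits = matvec(A, x)  # the result has m = 4 entries, one per document
print(hits)          # [0, 1, 1, 0]
```

With this layout the search vector needs one entry per word (length n) and the result has one entry per document (length m). Passing a length-m vector makes `matvec` raise an error, which is the mismatch the original question is about.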
 
But if you have n search words, how can that be embedded in a vector in R^m?
 
You either do n individual searches with n individual search vectors (each with only one non-zero entry), or you can return the number of hits by flagging every term you are interested in with a 1 and keeping everything else zero.

Imagine three terms: a normal search for the first term would be [1 0 0]^T. To do three individual searches, one for each term, you have [1 0 0]^T, [0 1 0]^T and [0 0 1]^T respectively. To find out how many terms appear in each document, considering all terms, your vector will be [1 1 1]^T, and if we have, say, 6 observations and we get the output [1 1 1 3 2 1]^T, then it means that the fourth has all three, the fifth has two and the rest have only one.
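
The three-term example can be reproduced with a small sketch. The 6 x 3 matrix `D` below is invented so that the combined search gives the output quoted above; it is one of many matrices that would do so, and `matvec` is a hypothetical helper, not something from Leon.

```python
# Hypothetical data: 6 documents (rows) and 3 terms (columns).
D = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 1],  # fourth document contains all three terms
    [1, 1, 0],  # fifth document contains two terms
    [0, 0, 1],
]


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector of matching length."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Three individual searches, one unit vector per term.
singles = [matvec(D, e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]

# One combined search that flags every term.
combined = matvec(D, [1, 1, 1])
print(combined)  # [1, 1, 1, 3, 2, 1]

# Adding the three individual results entry by entry gives the same counts.
summed = [sum(column) for column in zip(*singles)]
print(summed == combined)  # True
```

The combined vector gives only the number of matching terms per document. To see which terms matched, the individual results in `singles` are still needed.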
 
