Comp Sci: Creating an index vs adding a document to an index

  • Thread starter: shivajikobardan
  • Tags: Index
AI Thread Summary
Creating an index in Lucene involves the initial setup where the structure for storing documents is established, while adding a document to the index refers to the process of inserting actual content into that pre-defined structure. The discussion highlights the steps of indexing, including word collection, analysis, and the formation of an inverted index. It emphasizes that although these processes may seem interconnected, they are distinct stages in the indexing workflow. Users are advised to focus on utilizing the API for practical applications rather than getting bogged down in the underlying mechanics. Understanding the difference between creating an index and adding documents is crucial for effective use of Lucene.
shivajikobardan
Homework Statement
Lucene indexing
Relevant Equations
none

These are the steps of indexing in Lucene given in our syllabus:
[Images: the indexing steps from the syllabus]

The first step says that it creates an index, whereas the last step says that it adds a document to the index.
What is the difference between these two? Can I get an example?

Here's what I think should happen:
1) Collect all the words from each document and list them like this:

doc1 => word1, word2, WORD3, ..., wordn
doc2 => word1, WORD2, word3, ..., wordn
and so on.

2) Analyze the words: remove some kinds of words and process the rest, as the analyzer dictates.

Say what remains is:
doc1 => word1, word3, ..., word(n-1)
doc2 => word2, ..., word(n-3)

3) Done. Now you can build the inverted index by inverting this mapping.

But it's done a bit differently, which I'm not 100% clear about.
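The three steps above can be sketched in a few lines of Python. This is only a toy model of the idea, not how Lucene implements it: the stop-word list, the function names (`tokenize`, `analyze`, `invert`), and the sample documents are all made up for illustration, and the "analyzer" here does nothing more than lowercase words and drop stop words.

```python
from collections import defaultdict

# Hypothetical stop-word list; a real analyzer's list will differ.
STOP_WORDS = {"the", "a", "is", "of"}

def tokenize(text):
    """Step 1: collect the raw words of a document."""
    return text.split()

def analyze(tokens):
    """Step 2: lowercase each word and drop stop words."""
    lowered = (t.lower() for t in tokens)
    return [t for t in lowered if t not in STOP_WORDS]

def invert(analyzed_docs):
    """Step 3: turn doc -> terms into term -> sorted list of doc ids."""
    postings = defaultdict(set)
    for doc_id, terms in analyzed_docs.items():
        for term in terms:
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

# Made-up sample documents.
docs = {
    "doc1": "The quick FOX",
    "doc2": "A lazy fox is quick",
}
analyzed = {doc_id: analyze(tokenize(text)) for doc_id, text in docs.items()}
inverted = invert(analyzed)
```

After this runs, `analyzed` holds the per-document term lists from step 2, and `inverted` maps each surviving term to the documents that contain it.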
 
shivajikobardan said:
whereas the last step says that it's adding document to index.
No, it doesn't. What you are calling the "last" step simply creates a document; adding it to the index is a separate step.
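To make the three stages concrete, here is a minimal Python sketch. `ToyIndex` and its `add_document` method are hypothetical stand-ins written for this illustration, not Lucene's API; in Lucene the stages roughly correspond to opening an `IndexWriter`, building a `Document`, and calling `addDocument` on the writer.

```python
class ToyIndex:
    """Hypothetical stand-in for an index; this is not Lucene's API."""

    def __init__(self):
        # Creating the index: the structure exists but holds no documents.
        self.postings = {}   # term -> set of document ids
        self.doc_count = 0

    def add_document(self, doc):
        # Adding a document: its terms are merged into the existing structure.
        doc_id = self.doc_count
        self.doc_count += 1
        for term in doc["body"].lower().split():
            self.postings.setdefault(term, set()).add(doc_id)
        return doc_id

# Stage 1: create the (empty) index.
index = ToyIndex()

# Stage 2: create a document. The index is still unchanged at this point.
doc = {"body": "Lucene builds an inverted index"}
count_before_add = index.doc_count

# Stage 3: add the document to the index.
doc_id = index.add_document(doc)
```

The point of the sketch is that `count_before_add` is still 0 after the document has been created; the index only changes once `add_document` is called.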

shivajikobardan said:
What's the difference between these two? Can I get an example.
Can you get an example of the difference between creating a thing and adding something to that thing? Are you serious?

shivajikobardan said:
Here's what I think it should happen-:
...
This is all done behind the scenes. You don't have to worry about any of it to use Lucene; you just need to learn how to use the API. A good place to learn that is the API documentation itself: https://lucene.apache.org/core/9_2_0/core/index.html
 