MHB Lucene Indexing: Clearing Confusion

  • Thread starter: shivajikobardan
AI Thread Summary
The discussion centers on the indexing process in Lucene, specifically addressing confusion around the first and third steps as depicted in a provided diagram. The first step involves the creation of an IndexWriter, which is essential for routing documents into the index, handling their lexing and parsing. The third step's specifics are less clear, leading to questions about its function compared to the first step. Participants clarify that the diagram represents an architectural overview rather than a flowchart, indicating that the steps are part of the document incorporation process into the Lucene index. The IndexWriter's capabilities to add, update, and delete indexes are acknowledged, reinforcing its importance in the indexing workflow. The discussion highlights the need for clarity in understanding these steps, especially in the context of academic expectations.
shivajikobardan
https://lh5.googleusercontent.com/47guV-L3yY2ZevuNEwk1wC9t9rJjQw0bXNHug16ah2EQ2XyLTAzqrBZcDMEwzFSd1mR_jFDTOFyG1GVHT8p1G4tPPRkRtqtOcOGXTb3UrildRHMayRznHaFQD9RdCdjeuEvyM-FkvQ_U3GLBHGVgkFY
These are the steps of indexing in Lucene given in our syllabus:
https://lh4.googleusercontent.com/Qliby1unxHTU0vAwycWZzy563XdxcwUT4UAyA6Xf1ydKQAwSfKwqexdDFFc0CBZb9kSSvRKXEFoyKQ4cYn9K2EgEDRnWTiYFCDlqZ4VCAw9CWvgvcI9cOJo055PCJhyFTBJckNhtLi-eAMwM7q8JoUU

https://lh3.googleusercontent.com/X9geIBuCERbSadCkshBekbjvl4GAqGvHppgCayGcOBvaIkpIX1Jy5jyFSmmp39ANIyg3cq0tYWrYTxl1RNlOUfbHFAcNy5CJLxfWGve6DpjeXXekNwTl3P64zQ1_6dojvdo4-Z8aTFn-EZ51CqOlLfE
I understand the second step clearly, but I don't understand the first and third steps. They aren't explained clearly in this figure, in my opinion. Can you clear up my confusion? Also, the sources I refer to don't describe it like this; they explain it differently. I'm not sure where this was copied from.
What are we doing in the first step versus the third step, as written in the figure text?
Why is the IndexWriter created first and then not used later? From what I've gathered, you can also use the IndexWriter to add, remove, and update indexes.

I have a strong feeling that all of this information is incorrect, but this is what's written in my teacher's notes, so I'm not 100% sure. And even if it's wrong, they'd expect us to write the same thing in the exam, so I have to learn it.
 
shivajikobardan said:
But I don't understand the first and third step.
Are you talking about the first 3 steps? The "build index" step and "IndexWriter object" step?

The top diagram is simply an architecture diagram, not a flow diagram. It shows the role that Lucene fills in the search/index process.

The 3 steps at the top, with the IndexWriter, represent the lower left portion of that architecture. This is where the documents are incorporated into the Lucene index, so that they can later be searched (that's the upper right portion of the diagram).

The IndexWriter is simply the mechanism that routes documents into the index. It handles the lexing and parsing of the documents (through the analyzer it is configured with) to prepare them for the actual index. And, yes, the IndexWriter can also add, update, and delete documents in the index.
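
To make that role a bit more concrete, here is a minimal sketch in plain Java. It is a toy and not Lucene's API: ToyIndexWriter, analyze, and search are hypothetical names made up for illustration, and the "analysis" is assumed to be nothing more than lowercasing and splitting on non-alphanumeric characters. It only tries to show the idea that one writer object is created once and then used for adding, updating, and deleting documents in an inverted index.

```java
import java.util.*;

// Toy stand-in for the role an index writer plays. NOT Lucene's API;
// all names here are hypothetical and chosen for illustration only.
class ToyIndexWriter {
    // Inverted index: term -> ids of the documents that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();
    // Document id -> its terms, kept so a document can be removed again.
    private final Map<String, Set<String>> termsByDoc = new HashMap<>();

    // Simplified "analysis": lowercase, then split on non-alphanumerics.
    static Set<String> analyze(String text) {
        Set<String> terms = new TreeSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Adds a new document; rejects an id that is already indexed.
    void addDocument(String id, String text) {
        if (termsByDoc.containsKey(id)) {
            throw new IllegalArgumentException("already indexed: " + id);
        }
        Set<String> terms = analyze(text);
        termsByDoc.put(id, terms);
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
    }

    // Removes a document and every posting that pointed to it.
    void deleteDocument(String id) {
        Set<String> terms = termsByDoc.remove(id);
        if (terms == null) {
            return; // nothing indexed under this id
        }
        for (String term : terms) {
            Set<String> ids = postings.get(term);
            ids.remove(id);
            if (ids.isEmpty()) {
                postings.remove(term);
            }
        }
    }

    // Update is modeled as delete followed by add.
    void updateDocument(String id, String text) {
        deleteDocument(id);
        addDocument(id, text);
    }

    // Returns a copy of the ids of documents containing the term.
    Set<String> search(String term) {
        Set<String> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new TreeSet<>() : new TreeSet<>(ids);
    }

    public static void main(String[] args) {
        ToyIndexWriter writer = new ToyIndexWriter();
        writer.addDocument("1", "Lucene builds an index");
        writer.addDocument("2", "The index is searched later");
        writer.updateDocument("1", "Lucene routes documents");
        System.out.println(writer.search("index"));
    }
}
```

The point of the sketch is that the same writer object stays in use for the whole life of the index. In real Lucene the analysis is done by a separate Analyzer that the IndexWriter is configured with, and searching goes through different classes, so treat this only as a mental model.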

Let me know if you have other questions. I know this response is a bit late, so it may no longer be timely.
 