Lucene Indexing: Clearing Confusion

shivajikobardan · Jul 29, 2022

https://lh5.googleusercontent.com/47guV-L3yY2ZevuNEwk1wC9t9rJjQw0bXNHug16ah2EQ2XyLTAzqrBZcDMEwzFSd1mR_jFDTOFyG1GVHT8p1G4tPPRkRtqtOcOGXTb3UrildRHMayRznHaFQD9RdCdjeuEvyM-FkvQ_U3GLBHGVgkFY
These are the steps of indexing in Lucene given in our syllabus:
https://lh4.googleusercontent.com/Qliby1unxHTU0vAwycWZzy563XdxcwUT4UAyA6Xf1ydKQAwSfKwqexdDFFc0CBZb9kSSvRKXEFoyKQ4cYn9K2EgEDRnWTiYFCDlqZ4VCAw9CWvgvcI9cOJo055PCJhyFTBJckNhtLi-eAMwM7q8JoUU

https://lh3.googleusercontent.com/X9geIBuCERbSadCkshBekbjvl4GAqGvHppgCayGcOBvaIkpIX1Jy5jyFSmmp39ANIyg3cq0tYWrYTxl1RNlOUfbHFAcNy5CJLxfWGve6DpjeXXekNwTl3P64zQ1_6dojvdo4-Z8aTFn-EZ51CqOlLfE
I understand the second step clearly. But I don't understand the first and third steps. They aren't explained clearly in this figure, imo. Can you clear up my confusion? Also, the sources I refer to don't describe it like this; they explain it differently. I'm not sure where this was copied from.
What are we doing in the first vs. the third step, as written in that figure text?
Why is the IndexWriter created first and then not used later? According to the information I've collected, you can also use the IndexWriter to add/remove/update indexes.

I have a strong feeling that all of this information is incorrect, but this is what's written in my teacher's notes, so I'm not 100% sure. And even if it's wrong, they'd expect us to write the same thing in the exam, so I have to learn it.

whartung · Aug 13, 2022

shivajikobardan said:

But I don't understand the first and third steps.

Are you talking about the first 3 steps? The "build index" step and "IndexWriter object" step?

The top diagram is simply an architecture diagram, not a flow diagram. It shows the role that Lucene fills in the search/index process.

The 3 steps at the top, with the IndexWriter, represent the lower left portion of that architecture. This is where the documents are incorporated into the Lucene index, so that they can later be searched (that's the upper right portion of the diagram).

The IndexWriter is simply the mechanism that routes documents into the index. It handles the analysis (tokenizing) of the documents, through the Analyzer it is configured with, to prepare them for the actual index. And yes, the IndexWriter can also add, update, and delete documents in the index.
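
To make that role concrete, here is a minimal sketch in plain Java. It is not Lucene code: `ToyIndexWriter` and its `analyze` and `search` methods are hypothetical names made up for illustration, and the "analysis" is just lowercasing and splitting on non-alphanumeric characters. The method names loosely mirror Lucene's `IndexWriter.addDocument`, `updateDocument`, and `deleteDocuments`, and, as in Lucene, an update is modeled as a delete followed by an add.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy class: NOT Lucene's IndexWriter, only a model of its role.
class ToyIndexWriter {
    // Inverted index: term -> ids of the documents that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // "Analysis" step: lowercase the text and split it on non-alphanumeric characters.
    static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Add a document: analyze it, then record its id under each term.
    void addDocument(String id, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(id);
        }
    }

    // Delete a document: remove its id from every posting list.
    void deleteDocument(String id) {
        postings.values().forEach(ids -> ids.remove(id));
        postings.values().removeIf(Set::isEmpty);
    }

    // Update a document: delete the old version, then add the new one.
    void updateDocument(String id, String text) {
        deleteDocument(id);
        addDocument(id, text);
    }

    // Search side: look up one term and return matching ids in sorted order.
    Set<String> search(String term) {
        return new TreeSet<>(postings.getOrDefault(term.toLowerCase(Locale.ROOT), Set.of()));
    }

    public static void main(String[] args) {
        ToyIndexWriter writer = new ToyIndexWriter();
        writer.addDocument("1", "Lucene builds an index");
        writer.addDocument("2", "The index is searched later");
        System.out.println(writer.search("index"));   // [1, 2]
        writer.updateDocument("1", "Lucene analyzes documents");
        System.out.println(writer.search("index"));   // [2]
        writer.deleteDocument("2");
        System.out.println(writer.search("index"));   // []
    }
}
```

The writer is created once and then reused for every add, update, and delete, which is why it appears as the first step. The real Lucene index stores far more than this (positions, stored fields, segments), so treat the sketch only as a picture of the indexing side versus the search side.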

Let me know if you have other questions. I know this response is a bit late, so it may no longer be timely.
