Java How to make a search engine by Java ?

Todee · May 1, 2013

want some sites to teach me how to develop a simple web-based search engine that demonstrates the main features of a search engine (web crawling, indexing and ranking) and the interaction between them.
Using Java

Adyssa · May 1, 2013

I'm going to suggest that you take the CS101 course over at Udacity because it will teach you exactly what you want to know, how to build a search engine. They teach it using Python, but the code is not complex and you could easily adapt it to Java.

One thing to be aware of. At the end of the course, you don't so much have a working search engine, as you have all the components that are required. The reason they don't give you a working program is because a search engine involves a fair amount of web etiquette - meaning you have the power to hit web servers with thousands upon thousands of requests, and before you unleash yours onto the world, you want to make sure that you are acting in a courteous manner. Particularly in the testing phase.

The actual code for web-crawling and indexing involves making a request to some seed page, getting the HTML back, parsing the HTML for links, and then recursively following those links and parsing the new HTML for more links, until you run out of room in your index, or the links stop.

Ranking can be done in many ways. Google's algorithm is fairly well documented around the web. It basically says, for any page, the rank is a measure of how many other pages link to this page, and the rank of those other pages. A high ranked page linking to your page, increases your rank by a larger factor than a low ranked page linking to your page.

Todee · May 2, 2013

thank you

Java How to make a search engine by Java ?

Thread 'Star maps using Blender'

Thread 'Who is responsible for the software when AI takes over programming?'

Thread 'Leading AI systems blackmailed their human users'

Similar threads

Hot Threads

Touch-typing for programmers

How to calculate Tension for a series of connected points?

Python Complaining About Python

Fortran Reading files in pre-f77 - handling end of file

Sequential Analog Computers?

Recent Insights

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers