Markov Chain aggregation method

SUMMARY

The discussion focuses on implementing a Markov Chain aggregation method to derive the top 10 search results from three different search engines. The process begins with a uniform distribution across a set of 30 results and transitions between states based on ranking preferences. The transition matrix P is defined such that P(i, j) reflects the majority preference for page j over page i, while a random jump mechanism with a probability of δ = 1/7 ensures a unique limiting distribution. The final sorting of results is based on the non-increasing values of this limiting distribution, as outlined in the referenced paper by Dwork et al.

PREREQUISITES
  • Understanding of Markov Chains and their applications in ranking systems
  • Familiarity with transition matrices and their properties
  • Knowledge of probability theory, particularly in the context of random processes
  • Basic comprehension of PageRank algorithms and their implementations
NEXT STEPS
  • Study the concept of Markov processes and their limiting distributions
  • Learn about constructing and analyzing transition matrices in Markov Chains
  • Research the role of random jumps in algorithms like PageRank
  • Examine the paper by Dwork et al. for deeper insights into ranking methodologies
USEFUL FOR

This discussion is beneficial for data scientists, algorithm developers, and researchers interested in search engine optimization and ranking algorithms using Markov Chains.

adohertyd
I am using a Markov Chain to get the 10 best search results from the union of 3 different search engines. The top 10 results are taken from each engine to form a set of 30 results.

The chain starts at a state x drawn uniformly at random from the set S = {1, 2, 3, ..., 30}. If the current state is page i, select a page j uniformly at random from the union of the results from each search engine. If j is ranked above i by at least 2 of the 3 engines that rank both i and j, move to j; otherwise, remain at i.
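To make the walk concrete, here is a minimal Python sketch of the process described above. The three rank dictionaries are made-up stand-ins for the engines' actual top-10 lists (with only 3 pages instead of 30), so the numbers are purely illustrative:

```python
import random

# Hypothetical example rankings: each engine maps page -> rank (1 = best).
# In the real setting these would be the three engines' top-10 lists.
engine_ranks = [
    {1: 1, 2: 2, 3: 3},
    {1: 2, 2: 1, 3: 3},
    {1: 1, 2: 3, 3: 2},
]
pages = [1, 2, 3]

def majority_prefers(j, i):
    """True if a majority of the engines that rank both i and j rank j above i."""
    votes_j, votes_both = 0, 0
    for ranks in engine_ranks:
        if i in ranks and j in ranks:
            votes_both += 1
            if ranks[j] < ranks[i]:   # lower rank number = better
                votes_j += 1
    return votes_both > 0 and votes_j > votes_both / 2

def step(i):
    """One step of the chain: pick j uniformly, move only if the majority prefers j."""
    j = random.choice(pages)
    return j if majority_prefers(j, i) else i

# Simulate the walk and count how often each page is visited.
state = random.choice(pages)            # uniform starting distribution
visits = {p: 0 for p in pages}
for _ in range(100_000):
    state = step(state)
    visits[state] += 1
```

With these toy rankings, page 1 is majority-preferred over both other pages, so the walk ends up spending nearly all of its time there; the fraction of time spent at each page is exactly the x(i) quantity the paper sorts by.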

I understand the above with no problem. However, I am stuck on the sorting step. The paper I am using explains it as follows:

This is known as a Markov process, where the transition matrix P has P(i, j) = \frac{1}{n} if a majority of the input rankings prefer j to i, and P(i, i) = 1 - \sum_{j \neq i} P(i, j). Under certain conditions, this process has a unique (up to scalar multiples) limiting distribution x that satisfies x = xP, where x(i) gives the fraction of time the process spends at element i. Dwork et al. propose sorting the elements by non-increasing x(i) values. To ensure that the process has a unique limiting distribution x, we use a "random jump": with probability \delta > 0, we will choose a random element and move to this element (regardless of whether this element is preferred to the current element). In our experiments we have used \delta = \frac{1}{7}, which is the value of \delta that is often chosen in the literature for PageRank implementations.

Could someone please explain this to me in plain English, because I am completely lost with it at this stage?
The paper can be found at http://www.siam.org/proceedings/alenex/2009/alx09_004_schalekampf.pdf, with this specific part on page 43.
 
Is this the part that you need explained?:

adohertyd said:
Under certain conditions, this process has a unique (up to scalar multiples) limiting distribution x that satisfies x = xP, where x(i) gives the fraction of time the process spends at element i.
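In plain terms: x is the vector of long-run visit frequencies of the walk, and instead of simulating, you can compute it directly from the transition matrix. A sketch of the quoted construction in Python with NumPy, using a made-up 4-page `prefer` matrix as placeholder data (and one common way of mixing in the \delta = 1/7 random jump, by blending P with the uniform jump matrix):

```python
import numpy as np

# Hypothetical preference matrix: prefer[i][j] = True if a majority of the
# input rankings prefer page j to page i (placeholder data, 4 pages:
# page 0 beats everyone, page 3 loses to everyone).
prefer = np.array([
    [False, False, False, False],
    [True,  False, False, False],
    [True,  True,  False, False],
    [True,  True,  True,  False],
])

n = prefer.shape[0]
delta = 1 / 7   # random-jump probability, as in the paper

# Majority-step matrix: P(i, j) = 1/n if j is preferred to i,
# and P(i, i) = 1 - sum of the off-diagonal entries of row i.
P = np.where(prefer, 1 / n, 0.0)
np.fill_diagonal(P, 0.0)
np.fill_diagonal(P, 1.0 - P.sum(axis=1))

# Mix in the random jump so the limiting distribution is unique:
# with probability delta, jump to a uniformly random page.
P = (1 - delta) * P + delta / n

# Power iteration for the stationary distribution x = xP.
x = np.full(n, 1 / n)
for _ in range(1000):
    x = x @ P

# Sort pages by non-increasing x(i): the aggregated ranking.
order = np.argsort(-x)
```

Here `order` lists the pages from the one the walk visits most often down to the one it visits least often, which is exactly the sorted output Dwork et al. propose.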
 
