Number of Nodes as Neighbors: Probability Question

  • Thread starter: randomuser11
  • Tags: Nodes, Probability

Discussion Overview

The discussion revolves around a probability question related to an unstructured overlay network where nodes randomly choose neighbors to flood a request for a file. Participants are trying to understand the probability calculations involved in determining how many nodes will be reached during this flooding process.

Discussion Character

  • Homework-related
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant expresses confusion about the probability part of the solution, particularly regarding the summation and how it relates to counting combinations of nodes.
  • Another participant explains the probability q and provides a breakdown of the formula used to calculate it, suggesting that it enumerates combinations of nodes that are reached.
  • A participant questions why nodes that may be repeated at the third level are not considered, raising concerns about double-counting and the implications for the formula.
  • There is a discussion about whether the formula should account for not sending messages back to the original node, with some suggesting that the problem does not explicitly state this restriction.
  • One participant presents a diagrammatic example to illustrate potential repetition issues at different levels of flooding, prompting further questions about the complexity of the situation.
  • Another participant requests clarification on the diagram, indicating that the visual representation may not be clear to everyone involved.

Areas of Agreement / Disagreement

Participants express varying levels of understanding regarding the probability calculations and the implications of node repetition. There is no consensus on how to handle the counting of nodes, particularly concerning potential double-counting and the specifics of the flooding process.

Contextual Notes

Participants note the complexity of the problem, including the assumptions about node behavior and the potential for repeated nodes in the flooding process. The discussion highlights the need for clarity in the mathematical reasoning and the definitions used in the problem.

randomuser11

Homework Statement


Consider an unstructured overlay network in which every node randomly chooses c neighbors. To search for a file, a node floods a request to its neighbors and requests those to flood the request once more. How many nodes will be reached?


Homework Equations




The Attempt at a Solution


I have the solution already (from the textbook), but it doesn't explain it very well. Can you please explain this solution to me? I don't understand the probability part (I do understand the c*(c-1)).

TEXTBOOK SOLUTION:
An easy upper bound can be computed as c*(c-1), but in that case we ignore the fact that neighbors of node P can be each other's neighbor as well. The probability q that a neighbor of P will send a message only to nonneighbors of P is 1 minus the probability of sending it to at least one neighbor of P:
q = 1 - \sum\limits_{k=1}^{c-1} \binom{c-1}{k} \left( \frac{c}{N-1}\right)^k \left( 1-\frac{c}{N-1} \right)^{c-1-k}

In that case, this flooding strategy will reach c \times q(c-1) nodes. For example, with c = 20 and N = 10,000, a query will be flooded to 365.817 nodes.
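
A quick way to check the textbook's number is to evaluate the sum directly. The following is only a sketch of the quoted formula; the function names `prob_only_non_neighbors` and `expected_reached` are made up for illustration and do not come from the textbook.

```python
from math import comb


def prob_only_non_neighbors(c, n_nodes):
    """q from the quoted formula: 1 minus P(at least one send hits a neighbor of P)."""
    p = c / (n_nodes - 1)  # per-send chance of hitting a neighbor of P, as in the formula
    at_least_one = sum(
        comb(c - 1, k) * p**k * (1 - p) ** (c - 1 - k)
        for k in range(1, c)  # k = 1 .. c-1, matching the summation limits
    )
    return 1 - at_least_one


def expected_reached(c, n_nodes):
    """The textbook's estimate c * q * (c - 1)."""
    return c * prob_only_non_neighbors(c, n_nodes) * (c - 1)


if __name__ == "__main__":
    print(expected_reached(20, 10_000))
```

For c = 20 and N = 10,000 this evaluates to roughly 365.8, in line with the textbook's figure and a little below the upper bound of 380.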


So, that's the textbook's answer... but I'm really confused! I don't understand the probability part. I mean, I understand what it's trying to do, but I'm totally lost on where the numbers or summation is coming from. I would REALLY appreciate an explanation of this! I think it's trying to enumerate all combinations and give them a probability in some way, but I'm lost at how they arrived at this. Our teacher said this was an "almost trivial" problem... and when I submitted my homework, I submitted c \times (c-1) and then I saw this and almost fell over.

Thanks in advance!
 

hi randomuser11! welcome to pf! :smile:
randomuser11 said:
The probability q that a neighbor of P will send a message only to nonneighbors of P is 1 minus the probability of sending it to at least one neighbor of P:
q = 1 - \sum\limits_{k=1}^{c-1} \binom{c-1}{k} \left( \frac{c}{N-1}\right)^k \left( 1-\frac{c}{N-1} \right)^{c-1-k}

the probability that the second flood from one node in the first flood will include exactly k nodes from the c-1 other nodes in the first flood (and therefore c-1-k nodes not from the first flood) is

\binom{c-1}{k} \left( \frac{c}{N-1}\right)^k \left( 1-\frac{c}{N-1} \right)^{c-1-k}

isn't it? :wink:
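
As a side note that is not part of the original reply: summing that term over k = 1, ..., c-1 gives the probability of hitting at least one first-flood node, and the binomial theorem collapses the sum. Writing p = c/(N-1) as shorthand (an abbreviation introduced here), and reading the formula as c-1 independent sends that each hit a neighbor of P with probability p:

```latex
\sum_{k=0}^{c-1} \binom{c-1}{k} p^k (1-p)^{c-1-k} = \bigl(p + (1-p)\bigr)^{c-1} = 1
\quad\Longrightarrow\quad
q = 1 - \sum_{k=1}^{c-1} \binom{c-1}{k} p^k (1-p)^{c-1-k} = (1-p)^{c-1}
```

So q is just the k = 0 term: the chance that none of the c-1 sends lands on a neighbor of P.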
 
I'm confused. :( why don't we consider nodes that are repeated at the third level then? Because some nodes from the second level may also be neighbors with more than one node on the third level? Do we just ignore that case then?

And why wouldn't it be c-2 because we wouldn't send a message back to the node that originally sent it and we wouldn't send a message to ourselves, so wouldn't it be c-2? It's just confusing..
 
hi randomuser11! :smile:
randomuser11 said:
why don't we consider nodes that are repeated at the third level then? Because some nodes from the second level may also be neighbors with more than one node on the third level? Do we just ignore that case then?

because that's double-counting

if you count a repeated node the second time, you're counting it twice!

when you count, it's essential to count everything only once! :smile:

so in practice, what you do is you count them twice, and subtract the number that you counted twice: the ∑ is the number you counted twice

randomuser11 said:
And why wouldn't it be c-2 because we wouldn't send a message back to the node that originally sent it ..

yes, you could: the question doesn't say no-backsies! :wink:
 
My point was something like this. The formula seems to be trying to correct for cases like this (p is the top level node):

  p
 / \
n - n

This is the case where c = 2 and the neighbors are repeated on the second level. But what about a case like this:

  p
 / \
n   n
 \ /
  n

It seems like we could also have the problem of repetition at the third level, right (as in the above example)? This seems like a different case than the case at the second level...
Or am I just overthinking it?

Thanks!
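
One way to see how much the two kinds of repetition described above matter is to simulate the flood and count them separately. This is a rough sketch under assumptions the problem statement does not fix: links are treated as directed random choices, each first-level node forwards to c - 1 distinct nodes, and it never sends to itself or back to P. The function name `simulate_two_hop_flood` is hypothetical.

```python
import random


def simulate_two_hop_flood(n_nodes, c, rng):
    """Run one trial; return (new_nodes, hits_on_first_level, second_level_repeats)."""
    # Node 0 plays the role of P; it picks c distinct neighbors at random.
    first_level = set(rng.sample(range(1, n_nodes), c))
    second_level = set()
    hits_on_first_level = 0
    second_level_repeats = 0
    for sender in first_level:
        # Assumption: each sender forwards to c - 1 distinct nodes,
        # never to itself and never back to P.
        picks = [t for t in rng.sample(range(1, n_nodes), c) if t != sender]
        for target in picks[: c - 1]:
            if target in first_level:
                hits_on_first_level += 1  # repetition like the first diagram
            elif target in second_level:
                second_level_repeats += 1  # repetition like the second diagram
            else:
                second_level.add(target)
    return len(second_level), hits_on_first_level, second_level_repeats


if __name__ == "__main__":
    rng = random.Random(42)
    trials = [simulate_two_hop_flood(10_000, 20, rng) for _ in range(200)]
    labels = ("new nodes", "first-level hits", "second-level repeats")
    for label, column in zip(labels, zip(*trials)):
        print(f"{label}: {sum(column) / len(column):.2f}")
```

With n_nodes = 3 and c = 2 a trial always returns (0, 2, 0), which is the triangle in the first diagram: both forwarded messages land on P's other neighbor. For larger networks the third counter shows how often two different senders pick the same new node, which is the case the formula for q, as quoted, does not mention.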
 
i honestly don't understand your diagrams :confused:

can you say it in words? :smile:
 
