Hash Tables + probability question

  • Thread starter FlashStorm
  • Start date
  • Tags
    Probability
In summary, the homework question is asking for a lower bound on the length of a probe sequence, and the student was unable to find a solution.
  • #1
FlashStorm
17
0
Now before I'll post the question , I want to be clear. Technically , this WAS a homework question. BUT since I don't have to submit it (because I finished with the homework submitting process , and now studying for the test), I think it is legitimate to post here the homework file + I am doing it because I couldn't really solve it. (and there is no official solutions because our lecturers and his assistance are pretty god damn lazy:( )

now this is question 6 (with the hash table of size 2n to store n elements).
I've solved the first three sub-questions:

the first one is upper bounded by p=1/2 and it distributes as geometric random variable.
the second one is just substituting in the first sub question.
the third one is basically the sum of all n Xi insertions with the answer of sub question b.

Now, the forth questions deals somehow with Markov's inequality and about to prove that the expected length of probes is O(logn).

So I know that Markov's inequality means :
P(x>=t)<= E(X)/t

and although i tried to used this fact to get an upper bound , I couldn't find any solution or a lead.

Note: This is a new question which didn't appear on previous years assignments. If you have a reason to believe this question was made inappropriately or maybe even some mistakes.

Looking forward for your help
and thank you,
Aviv
 

Attachments

  • ex8.pdf
    45.8 KB · Views: 248
Physics news on Phys.org
  • #2
Rewrite P(x>t) < E(X)/t as t*P(x>t) < E(X), or E(X) = t*P(x>t) + v for some v > 0. Show that t*P(x>=t) + v is O(Log n).
 
  • #3
It was the first thing i have done, obviously. yet couldn't get any results.

according to the questions P(X>=t)<= n/2^t (according to question 6.1 the probability to probe more then k times is 2^-k , sum it n times and you will get n/2^t).

So that leads us to nt/2^t<=E(X).

after doing log on both sides (log2) i get logn+logt-t<=log (E(X)). since t can be any number, I can't block it or take it out. I lack insights about that situation. But even if I do manage to get E(X)>={log expression}. All I get is a lower bound. then I will have to show an upper bound.

:(
Thanks in advance
 
Last edited:
  • #4
If t [itex]\approx[/itex] 1, then log t [itex]\approx[/itex] t - 1.
 
  • #5
I don't think t has any estimations, why would t[tex]\approx[/tex]1 . t is the number of probes
but even if it had anything like it i would get that E(X)=[tex]\Omega[/tex](n), which is the complete contrary.
 
  • #6
Are you saying that what you have does not work for t = 1?
 
  • #7
Yes, I think at least. t is the number of probes on the array. and the question says "use markov's inequality to prove that the longest probe sequence is E(X)=O(logn)".

there is no reason to believe that t = 1 (it should even be much bigger to my opinion), and even if it were true, i would still get the wrong answer.
 
Last edited:

1. What is a hash table and how does it work?

A hash table is a data structure that stores data in key-value pairs. It uses a hashing function to map the key to a specific index in the table, making it efficient to retrieve and insert data. When data is inserted, the key is hashed and the resulting index is used to store the value. When data is retrieved, the key is hashed again and the corresponding index is searched for the value.

2. What are the advantages of using a hash table?

Hash tables have a constant average time complexity of O(1) for both insertion and retrieval, making them efficient for storing and retrieving large amounts of data. They also have a smaller memory footprint compared to other data structures, as they do not require additional pointers or overhead for storage.

3. Can collisions occur in a hash table and how are they handled?

Yes, collisions can occur in a hash table when two different keys result in the same index. This is known as a hash collision. There are several methods for handling collisions, such as separate chaining and open addressing. Separate chaining involves storing multiple values in the same index using a linked list, while open addressing involves finding an alternative index to store the value.

4. How does the size of a hash table affect its performance?

The size of a hash table can affect its performance in terms of collisions. A smaller table may result in more collisions, while a larger table may have more empty spaces. It is important to choose an appropriate size for the data being stored to minimize collisions and maintain efficiency.

5. What is the probability of a collision occurring in a hash table?

The probability of a collision occurring in a hash table depends on the size of the table and the number of keys being inserted. As the number of keys increases, the probability of a collision also increases. However, with a well-designed hashing function and an appropriately sized table, the probability of collisions can be reduced to a negligible amount.

Similar threads

  • Set Theory, Logic, Probability, Statistics
2
Replies
57
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
0
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
6
Views
969
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
338
  • Set Theory, Logic, Probability, Statistics
Replies
33
Views
2K
  • Set Theory, Logic, Probability, Statistics
Replies
13
Views
1K
  • Engineering and Comp Sci Homework Help
Replies
1
Views
954
  • Set Theory, Logic, Probability, Statistics
Replies
4
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
1
Views
1K
  • Engineering and Comp Sci Homework Help
Replies
4
Views
2K
Back
Top