Probability of Outcomes in Merge Shuffle

  • Thread starter: ardentra
  • Tags: Probability
I've been banging my head on this problem for a few days and haven't gotten very far, so I'm hoping someone has some insights. I know this reads a bit like a homework problem, but it comes from personal investigation, not from an assignment.

For those of you who don't know, Merge Sort is an algorithm that takes a list of elements to be sorted, L, splits it into halves L1 and L2, and keeps splitting those halves until every list has length 1. The lists are then recombined in the correct order, pair by pair, until a single list remains, which is now sorted. Lists are not sets: order matters, and in this case you can only add items onto the front of lists.

Say you had the list of numbers 3,9,8,0,2,1,5,7,4,6. Performing Merge Sort would look like this (I removed commas for convenience):

3980215746

39802 15746

39 802 15 746

3 9 8 02 1 5 7 46

3 9 8 0 2 1 5 7 4 6

39 8 02 15 7 46

39 028 15 467

02389 14567

0123456789

During the merge steps, the leftmost remaining items of L1 and L2 are compared to one another, and the item with the smaller value is moved to the new list. This step is repeated until one of the lists is empty, at which point the items left in the other list are added without any further comparison.
So a single merge would operate like this for lists L1 = 2,3 and L2 = 1,4:

Compare 1 and 2
Add 1 to the recombined list
Compare 2 and 4
Add 2 to the recombined list
Compare 3 and 4
Add 3 to the recombined list
Add 4 to the recombined list
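
For anyone who prefers code, the merge step and the full sort described above could be sketched roughly like this in Python. This is only an illustration: the names `merge` and `merge_sort` are my own, the split point `len(items) // 2` is inferred from the worked example (where 5 items split into 2 and 3), and items are appended to the end of a Python list so that the output order matches the walkthrough.

```python
def merge(left, right):
    """Merge two sorted lists by repeatedly taking the smaller front item."""
    merged = []
    i = j = 0
    # Compare the leftmost remaining items while both lists have items.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted: the rest of the other is added without comparing.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Split down to lists of length 1, then recombine with merge()."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2  # assumed split point; 5 items -> 2 and 3
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


if __name__ == "__main__":
    print(merge([2, 3], [1, 4]))
    print(merge_sort([3, 9, 8, 0, 2, 1, 5, 7, 4, 6]))
```

Calling `merge([2, 3], [1, 4])` goes through the same comparisons as the steps listed above.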


So that is Merge Sort. Merge Shuffle works the same way, except that during the recombining stage no comparison is made: at each add step there is a 50% chance that the next number is taken from L1 and a 50% chance that it is taken from L2. As before, once one list is empty, the rest of the other list is added as is.
So combining lists L1 = 1,2 and L2 = 3,4 might operate like this:

Select 1 or 3
Add 3 to the recombined list
Select 1 or 4
Add 1 to the recombined list
Select 2 or 4
Add 4 to the recombined list
Add 2 to the recombined list

Which would create the newly shuffled list L = 3142.
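
A rough Python sketch of Merge Shuffle as described above, in case it helps to experiment. The function names and the `take_left` parameter are hypothetical, not part of any library; `take_left` is just a callable that returns True when the next item should come from the left list, so that a scripted sequence of choices can stand in for the fair coin. The split point `len(items) // 2` is again an assumption taken from the worked Merge Sort example.

```python
import random


def fair_coin():
    """Return True with probability 1/2."""
    return random.random() < 0.5


def merge_shuffle_step(left, right, take_left=fair_coin):
    """Recombine two lists, choosing the source list at random each step."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if take_left():
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is empty: the remaining items are added without a coin flip.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_shuffle(items, take_left=fair_coin):
    """Split down to lists of length 1, then recombine with random merges."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2  # assumed split point, as in the Merge Sort example
    return merge_shuffle_step(
        merge_shuffle(items[:mid], take_left),
        merge_shuffle(items[mid:], take_left),
        take_left,
    )


if __name__ == "__main__":
    # Scripted choices right, left, right reproduce the example above.
    choices = iter([False, True, False])
    print(merge_shuffle_step([1, 2], [3, 4], lambda: next(choices)))
    print(merge_shuffle([3, 9, 8, 0, 2, 1, 5, 7, 4, 6]))
```

With the scripted choices right, left, right, the step takes 3, then 1, then 4, and finally adds the leftover 2, which is the 3142 example.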

I am fairly sure of the following. Consider a single add step during a merge. Take the element at position e in list E = E1, ..., En of length n, and a position t in the target list T = T1, T2, ..., of which x positions have already been filled. Let d = t - x, and let o be the number of items remaining in the other list being merged. Then the probability that the element ends up in position t is:

if e > d then probability = 0
if e + o < d then probability = 0
otherwise probability = 0.5^d
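
One way to test an expression like this on small cases is to enumerate every sequence of coin flips exactly. The sketch below is my own construction, not something from the original post: `landing_distribution` is a hypothetical helper, it assumes the merge starts with an empty target list (x = 0, so d = t), a fair coin while both lists are non-empty, and that the remainder is added directly once the other list runs out.

```python
from collections import defaultdict
from fractions import Fraction


def landing_distribution(n, o, e):
    """Exact distribution of the target position t of element e of E.

    n: length of E; o: length of the other list; e: 1-based position in E.
    Assumes the merge starts with an empty target list (x = 0, so d = t).
    """
    if not 1 <= e <= n:
        raise ValueError("e must satisfy 1 <= e <= n")
    dist = defaultdict(Fraction)

    def walk(taken_e, taken_o, prob):
        if taken_o == o:
            # Other list exhausted: the rest of E is added in order, so
            # element e lands right after all o items of the other list.
            dist[e + o] += prob
            return
        if taken_e == e - 1:
            # Coin picks E: element e itself fills the next free slot.
            dist[taken_e + taken_o + 1] += prob / 2
        else:
            # Coin picks E: an earlier item of E is taken.
            walk(taken_e + 1, taken_o, prob / 2)
        # Coin picks the other list instead.
        walk(taken_e, taken_o + 1, prob / 2)

    walk(0, 0, Fraction(1))
    return dict(dist)


if __name__ == "__main__":
    for e in (1, 2):
        print(e, landing_distribution(2, 2, e))
```

For two lists of length 2, this gives the first element of E probabilities 1/2, 1/4, 1/4 of landing in positions 1, 2, 3. The enumeration adds up every interleaving that reaches a given slot, and it handles the case where the other list runs out, so it can be compared term by term against the expression above.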

So this is the question I want to solve: Given input list I of length n, and I containing no repeated items, what is the probability that performing Merge Shuffle on I produces I as the result?

And more generally, given input list I of length n, and I containing no repeated items, what is the probability of producing the given result LI?
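
For small n, both questions can at least be checked by brute force. The following is a minimal sketch under my own assumptions (split at `len(items) // 2`, fair coin while both lists are non-empty, remainder added directly); `merge_outcomes` and `shuffle_distribution` are hypothetical names. It returns the exact probability of every reachable output as a fraction, but the work grows very quickly with n, so it is only practical for short lists.

```python
from collections import defaultdict
from fractions import Fraction


def merge_outcomes(left, right):
    """Yield (merged_tuple, probability) for each coin-flip path of one merge."""
    if not left or not right:
        # One list is empty: the remainder is added with probability 1.
        yield left + right, Fraction(1)
        return
    for rest, p in merge_outcomes(left[1:], right):
        yield left[:1] + rest, p / 2
    for rest, p in merge_outcomes(left, right[1:]):
        yield right[:1] + rest, p / 2


def shuffle_distribution(items):
    """Map each possible Merge Shuffle output (as a tuple) to its probability."""
    items = tuple(items)
    if len(items) <= 1:
        return {items: Fraction(1)}
    mid = len(items) // 2  # assumed split point
    dist = defaultdict(Fraction)
    for left, p_left in shuffle_distribution(items[:mid]).items():
        for right, p_right in shuffle_distribution(items[mid:]).items():
            for merged, p_merge in merge_outcomes(left, right):
                dist[merged] += p_left * p_right * p_merge
    return dict(dist)


if __name__ == "__main__":
    dist = shuffle_distribution([1, 2, 3, 4])
    print(dist[(1, 2, 3, 4)])  # probability of getting the input back
    print(dist[(3, 1, 4, 2)])  # probability of the 3142 example
```

The first lookup answers the "produces I" question for one small list, and any other key answers the more general question for that particular target.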
 
If n is a power of 2, this looks easy to analyze.
If n is not, that looks ugly - you have to keep track of all combinations of k with k+1 elements somewhere. Let f(I) be the result of a merge shuffle. I think f(I)=I can be studied with some casework (look at n=2, n=3, n=4, n=5, ..., try to find a recursion formula), but I don't know how to get the more general result.
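
A possible recursion for the f(I)=I case, as a sketch. It assumes the list is split at n // 2 (as in the worked example, where 5 items split into 2 and 3), a fair coin while both halves are non-empty, and that the remainder is added directly once one half is empty. A merge never changes the relative order inside a half, so with distinct items the output equals I only if both halves come back unchanged and the merge then takes the entire left half first, which needs k = n // 2 consecutive coin flips in favour of the left half. The name `identity_probability` is hypothetical.

```python
from fractions import Fraction


def identity_probability(n):
    """Probability that Merge Shuffle returns its input of n distinct items."""
    if n <= 1:
        return Fraction(1)
    k = n // 2  # size of the left half under the assumed split
    # Both halves unchanged, then k straight picks from the left half.
    return (
        identity_probability(k)
        * identity_probability(n - k)
        * Fraction(1, 2 ** k)
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, identity_probability(n))
```

For n = 2, 3, 4, 5 this gives 1/2, 1/4, 1/16, 1/32, and for n = 8 it gives 1/4096.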
 