Register to reply

Random Sampling: Set Inclusion

by Monte_Carlo
Tags: inclusion, random, sampling
Share this thread:
Monte_Carlo
#1
Feb12-12, 06:48 PM
P: 73
Hello,

I don't really know if this is considered a challenging problem but this is not for homework:

You're given a set of numbers S of size n. From S, you draw a random sample A, |A| < n. From S, you draw a random sample B, |B| < n. Sampling doesn't remove items from S. What is the probability that intersection of A and B has size 1?

Monte
Phys.Org News Partner Science news on Phys.org
Apple to unveil 'iWatch' on September 9
NASA deep-space rocket, SLS, to launch in 2018
Study examines 13,000-year-old nanodiamonds from multiple locations across three continents
SteveL27
#2
Feb12-12, 09:31 PM
P: 800
Quote Quote by Monte_Carlo View Post
Hello,

I don't really know if this is considered a challenging problem but this is not for homework:

You're given a set of numbers S of size n. From S, you draw a random sample A, |A| < n. From S, you draw a random sample B, |B| < n. Sampling doesn't remove items from S. What is the probability that intersection of A and B has size 1?

Monte
I didn't finish the problem but I made a good start. I think the probability is

[ 2^(n-1) * (2^n - 2) + n ] / 2^(n+1)

The general idea is to create a multiplication table with all the subsets as row and column headers; then fill in the intersections. When you do this, you can then count the singletons by row. By convention you order the subsets as the empty set first, then all the singletons, etc. That part doesn't matter except that I want the first and last rows of the multiplication table to be the empty set and the entire set, respectively, for the following discussion.

There are 2^n subsets, so we have a 2^n X 2^n table whose grids are filled in with the intersection of the row and column header.

With the convention that the empty set is the top row, the top row of the table contains zero singletons; since every set intersected with the empty set is empty.

The bottom row is the entire set S, which acts as an identity for the operation of set intersection. So the bottom row contains exactly n singletons, one for each element of the set S.

There are 2^n - 2 remaining rows. (There are 2^n total rows, minus the top and bottom).

Claim/guess/conjecture: Each of these remaining rows (not the top, not the bottom) contains exactly 2^(n-1) singletons. This part has to be proved. In other words for each row, exactly half of the intersections are singletons.

If this is true then the total number of singletons is the number of non-top/bottom rows, which is 2^n - 2; times the number of singletons for each of those rows, which is 2^(n-1); plus the singletons in the bottom row, which is n. The sum of those three terms is the total number of singletons; and to get the probability you just divide by the total number of possible pairings of 2^n subsets, which is 2^(n+1).

Perhaps something I wrote will inspire someone to see through to the end of this.

(edit) My conjecture's wrong, but there's a pattern to how many singletons there are per row. It amounts to asking: for a given subset of a set, how many other subsets contain exactly one of the elements of the first set and no others. For example for some S with more than 3 elements, if you pick the subset {1,2,3}, you are asking: how many other subsets of S contain exactly one of either 1, 2, or 3. This sounds like it must be a well-known combinatorial problem.
awkward
#3
Feb13-12, 07:16 PM
P: 327
Here is a combinatorial approach. We will assume [itex]|A| + |B| \leq n + 1[/itex].

There are [tex]\binom{n}{A} \binom{n}{B}[/tex] ways to select the two subsets, all of which we assume are equally likely. We want to count the number of arrangements in which A and B have exactly one element in common. There are n ways to pick the common element. Then there are [tex]\binom{n-1}{|A|-1}[/tex] ways to select the remaining elements of A. Except for the one common element, A and B are disjoint, so there are [tex]\binom{n-|A|}{|B|-1}[/tex] ways to pick the remaining elements of B. So there are
[tex]n \binom{n-1}{|A|-1} \binom{n-|A|}{|B|-1}[/tex]
ways to pick the elements of A and B, and the probability that A and B will have exactly one element in common is
[tex]p = \frac{n \binom{n-1}{|A|-1} \binom{n-|A|}{|B|-1}}{\binom{n}{A} \binom{n}{B}}[/tex]
which simplifies to
[tex]p = \frac{|A| \; |B| \; (n-|A|)! \; (n - |B|)!} {(n - |A| - |B| + 1)! \; n!}[/tex]

Monte_Carlo
#4
Nov14-12, 06:26 PM
P: 73
Random Sampling: Set Inclusion

Could you check your logic with the actual computation? I know the answer is correct, but I'm not quite sure it follows from your calculation.
Monte_Carlo
#5
Nov14-12, 06:30 PM
P: 73
Never mind, I got it.


Register to reply

Related Discussions
Down sampling, bandpass sampling theorem, downconversion Electrical Engineering 1
Projection or Inclusion? General Math 4
Inclusion exclusion Calculus & Beyond Homework 0
Sampling random numbers from a distribution Set Theory, Logic, Probability, Statistics 2
Stratified Random Sampling - Am I wrong? Academic Guidance 0