(Complicated?) Probability problem

  • Thread starter KStolen
  • Start date
  • Tags
    Probability
In summary: N / (M + N -1))2)In summary, the average number of iterations needed for at least 50% of the first grid to be covered by squares from the second grid is given by the formula n = log(1 - p) / log(1 - (N / (M + N -1))2), where M > N > 0 and p is the fraction of cells already covered. This solution is based on the probability of a cell being covered in a single iteration and the relationship between the size of the first and second grids.
  • #1
KStolen
14
0
I've got a problem I'd like to solve and I haven't done any probability in years.

Here's the problem.

There are two M*M grids of squares. An N*N mask is placed randomly on the second grid, such that it covers at least one square, but up to N*N squares. (i.e, the majority of the mask may lie outside the grid)

If the covered squares are then copied into the corresponding squares in the first grid, then n is the average number of times I will have to perform this step to have covered at least 50% of the first grid with squares from the second grid.

What is n in terms of N and M?

M > N > 0

I don't even know where to begin. Taking repeating squares into account seems especially difficult. Anyone have any tips for this?
 
Physics news on Phys.org
  • #2
Anyone?
 
  • #3
Tell us whether this is a textbook problem - then we'll have more hope! Offhand it looks like a very hard problem.

I assume you want some kind of formula as an answer. The practical answer would be to do Monte-Carlo simulations.

One suggestion (I won't call it a "tip" because I don't know it would work) is to analyze the problem in the style of induction. Assume you know the formula for an n by n square and see how that answer relates to the formula for an (n+1) by (n+1) dimensional square.
 
  • #4
KStolen said:
I've got a problem I'd like to solve and I haven't done any probability in years.

Here's the problem.

There are two M*M grids of squares. An N*N mask is placed randomly on the second grid, such that it covers at least one square, but up to N*N squares. (i.e, the majority of the mask may lie outside the grid)

If the covered squares are then copied into the corresponding squares in the first grid, then n is the average number of times I will have to perform this step to have covered at least 50% of the first grid with squares from the second grid.

What is n in terms of N and M?

M > N > 0

I don't even know where to begin. Taking repeating squares into account seems especially difficult. Anyone have any tips for this?

The general case is difficult, so let’s look at the simplest case, with N = 1, where the geometry of the currently-covered squares does not matter. At any stage if k squares are covered (i.e., already selected) then the probability that a new square will be covered next is p(k) = 1 – k/M^2, while the probability that we just re-cover an existing square is q(k) = k/M^2. This is a Markov chain with states 0, 1, 2, …, M^2 (representing the number of currently-covered squares). At time 0 we start in state 0, then move to state 1 with probability 1 at time 1. From state 1, a time-step either takes us back to state 1 again (probability = q(1)) or takes us to state 2 (probability = p(1)). From state 2 we stay in state 2 with probability q(2) or move to state 3 with probability p(2), and so forth. Given a particular K (say near M^2/2) you want the expected number of steps needed to reach state K, starting from state 0. This is the expected first-passage time, and can be found by a fairly straightforward recursive procedure, as follows. Let x(j) = expected first passage-time from state j < K to state K. We have x(K-1) = 1 + q(K-1)*x(K-1), so x(K-1) = 1/p(K-1) [remember: 1-q(K-1) = p(K-1)]. For j < K-1 we have x(j) = 1 + q(j)*x(j) + p(j)*x(j+1), hence x(j) = [1 + p(j)*x(j+1)]/p(j) = x(j+1) + 1/p(j). So, from x(K-1) we can easily get x(K-2), then can get x(K-3),… and eventually get x(0), which is the answer we want. With a bit of work we get Answer = x(0) = 1 + 1/p(1) + 1/p(2) + 1/p(3) + ... + 1/p(K-1). [You could also get this by recognizing that the number of steps needed to reach a new square has a geometric distribution, so we need the expected value of independent but non-identically-distributed geometric random variables.]

Even the case N = 2 is much, much more difficult. Now not only is the area of already-covered squares important, but also their “geometry” (that is, the shape of the figure already covered). It could be that the problem is sufficiently hard that the most reasonable approach to it would be to do Monte-Carlo simulation (basically, an experimental method).

RGV
 
  • #5
Tell us whether this is a textbook problem - then we'll have more hope! Offhand it looks like a very hard problem.

Unfortunately it's not a textbook problem - it came up in a programming project.

Thanks to both of you - Ray and Stephen. I hadn't realized it was so difficult for the general case. I had hoped for a formula into which I could plug in my values, instead of having to determine the value of n experimentally. I'm able to simulate it fairly easily, so I suppose that will have to do.

Thanks again :)
 
  • #6
I started off by thinking the analytical soltion was highly complicated, requiring the analysis of many different branching possibilities. But now I wonder...

The NxN square can be laid in a total of (M + N - 1)2 positions.

Each cell of the MxM square is covered by a total of N2 different positions of the NxN square.

The probability of any individual cell of the MxM square being covered in a single iteration is therefore
N2 / (M + N - 1)2.

After n iterations, the probability of any individual cell NOT being covered is therefore
(1 - N2 / (M + N - 1)2)n.

When a fraction p of the cells are covered, 1 - p are not covered so we have
1 - p = (1 - N2 / (M + N - 1)2)n

n = log(1 - p) / log(1 - (N / (M + N -1))2)

If you are going to do some Monte Carlo simulation I would love to so how it compares.
 
  • #7
Hmmm, don't need Monte Carlo simulation, it's fairly easy to enumerate the (M + N -1)2 possible placings of the tile as MxM matrices and then simply add matrices together n times for a total of (M + N - 1)k branches where k = 2n.

The formula (1 - (N / (M + N - 1))2)n gives the expected value of the proportion of cells not covered after n placings. Taking M = 5, N = 3 and n=3 I have verified that the mean number of cells not covered is 64000/117649 (note 117649=492 ) exactly as given by the formula.

Looks like my intuition about scale/translation invariance was correct.
 
  • #8
MrAnchovy said:
Hmmm, don't need Monte Carlo simulation, it's fairly easy to enumerate the (M + N -1)2 possible placings of the tile as MxM matrices and then simply add matrices together n times for a total of (M + N - 1)k branches where k = 2n.

The formula (1 - (N / (M + N - 1))2)n gives the expected value of the proportion of cells not covered after n placings. Taking M = 5, N = 3 and n=3 I have verified that the mean number of cells not covered is 64000/117649 (note 117649=492 ) exactly as given by the formula.

Looks like my intuition about scale/translation invariance was correct.

Your formula does not give the correct result for n (the mean number of steps until a certain fraction of cells are covered) in the case of n = 1, although it *does come close* numerically. Given M = m^2 cells, if we want to cover K cells, the expected number of steps we need is
S = sum(M/(M-k),k=0..K). Basically, the n = 1 problem is much like the classical Coupon Collector's Problem, but instead of asking for how many coupons we need to have a complete set we merely ask for an incomplete set (K out of M). For more on the coupon collector's problem, see http://en.wikipedia.org/wiki/Coupon_collector's_problem .

RGV
 
  • #9
Ah yes, I have answered a different question - thanks for pointing it out.

I have determined the number of steps required to achive a mean coverage of 50% of the cells, not the mean number of steps required to cover 50% of the cells.

Oops!
 
  • #10
Ah yes, I have answered a different question - thanks for pointing it out.

I have determined the number of steps required to achive a mean coverage of 50% of the cells, not the mean number of steps required to cover 50% of the cells.

Oops!

Hey, thanks for the reply. Although I initially thought that I wanted the mean number of steps required to cover 50% of the cells, I think the number of steps required to achive a mean coverage of 50% of the cells might be better.

If n is the mean number of steps required to cover 50% of the cells and
o is the number of steps required to achieve a mean coverage 50% of the cells,

then would I be correct in saying that in several iterations of o steps, the number of cells covered in each step would be closer to 50% than in several iterations of n steps?
 

What is probability?

Probability is a measure of the likelihood that an event will occur. It is typically expressed as a number between 0 and 1, with 0 representing impossibility and 1 representing certainty.

What is a complicated probability problem?

A complicated probability problem is a problem that involves multiple variables, complex scenarios, and/or difficult calculations. These types of problems often require advanced mathematical knowledge and critical thinking skills to solve.

How do you approach a complicated probability problem?

To approach a complicated probability problem, it's important to carefully read and understand the problem, identify the given information and what is being asked, and then determine which probability formula or method is most appropriate to use. It may also be helpful to break the problem down into smaller, more manageable steps.

What are some common mistakes to avoid when solving a complicated probability problem?

Some common mistakes to avoid when solving a complicated probability problem include not carefully reading and understanding the problem, using the wrong probability formula or method, and making calculation errors. It's also important to make sure all assumptions and conditions are met before applying a probability formula.

How can I improve my skills in solving complicated probability problems?

To improve your skills in solving complicated probability problems, it's important to practice regularly, review basic probability concepts and formulas, and seek help from others if needed. It may also be helpful to read and solve different types of complicated probability problems to gain a better understanding of how to approach them.

Similar threads

  • Calculus and Beyond Homework Help
Replies
3
Views
550
  • Calculus and Beyond Homework Help
Replies
2
Views
1K
  • Calculus and Beyond Homework Help
Replies
1
Views
2K
  • Introductory Physics Homework Help
Replies
6
Views
204
  • Calculus and Beyond Homework Help
Replies
4
Views
1K
  • Calculus and Beyond Homework Help
Replies
9
Views
4K
  • Calculus and Beyond Homework Help
Replies
14
Views
2K
  • Calculus and Beyond Homework Help
Replies
6
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
2K
  • Calculus and Beyond Homework Help
Replies
3
Views
1K
Back
Top