# Genetics-hardy-weinberg related question

Assume a sex-linked trait that does not kill before mating. Using X for X-chromosomes and Y for Y chromosomes, x for an x chromosome bearing the trait in question; assuming also that all females mate with all males in their own generation only, if a female carrier (xX) mates with a normal male (XY), F1 looks like : XX, Xx, XY, xY. The male xY has the trait, so 1/4 of F1 has the phenotype. If F1 is allowed to mate randomly, F2 looks like: 4 xX, 3 XX, 1 xx, 6 XY, 2 xY, for 16 offspring. So two males express the trait in F2. Absent an arithmetic error, 46 males out of the 256 in F3 have the trait.

My question is, does the proportion of afflicted males stabilize for F_n as n gets large?

This seems related to H-W equilibrium or maybe fixed-point problems, but I don't see an obvious way to put the problem in simple mathematical terms.
_______
Defining XX = a, Xx = b, xx = c, XY = d, xY = e, with the convention that a-e on the LHS have subscript _n+1 and on the RHS a-e have subscript _n.

b = 2ae + bd + be + 2cd + ce
c = be + ce
d = 2ad + 2ae + bd + be + ce
e = bd + be + 2cd + ce

If F_1 has cardinality xX = XX = xY = XY = 1 this iterative system seems to give the correct answers. I am cross-posting this question to Linear Algebra.

As an afterthought, I think the Hardy-Weinberg law does not apply, since the requirement that F_n only mate within its generation (etc.) makes the mating nonrandom.

epenguin
Homework Helper
Gold Member

It's hard. No it's easy. No it's hard. Well it's easy once you've seen it. I think. (It's late nite here.)

At all times you're going to have half males, half with a Y. So only worry about that half. When you're at equilibrium what's available in what fractions to team up with the Y?

Then clarify to yourself why you're not at equilibrium in those two cases you already calculated.

If you can calculate the progression towards equilibrium that would be magnificent.

It's early here and coffee is scarce. I will look at this again in light of your note. The aspect of your note that confuses me is the phrase "when you're at equilibrium," and then, "the progression towards equilibrium." This seems to be a contraction mapping and you are suggesting it's easy to see if we only look at males. So we are never at equilibrium, but approaching it. At any rate thanks for the hint. I shall return.

epenguin

"I shall return."

Hope so. I hate the people who never come back when it gets interesting.

To correct and make things clearer I am re-labeling. In general there are six possible crossings, labeled below as p, q, r, s, t, u. There are five types of individuals, labeled below as a, b, c, d, e. We can find F_n+1 from F_n in the following way.

If I want to know how many of type a there will be in F_n+1, note that each cross of type p produces 2 of type a and each cross of type q produces one a. No other crosses result in type a progeny. This can be expressed:

a' = 2p + q.

In this way we arrive at five equations:

a' = 2p + q
b' = q + 2r + s + 2t
c' = s + 2u
d' = q + s + 2p + 2t
e' = q + 2r + s + 2u.

We happen to know the first values for a,b,c,d,e, (they are 0,1,1,1,1) and these give us our first values for p,q,...,u. Using the dictionary below, p = a*d = 0*1 = 0; q = b*d = 1*1 = 1. So

a' = 0 + 1 = 1.

In this way we find that a'=1, b'=4, c'=3,d'=2,e'=6. These updated figures can be used to find the next values for p,q,r,s,t,u. We see that there are two afflicted males in F2, out of 16 individuals.

This does not address the question of convergence, but I think linear algebra has an answer to that question. Is it easy? Yes. Simple? No. I did try looking at the males alone. I do not see a simpler approach and will be interested to hear more about it.

___________________________
Dictionary

xx=a, xX=b, XX = c, xY=d, XY=e.

p:= a*d = xx,xx,xY,xY
q:= b*d = xx,xX,xY,XY
r:= c*d...etc.
s:=b*e
t:=a*e
u:=c*e
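A minimal Python sketch of the crossing bookkeeping above, for anyone who wants to check the first generations. The function name `next_generation` is made up for illustration. Note the coefficient 2 on t in b': each xx × XY cross gives two xX daughters, which agrees with the 2ae term in the a..e form of the equations.

```python
def next_generation(pop):
    """One generation from the crossing counts p..u (xx=a, xX=b, XX=c, xY=d, XY=e)."""
    a, b, c, d, e = pop
    p, q, r = a * d, b * d, c * d   # crosses with an xY father
    t, s, u = a * e, b * e, c * e   # crosses with an XY father
    return (
        2 * p + q,              # a': xx daughters
        q + 2 * r + s + 2 * t,  # b': xX daughters
        s + 2 * u,              # c': XX daughters
        2 * p + q + s + 2 * t,  # d': xY sons
        q + 2 * r + s + 2 * u,  # e': XY sons
    )

f1 = (0, 1, 1, 1, 1)            # the starting values used in the post
f2 = next_generation(f1)
f3 = next_generation(f2)
print(f2, sum(f2))              # (1, 4, 3, 2, 6) 16
print(f3, sum(f3))              # (12, 56, 60, 48, 80) 256
```

The xY entry is 2 out of 16 and then 48 out of 256, matching the figures quoted elsewhere in the thread.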

epenguin

OK, that's an effort and I will tell you what I thought, subject to criticism (simple-seeming probabilistic calculations are notoriously full of traps).

First, in case your problem is not clear to other readers as it was not quite to me, I am assuming we are calculating as if we were starting from a large population of Xx females and XY males. We know the overall frequencies of chromosome types and we have to calculate the final equilibrium of the different diploid types, and hopefully the dynamics of getting towards that.

Now at all times, including at equilibrium there are equal numbers of males and females in the population. So the number of people with one Y always equals that with none. So consider the half with one Y (males). They have to get into their cell an X-type chromosome, either X or x - there aren't any YY people. These X-types are present in the population in the ratio X:x = 2:1. So a third will get x, a third of the males and 1/6 of the population will be xY it seems to me.

I think you can probably use this way of looking to get much simpler equations that can be solved.

Anyway, before it fades, you ought to write down why it takes time, i.e. why you are not at equilibrium at F1.

To answer your later question if you set initial conditions as equilibrium then it stays there. (In practice if it is a stable equilibrium, in theory always.) If you start away from equilibrium then with time you approach it nearer and nearer but never get exactly there. This is true in chemistry, population genetics and other places. (Except when it isn't, but that is for much more advanced classes. )

Thanks.

No, I'm starting from Xx*XY, a carrier female and a normal male. If we started with a large population I think we'd have H-W conditions and statistical stability. Instead, we begin with one pair. We can take as our original vector F_2 = {0,1,1,1,1}, or F_1 = {0,1,0,0,1}. Each generation can only mate with peers. The proportion of xY in F_1 is 1/4, which shrinks to 48/256 in F_3, so for sure we don't have stability initially.

I do think that one could prove |F_n - F_n+1| < |F_n-1 - F_n| or something like this, suggesting a limit for F_n as n -> oo.

So given this, I don't see that 1/3 of males are xY....

epenguin

Re-viewing that it gets simpler. Without my long phrases, when things are fully mixed (i.e. at equilibrium) 1/3 of the males are xY just because 1/3 of the sex chromosomes around that they have to have besides Y are x!

The reason I said large populations - I said 'as if' - was not to have to talk about more complicated things like statistical distribution of possibilities. I mean that has to be the assumption to get the distributions you mentioned in your first paragraph. Or else you should have said 'the most probable distributions'. If you as you now say you begin with one pair there is a definite chance they will produce things like only xY male progeny, only XY male progeny or for that matter only male progeny or only female progeny, chances that diminish the larger the population. (You see how you can lose traits easier in smaller populations). Instead of a large population you can ask for the most probable outcome; I think it amounts to the same thing - thinking of a large population is how you work out the most probable outcome.

I haven't done it but I think if you look at it in this simple way you could quite easily work out an equation relating, say (xY)n+1 to (xY)n and not only find your convergence, but solve it.

" If you as you now say you begin with one pair there is a definite chance they will produce things like only xY male progeny, only XY male progeny or for that matter only male progeny or only female progeny, chances that diminish the larger the population."
________

I should have been clearer about the nature of the question. I am requiring that each cross result in its expected progeny. Xx*XY gives Xx, XX, xY, XY. No more, no less. The question is completely hypothetical. Otherwise, as you note, the equations make little sense, and for the reason you give!

Well, the question is not completely hypothetical. We can always ask the extent to which populations behave like this, and, to that extent, we may be able to draw conclusions based on this admittedly contrived model.

epenguin

I don't think it is that contrived if we define the problem as finding the most probable outcome from two parents who found an isolated expanding population - a good starting point for other more difficult questions at least.

xx=a, xX=b, XX = c, xY=d, XY=e.
________________________________
In good conscience, in case someone tries to do this, I will put down in terms of a,...,e the equations that actually finally worked for me in Mathematica.

b'=2ae+bd+be+2cd
c'=be + 2ce
e'=2ae+bd+be+2cd+2ce

It's a bit of book-keeping. The sum (a+b+c+d+e)_n should look like 4,16,256,256^2,...etc.

This is a book-keeping headache. Hopefully final corrected set:

b'=2ae+bd+be+2cd
c'=be + 2ce
d'=bd + be +2ad + 2ae
e'=bd+be+2cd+2ce

The last set of equations quickly shows a:b:c:d:e approaches 1:4:4:3:6, but on reviewing your note, I finally see that if we know that the proportion of x:X in each generation is 1:2, we can use elementary probability to get these (equilibrium) ratios. For example, the probability of being a female times the (independent) probabilities of getting x then x is (1/2)(1/3)(1/3) = 1/18.
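The same elementary-probability argument can be written out for all five types. This is only a sketch: the function name `equilibrium` is invented here, and it assumes (as stated above) that one third of the X-type chromosomes are x and that half the population is female.

```python
from fractions import Fraction

def equilibrium(q):
    """Proportions (xx, xX, XX, xY, XY) when a fraction q of X-type chromosomes is x."""
    q = Fraction(q)
    half = Fraction(1, 2)             # probability of being female (or male)
    return (
        half * q * q,                 # xx: female, x from each parent
        half * 2 * q * (1 - q),       # xX: female, one x and one X in either order
        half * (1 - q) ** 2,          # XX
        half * q,                     # xY: male, x from the mother
        half * (1 - q),               # XY
    )

props = equilibrium(Fraction(1, 3))
ratios = [int(x * 18) for x in props]
print(props[0])    # 1/18
print(ratios)      # [1, 4, 4, 3, 6]
```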

Thanks for that insight.

epenguin

Yes, you have noticed as I also with delay did, that you have an unusual 1/9 of the females affected at equilibrium because homozygous. This is because everyone in your scenario is descended from just one couple with no other input. Most usually the fraction of females affected by a negatively selected trait is very small because the total fraction of affected chromosomes is already small, so their square is very small.

It would still be nice if we could find the dynamics of progress towards that equilibrium, and I guess if you concentrated on the males and then on how the X or x gets into their genome you could easiest get equations much simpler than the ones you gave. (I do not know what 'worked for me in Mathematica' means.)

Since there is no selection or randomness in your model you can think of it as a 'conservation of chromosome proportions' law.

BTW I got 48 not 46 in F3.

"...BTW I got 48 not 46 in F3."
______________

Yes it's 1,2,48, 10240,...,etc. "Worked for me in Mathematica" means that I programmed the equations in Mathematica and entered a set of initial conditions (a vector). It's easy to do the first few generations by hand but to see the variables approach the theoretical distribution you might have to get through 8-9 generations (a daunting task for a mortal). It's not a proof that a limiting value exists, much less a proof of that value, but it's a quick practical check and only takes a second to get 15 generations. At 15 generations d/(a+b+c+d+e) is about 0.166672 (a lot like 1/6) and d is an integer with more than 9800 digits.
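For readers without Mathematica, here is a rough Python equivalent of that check. It is a sketch, not the program actually used in the thread: the name `step` is made up, F1 is taken to be (0,1,1,1,1), and the a' line is not in the posted list but is filled in from a' = 2p + q with p = a*d and q = b*d.

```python
from fractions import Fraction
import math

def step(pop):
    """Apply the a..e equations once (xx=a, xX=b, XX=c, xY=d, XY=e)."""
    a, b, c, d, e = pop
    return (
        2*a*d + b*d,                  # a' (assumed, from a' = 2p + q)
        2*a*e + b*d + b*e + 2*c*d,    # b'
        b*e + 2*c*e,                  # c'
        b*d + b*e + 2*a*d + 2*a*e,    # d'
        b*d + b*e + 2*c*d + 2*c*e,    # e'
    )

pop = (0, 1, 1, 1, 1)                 # F1
afflicted = [pop[3]]                  # xY counts, generation by generation
for _ in range(14):                   # advance from F1 to F15
    pop = step(pop)
    afflicted.append(pop[3])

total = sum(pop)
ratio = Fraction(pop[3], total)       # exact value of d/(a+b+c+d+e) in F15
digits = math.floor(math.log10(pop[3])) + 1

print(afflicted[:4])                  # [1, 2, 48, 10240]
print(round(float(ratio), 6))         # 0.166672
print(digits)                         # number of decimal digits of d in F15
```

Python integers are arbitrary precision, so the counts stay exact; the digit count is taken from a logarithm rather than by converting the integer to a string.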

Yes it would still be interesting to get the dynamics of progress toward the limit. I will think about that. I'm not sure there are simpler equations but maybe so. One reason I don't think the equations simplify is that the approach of d to 1/6 (1/3 of males) depends on the relative availability of x and X in the entire population. But I'm not sure.

epenguin

It's great when computer calculations come out so close to what you predicted mathematically - you feel the chances of that happening by accident are very small.

Better not shout Eureka yet, but I've now got a simple formula that for F1, F2, F3 agrees with your calculated figures. Tell me whether your computer calcs give you, for the x/(x + X) ratio in the males, F4 0.3125, F5 0.34375? I did not use Mathematica nor even a scientific calculator.

Note the trend is not monotonic - neither are your first three.

One has to try and find a simple point of view. I suggest looking above all at the males and trying to formulate in terms of the total fraction of maternal chromosome in the males that is x. That will avoid the huge numbers you were dealing with.


Yes, right again. I will look at this.

While I am looking for your simpler formula for affected males, it may be that the equations are the simplest possible for studying the effect of differing starting proportions generally. For example, we might ask what is the effect on the ultimate proportion of d = xY when we vary the starting proportion of b =xX females slightly. Borrowing two of your symbols, Δd/Δb or more generally ∂P/∂b, where P is the final population {a,b,c,d,e}_final, might be of interest. For example, if I increase the proportion of xX by 20% in F2 (which was 0,1,1,1,1), there is only a 4-5% increase in the final proportion of xY. One could also ask whether any of the fixed points (stable final populations) are "attractors" in the sense of being a common solution of different initial populations (I don't think so).

The equations also allow for the inclusion at the start of xx (recessive) females, in any desired proportion.

Here is, I think, an interesting pattern. If the starting vectors are listed as rows:

xx  xX  XX  xY  XY
0   1   1   1   1
1   0   1   1   1
1   1   0   1   1
1   1   1   0   1
1   1   1   1   0

the final population proportions are:

xx  xX  XX  xY  XY
1   4   4   3   6
1   2   1   2   2
4   4   1   6   3
1   4   4   3   6
4   4   1   6   3

from which we see immediately that taking out the "normal" male XY has the same effect as taking out the non-carrier female XX, and that taking out the recessive female has the same effect as taking out the afflicted male. All this could be deduced by using elementary probability, but at a cost in time that the computer saves. Assuming of course that the equations are properly derived and that I have programmed properly!
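The table can also be reproduced without iterating, as a cross-check. The sketch below rests on an assumption that is not spelled out in the thread: since sons get their X from the mother and daughters get one from each parent, the weighted x fraction (2f + m)/3 is unchanged from generation to generation, where f and m are the x fractions among the females' and males' X chromosomes. The function name `limit_ratios` is hypothetical.

```python
from fractions import Fraction
from math import lcm

def limit_ratios(start):
    """Predicted limiting ratios xx:xX:XX:xY:XY for a start vector (xx, xX, XX, xY, XY)."""
    a, b, c, d, e = start
    f = Fraction(2 * a + b, 2 * (a + b + c))  # x fraction among the females' X chromosomes
    m = Fraction(d, d + e)                    # x fraction among the males' X chromosomes
    q = (2 * f + m) / 3                       # assumed conserved weighted x fraction
    props = [q * q / 2, q * (1 - q), (1 - q) ** 2 / 2, q / 2, (1 - q) / 2]
    scale = lcm(*(p.denominator for p in props))
    return [int(p * scale) for p in props]

starts = [
    (0, 1, 1, 1, 1),
    (1, 0, 1, 1, 1),
    (1, 1, 0, 1, 1),
    (1, 1, 1, 0, 1),
    (1, 1, 1, 1, 0),
]
results = [limit_ratios(s) for s in starts]
for s, r in zip(starts, results):
    print(s, "->", r)
```

Under that assumption the five rows come out as 1:4:4:3:6, 1:2:1:2:2, 4:4:1:6:3, 1:4:4:3:6 and 4:4:1:6:3, the same as the computed table.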

epenguin

Well I can see you are not frightened to formulate problems mathematically, which contrasts with many students, e.g. here - it is hard to get them to write an equation that corresponds to the question or to some of the information they are given. Also you make conjectures. All this is much better than passive learning. However I think you would do well to combine your computer skills with ordinary simple mathematical formulation. I don't know that I'd agree the computer saves time relative to a general formula, and the latter gives more insight. If you found that formula you would be able to extend it to other models, e.g. with selection, mutation etc.

This would also help you develop that more educated intuition which geneticists have. For instance, re your attractor question, you should converge to the same result from an infinite variety of initial distributions as long as the ratio of total x to total X is the same in all of them, however the x and X are initially distributed between males and females.

Anyway tell me what you want me to do, shall I give you the formula? further hint the method? or do you want to try it yourself? From next weekend I will have to be here less often.

Well, the easiest way is to construct the following three lists:

numerators of d:   1,  3,  5, 11,  21, ...

numerators of e:   3,  5, 11, 21,  43, ...

denominators:      8, 16, 32, 64, 128, ...

If d_1 is 1/8 and e_1 is 3/8, then inductively d_6 will be 43/256 and e_6 will be (128-43)/256.
------------------------------------

This is an easy iterative formula for d,e. We could make it look more elegant (barring error)---

num(d_n+1) = 2*num(d_n) + (-1)^(n+1) and den(d_n+1) = 8*2^n; num(d_1) = 1
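A short sketch of this iteration in Python, using exact fractions. The function name `afflicted_and_normal` is invented for illustration, and e_n is taken as 1/2 - d_n on the assumption that males are always half the population, which is how the (128-43)/256 figure arises.

```python
from fractions import Fraction

def afflicted_and_normal(n_max):
    """Return [(d_1, e_1), ..., (d_n_max, e_n_max)] as population fractions."""
    num, out = 1, []                          # num(d_1) = 1
    for n in range(1, n_max + 1):
        den = 8 * 2 ** (n - 1)                # denominators 8, 16, 32, ...
        d = Fraction(num, den)                # xY share of the population
        e = Fraction(1, 2) - d                # XY share (assumes males are half)
        out.append((d, e))
        num = 2 * num + (-1) ** (n + 1)       # numerator recurrence
    return out

pairs = afflicted_and_normal(6)
print(pairs[0])    # d_1 = 1/8, e_1 = 3/8
print(pairs[5])    # d_6 = 43/256, e_6 = 85/256
```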

I don't think it's as enlightening as the general formula. What was your formula?

epenguin

For the fraction of the X-type chromosomes in the males which are x, just

[1 - (-1/2)^n]/3

That obviously converges towards 1/3 in alternating fashion.
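A quick numerical check of that expression, reading the n as an exponent and counting F1 as n = 1. The function name `x_fraction_in_males` is just a label chosen here.

```python
from fractions import Fraction

def x_fraction_in_males(n):
    """Closed form [1 - (-1/2)^n] / 3 for the share of male X chromosomes that are x."""
    return (1 - Fraction(-1, 2) ** n) / 3

values = [x_fraction_in_males(n) for n in range(1, 6)]
print([float(v) for v in values])   # [0.5, 0.25, 0.375, 0.3125, 0.34375]

# Same quantity from the counts quoted in the thread: xY / all males
print(Fraction(1, 2), Fraction(2, 8), Fraction(48, 128))
```

The first three values agree with 1/2, 2/8 and 48/128 from the hand-computed generations, and the next two are the 0.3125 and 0.34375 mentioned for F4 and F5.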

Hope that helps - it is actually easier to work out answers when you know what they are.

1/3 of the males, you mean? And 1/6 of the population...? I noticed that you said the approach to 1/3 was for xY, but I get 1/3 for XY and 1/6 for xY, with xY as an "afflicted" male. I assumed that was because you were only looking at the males.

My recurrence formula gives both d and e. d is approaching 1/6. e is approaching 1/3. d is xY.

epenguin

Yes I said all along 1/3 of the males are affected at equilibrium. You can do the calculation in various ways - I just initially found males slightly easier to think about because you only have to think about the presence of one chromosome. When you have them the females are simple arithmetic because 1/3 of all the X-type chromosomes in the population are x in all generations. Thus the fraction of x in males alternates around and converges to the equilibrium fraction; the fraction of x in females also, but is above when males are below the ratio and vice versa.

Well that's my theory; can you confirm your computer calcs gave the same numbers as the formula?

I am not sure about your formula, quite what it means, but I called the fraction of X-type chromosomes that are x in males m_n and in females f_n, got a formula for m_n+1, worked through to get a recurrence relation for only the m's, and it is a standard example in all the books how you solve such recurrence relations.
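One way the derivation just described might go is sketched below. The individual steps are not given in the thread, so treat this as a reconstruction: it assumes a son's X always comes from his mother, a daughter gets one X from each parent, and F1 has m_1 = 1/2 and f_1 = 1/4.

```latex
\begin{align*}
m_{n+1} &= f_n, \qquad f_{n+1} = \tfrac12\,(f_n + m_n)
  && \text{(transmission rules)}\\
2f_{n+1} + m_{n+1} &= 2f_n + m_n
  && \text{(conserved; equals } 1 \text{ here, so the overall x fraction is } \tfrac13\text{)}\\
m_{n+2} &= \tfrac12\,(m_{n+1} + m_n)
  && \text{(eliminate } f\text{)}\\
2r^2 - r - 1 &= 0 \;\Rightarrow\; r = 1,\ -\tfrac12
  && \text{(characteristic equation)}\\
m_n &= A + B\left(-\tfrac12\right)^n, \quad m_1 = \tfrac12,\ m_2 = \tfrac14
  \;\Rightarrow\; A = \tfrac13,\ B = -\tfrac13\\
m_n &= \tfrac13\left[1 - \left(-\tfrac12\right)^n\right]
\end{align*}
```

The root -1/2 is what makes the approach to 1/3 alternate from side to side, with the gap halving each generation.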

Within the females you do at equilibrium have a Hardy-Weinberg distribution: 1/9 xx, 4/9 xX and coincidentally 4/9 XX if I am not mistaken.
