Hello. I am having trouble getting a grasp on Hardy-Weinberg allele frequencies. I understand that p + q = 1, where p = freq. of the dominant allele and q = freq. of the recessive allele. I understand that the next part was derived by squaring the equation above: p^{2} + 2pq + q^{2} = 1. However, I do not understand how p^{2}, 2pq, and q^{2} correspond to genotype frequencies. I created a scenario to test the equations and to better my understanding, but I don't know where I messed up. I took a population of seven individuals and imagined we are examining eye color. For the sake of simplicity there are only two possible alleles for eye color, brown (B) and green (g). Brown is dominant over green. There must be a total of 14 alleles then (b/c there are 7 individuals), and I assigned brown to 6 alleles and green to 8 alleles. In this scenario, therefore, p = 6/14 and q = 8/14. Now, according to H-W, by squaring p, I should get the freq. of the homozygous dominant genotype: p^{2} = 0.1837. However, I could have the following genotypes, which would give me a different freq. of BB: Bg, gg, Bg, gg, Bg, BB, Bg. These combinations are consistent with the starting numbers, 6 Bs and 8 gs. In this case, only 1 out of 7 genotypes is BB, giving a freq. of 0.1429, not 0.1837. I know I have messed up in logic somewhere and am missing an important piece of understanding. Any pointers would be greatly appreciated.
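For reference, the arithmetic in the post above can be reproduced with a short Python sketch. It only restates the poster's numbers; the variable names are illustrative and not part of the original post.

```python
from collections import Counter
from fractions import Fraction

# The seven genotypes listed in the post (B = brown, g = green).
genotypes = ["Bg", "gg", "Bg", "gg", "Bg", "BB", "Bg"]

# Count individual alleles by concatenating all genotype strings.
alleles = Counter("".join(genotypes))
total = sum(alleles.values())
p = Fraction(alleles["B"], total)  # frequency of B
q = Fraction(alleles["g"], total)  # frequency of g

# Hardy-Weinberg expected genotype frequencies versus what this population has.
expected = {"BB": p * p, "Bg": 2 * p * q, "gg": q * q}
counts = Counter(genotypes)
observed = {g: Fraction(counts[g], len(genotypes)) for g in expected}

for g in ("BB", "Bg", "gg"):
    print(g, "expected", round(float(expected[g]), 4),
          "observed", round(float(observed[g]), 4))
```

The expected BB frequency comes out as 9/49 (about 0.1837) and the observed one as 1/7 (about 0.1429), which is exactly the mismatch the question is about.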
p^2, 2pq and q^2 are expectation values. If you take a large population (7 is not large... consider thousands or millions), it is very likely (but not guaranteed) that the actual numbers are close to those expectation values, assuming the two alleles of a person are independent.
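A rough way to see the "expectation value" point is to simulate it. The sketch below assumes each individual's two alleles are drawn independently with probability p of being B (the independence assumption mentioned above); the function name and the fixed seed are illustrative choices.

```python
import random

def simulate_bb_fraction(n_individuals, p, seed=0):
    """Return the observed fraction of BB individuals when each of the
    two alleles is independently B with probability p."""
    rng = random.Random(seed)
    bb = 0
    for _ in range(n_individuals):
        first_is_B = rng.random() < p
        second_is_B = rng.random() < p
        if first_is_B and second_is_B:
            bb += 1
    return bb / n_individuals

p = 6 / 14
print("expected p^2:", round(p * p, 4))
for n in (7, 700, 100_000):
    print(n, "individuals ->", round(simulate_bb_fraction(n, p), 4))
```

With only 7 individuals the observed fraction can land far from p^2; as the population grows it should settle close to it.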
Maybe you are confusing chromosomes and chromatids? Sister chromatids are normally genetically identical, so for the purposes of what H-W covers, a diploid organism's genetic constitution at a locus with two alleles is described by BB, Bg, or gg as the only possibilities.
Although the allele frequencies in this population are p = 6/14 and q = 8/14, you have hit upon an important point. For any given set of allele frequencies, you can have very different sets of genotype frequencies. For example, you could still have the population BB, BB, BB, gg, gg, gg, gg, where there are no heterozygotes. The frequency of BB = p^{2}, Bg = 2pq, and gg = q^{2} only in the case where the population is at Hardy-Weinberg equilibrium. Hardy-Weinberg equilibrium assumes that you have a large population, that breeding occurs randomly among the different genotypes, and that, for example, no genotype has a selective advantage over the others. Having a population that differs from that particular distribution of genotypes just means that the population is not at equilibrium. For instance, the population above with no heterozygotes could result if breeding does not occur randomly, such that brown eyed individuals breed only with brown eyed individuals and green eyed individuals breed only with green eyed individuals.
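To illustrate what "at equilibrium" means here, the following sketch takes the no-heterozygote population above and applies one round of random mating. It assumes an idealized infinite population, an autosomal locus, and no selection, so frequencies can be treated as exact probabilities; `next_generation` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def next_generation(freqs):
    """Genotype frequencies after one round of random mating
    (infinite population assumed, so frequencies act as probabilities)."""
    offspring = {"BB": Fraction(0), "Bg": Fraction(0), "gg": Fraction(0)}
    # Every ordered pair of parental genotypes, weighted by how often it occurs.
    for (mother, f_m), (father, f_f) in product(freqs.items(), repeat=2):
        # Each parent passes on either of its two alleles with probability 1/2.
        for a, b in product(mother, father):
            child = "".join(sorted(a + b))  # 'B' sorts before 'g', so "gB" becomes "Bg"
            offspring[child] += f_m * f_f * Fraction(1, 4)
    return offspring

# BB, BB, BB, gg, gg, gg, gg: same allele frequencies, no heterozygotes.
start = {"BB": Fraction(3, 7), "Bg": Fraction(0), "gg": Fraction(4, 7)}
after = next_generation(start)
print({g: str(f) for g, f in after.items()})
```

Under these assumptions the offspring generation has frequencies p^{2}, 2pq, and q^{2} with p = 3/7, even though the parental generation had no heterozygotes at all.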
I understand that the Hardy Weinberg equation is apt for ideal populations. But how does having a larger population make the situation more ideal? Even if I used 7 million individuals, there is still a chance that 8 million of 14 million alleles could be green alleles (g) and that 1 million of 7 million genotypes could be BB. Also, how does squaring an allele frequency produce a genotype frequency? What is the mathematics or statistics behind this?
Very elementary probability. A fraction p of the chromosomes carry a given allele, A, say. Each individual has two chromosomes: the probability that the first carries A is p, and the probability that the other carries A is also p. Taking two at a time, the fraction of pairs that are AA will be p^2. Etc.
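Spelling out the "Etc.", and assuming the two chromosomes are drawn independently, the three cases can be written as follows. The label a for the second allele (with frequency q = 1 - p) is introduced here only for illustration.

```latex
\begin{align*}
P(AA) &= p \cdot p = p^{2} \\
P(Aa) &= p \cdot q + q \cdot p = 2pq \\
P(aa) &= q \cdot q = q^{2} \\
P(AA) + P(Aa) + P(aa) &= (p + q)^{2} = 1
\end{align*}
```

The factor of 2 in the heterozygote term appears because there are two orders in which one A and one a can be drawn.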
The large population assumption is required for the Hardy-Weinberg equilibrium because very small populations are much more prone to allele frequencies changing over time due to random genetic drift.
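Drift can be seen in a toy simulation. The sketch below uses a simple Wright-Fisher-style resampling scheme, which is an assumption of this example rather than something stated in the thread: each generation, all 2N alleles are redrawn at random from the previous generation's allele frequency. The function name and seed are illustrative.

```python
import random

def drift(n_individuals, p0, generations, seed=0):
    """Track the frequency of allele B over time when each generation's
    2N alleles are sampled at random from the previous generation."""
    rng = random.Random(seed)
    n_alleles = 2 * n_individuals
    p = p0
    history = [p]
    for _ in range(generations):
        count_B = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = count_B / n_alleles
        history.append(p)
    return history

small = drift(7, 6 / 14, 20)
large = drift(7000, 6 / 14, 20)
print("N=7:   ", [round(x, 3) for x in small])
print("N=7000:", [round(x, 3) for x in large])
```

In runs like this the small population's frequency typically wanders widely (and may hit 0 or 1), while the large population stays near its starting value.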
Thanks Ygggdrasil, epenguin, and mfb. Okay, so now I understand why a large population gives more ideal results, but as far as the mathematics of genotype frequency goes... In order to arrive at the genotype frequency of BB, why do we not multiply 6/14 by 5/13? After all, the first chromosome will already be taken up by one of the 6 alleles, and this allele cannot be paired with itself b/c it is an individual allele (there is only one of it). Therefore, we have to subtract one from both the denominator and the numerator to account for this. At least that's the way I am seeing it. Like if you have 4 marbles (red, yellow, blue, green) and you can put each individual marble into one of two bins. If the ordering of marbles was not neglected, it would be a total of 4 x 3 = 12 possible combinations, not 4 x 4 = 16, b/c you can't have both a red & red or yellow & yellow, etc., right? Thanks in advance for any more help. :)
It does not matter for large populations. In addition, the exact number of alleles will vary with time anyway (as new animals are born and other animals die), so ±1 is completely negligible.
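The size of that ±1 effect can be checked directly. This sketch compares drawing the second allele with and without replacement, keeping the 6:8 ratio from the original example and scaling the population up; the function names are illustrative.

```python
from fractions import Fraction

def bb_with_replacement(n_B, n_total):
    """p * p: the second draw sees the same allele frequency as the first."""
    p = Fraction(n_B, n_total)
    return p * p

def bb_without_replacement(n_B, n_total):
    """(n_B / n) * ((n_B - 1) / (n - 1)): the first allele is removed from the pool."""
    return Fraction(n_B, n_total) * Fraction(n_B - 1, n_total - 1)

for scale in (1, 100, 1_000_000):
    n_B, n_total = 6 * scale, 14 * scale
    with_r = bb_with_replacement(n_B, n_total)
    without_r = bb_without_replacement(n_B, n_total)
    print(n_total, "alleles:", float(with_r), float(without_r),
          "difference", float(with_r - without_r))
```

The gap between the two works out to pq/(n - 1), where n is the total number of alleles, so it shrinks toward zero as the population grows.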
New and distinct issues concerning small populations do arise in population and evolutionary genetics, but you are not there yet with these problems. People keep reminding you about large populations while you are thinking of a small one. But actually what happens on average is the same for both; that is, what happens overall with frequencies in a large number of populations, each with a B gene frequency of 6/14, is the same as what happens in one large population with that frequency, as long as we don't introduce any new assumptions. Your calculation fails for another reason. You are thinking that you have picked the constitution of one chromosome and that, as the organism doesn't mate with itself, the population you have to choose the second chromosome from has a slightly different B frequency. But each chromosome, or each allele, comes from a parent of a different sex. In a population at equilibrium the B frequency will be the same in males as in females. The most likely way you can have your 6/14 B probability is 3/7 B in males and 3/7 B in females. You could think about what happens if the 6/14 arises from any different distribution than that, e.g. the males have all the B genes. Offhand I think you get to an equilibrium in two generations? Or one?
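The closing question can be explored with a small calculation. The sketch below assumes an autosomal locus, random mating, an infinite population, and equal numbers of males and females; the starting frequencies (all B alleles in males, none in females) are a hypothetical extreme case chosen so that the overall B frequency is still 3/7.

```python
from fractions import Fraction

def offspring_genotypes(p_male, p_female):
    """Offspring genotype frequencies when sperm carry B with probability
    p_male and eggs carry B with probability p_female."""
    return {
        "BB": p_male * p_female,
        "Bg": p_male * (1 - p_female) + (1 - p_male) * p_female,
        "gg": (1 - p_male) * (1 - p_female),
    }

def allele_freq_B(genotypes):
    """Frequency of B among the alleles carried by these genotypes."""
    return genotypes["BB"] + genotypes["Bg"] / 2

# Hypothetical start: males carry all the B alleles, females carry none.
p_male, p_female = Fraction(6, 7), Fraction(0)

gen1 = offspring_genotypes(p_male, p_female)
p1 = allele_freq_B(gen1)              # same in sons and daughters (autosomal locus)
gen2 = offspring_genotypes(p1, p1)

print("generation 1:", {g: str(f) for g, f in gen1.items()}, "p =", p1)
print("generation 2:", {g: str(f) for g, f in gen2.items()})
```

Under these assumptions the first generation already has the same allele frequency (3/7) in both sexes but is not yet in Hardy-Weinberg proportions (there are no BB offspring at all); the second generation is. So the answer depends on the start: two generations when the sexes begin with different allele frequencies, one when they begin equal.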
Thanks so much epenguin. It didn't make sense to me before because I had not known that a population at equilibrium should have an equal allele frequency for males and females. Now I totally understand why we multiply p by itself. Thanks again for clearing up the confusion.