PeterDonis said:
Again, "estimate ##\lambda##" might not be the right way to express what I was asking in the OP. I did not intend the OP to be interpreted narrowly, but broadly.
Perhaps a better way to broadly express the OP question would be: there is obviously a difference between the two couples, namely, that they used different processes in their child-bearing process. Given that the two data sets they produced are the same, are there any other differences that arise from the difference in their processes, and if so, what are they? (We are assuming, as I have said, that there are no other differences between the couples themselves--in particular, we are assuming that ##\lambda## is the same for both.)
So far I have only one difference that has been described: the p-values are different. Are there others? And what, if any, other implications does the difference in p-values have? Does it mean we should have different posterior beliefs about ##\lambda##?
This probably only makes sense if we allow a second parameter - for example, that some couples have a predisposition towards children of one sex. Otherwise, there is no reason to doubt the general case.
Unless we allow the second parameter, all we are doing is picking up unlikely events. We can calculate the probability of these events, but unless we allow the second parameter, that is all we can say.
My calculations show that the second family is less likely (more of an anomaly) than the first, but this has no effect on the overall average, assuming we have enough prior data, which we do.
What this data does call into question is the hypothesis that no couples have a predisposition towards one sex or the other in their children.
In other words, if a family has ten children, all girls, say, then I don't think this influences the overall mean for girls in general. In fact, even if you adjusted the mean to ##0.6## (which still leaves 10 girls in a row very unlikely), you've created the hypothesis that 60% of children should be girls, which is absurd. You can't shift the mean from ##0.5## (or whatever it is - I believe it's not quite that) on the basis of one family.
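To put rough numbers on the ten-girls example, here is a minimal sketch. It assumes births are independent with a fixed per-birth probability of a girl; the function name `prob_all_girls` is just an illustrative helper.

```python
def prob_all_girls(p_girl: float, n_children: int) -> float:
    """Probability that n independent births are all girls."""
    return p_girl ** n_children


# Ten girls in a row under the usual 50-50 assumption...
p_fair = prob_all_girls(0.5, 10)
# ...and under the (implausible) shifted mean of 0.6.
p_shifted = prob_all_girls(0.6, 10)

print(f"p = 0.5: {p_fair:.6f}")
print(f"p = 0.6: {p_shifted:.6f}")
```

Even with the mean shifted to ##0.6##, the run stays well under a 1% event, so shifting the population mean does little to "explain" the family.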
What it does is raise the question about a predisposition to girls in that family. In the extreme case of, say, 50 girls in a row, then
1) That does not affect the overall mean to any extent.
2) It implies that it is almost certain that the data itself could not have come from the assumed distribution. I.e. that family is not producing children on a 50-50 basis.
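Both points can be illustrated with a short sketch. The prior counts below (one million births, half of them girls) are made-up numbers chosen only to stand in for "overwhelming prior data"; they are not real statistics.

```python
# Hypothetical prior data: an assumption for illustration, not real figures.
prior_births = 1_000_000
prior_girls = 500_000

# One family with 50 girls in a row.
family_births = 50
family_girls = 50

# 1) Effect on the pooled mean: it barely moves.
prior_mean = prior_girls / prior_births
updated_mean = (prior_girls + family_girls) / (prior_births + family_births)
shift = updated_mean - prior_mean

# 2) Probability of the run if this family really were 50-50.
p_run = 0.5 ** family_births

print(f"pooled mean before: {prior_mean:.6f}")
print(f"pooled mean after:  {updated_mean:.6f} (shift {shift:.2e})")
print(f"P(50 girls in a row | 50-50): {p_run:.3e}")
```

The pooled mean shifts only in the fifth decimal place, while the probability of the run under 50-50 is of order ##10^{-15}##.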
In summary, to make this a meaningful problem I think you have to add another parameter. Then it reduces to the standard problem where you count the false positives (couples who do produce children 50-50, but who happen to have a lot of one sex) and count the true positives (couples who are genetically more likely to have one sex). Then, you can calculate ##p(A|B)## and ##p(B|A)## etc. (*)
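As a rough sketch of that two-parameter calculation: suppose a fraction `prior_predisposed` of couples are predisposed to girls and have girls with probability `p_girl_predisposed`, while everyone else is 50-50. Both numbers below are pure assumptions for illustration; nothing in this thread fixes them.

```python
def posterior_predisposed(
    n_girls_in_a_row: int,
    prior_predisposed: float,
    p_girl_predisposed: float,
    p_girl_fair: float = 0.5,
) -> float:
    """P(couple is predisposed | all n children are girls), via Bayes' rule."""
    # Likelihood of the run for each kind of couple: p(B|A).
    like_predisposed = p_girl_predisposed ** n_girls_in_a_row
    like_fair = p_girl_fair ** n_girls_in_a_row
    # Weight by how common each kind of couple is assumed to be.
    true_pos = prior_predisposed * like_predisposed
    false_pos = (1.0 - prior_predisposed) * like_fair
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs: 1% of couples predisposed, with P(girl) = 0.8.
prior = 0.01
posterior = posterior_predisposed(10, prior, 0.8)
print(f"P(predisposed | 10 girls) = {posterior:.3f}")
```

With these made-up inputs, 50-50 couples who just got lucky (the false positives) are about as numerous as genuinely predisposed ones, so ten girls in a row is suggestive but far from conclusive.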
As it stands, to clarify all my posts hitherto, all we can do is calculate how unlikely each of these families is under the hypothesis that in general ##\lambda = 0.5##. Nothing more. Confidence interval calculations cannot be done because of the assumed overwhelming prior data.
(*) PS although we still have to be aware of the sampling pitfalls.
PPS Maybe the Bayesians can do better.