Curl said:
Based on this experiment, what can I say about the red:blue ratio in the population?
Without some assumptions about the population, you can't say anything except for the trivial fact that the population contains at least 50 red and 75 blue cards, which establishes some bounds for the ratio.
Other posters have given you methods to produce some numbers. The methods work by assuming information about the population. The numbers they produce are often misinterpreted by laymen. I'll focus my post on the conceptual aspects.
Two divisions of statistics are "hypothesis testing" and "estimation".
Hypothesis Testing
Typical statistical "hypothesis" testing involves making a specific enough assumption about the population to compute the probability distribution for some statistic of the sample. For example, if the statistic is the ratio of red to blue cards in the sample, the assumption that the population has the same number of red cards as blue cards is specific enough to let you compute the probability distribution of this ratio in samples. The assumption that the ratio is *not* 1:1 in the population is not specific enough to let you compute the distribution of that statistic.
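To make that concrete, here is a minimal sketch of the distribution you get from the 1:1 assumption. It assumes the cards are drawn independently (a very large population, or sampling with replacement), so the number of red cards in the sample is binomial. The sample size of 125 comes from the 50 red and 75 blue cards mentioned above; the function name is just for illustration. The red count determines the red:blue ratio of the sample, so this is the distribution of the ratio in disguise.

```python
from math import comb

def red_count_distribution(n, p_red=0.5):
    """P(k red cards in n independent draws) for k = 0..n (binomial model)."""
    return [comb(n, k) * p_red**k * (1 - p_red)**(n - k) for k in range(n + 1)]

n = 125  # 50 red + 75 blue cards observed
dist = red_count_distribution(n)  # assumes a 1:1 population

# Probability of a sample with 50 or fewer red cards if the population were 1:1
p_at_most_50 = sum(dist[:51])
print(f"P(red <= 50 | 1:1 population) = {p_at_most_50:.4f}")
```

Note that nothing comparable can be written for "the ratio is not 1:1", because there is no single value of p_red to plug in.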
Hypothesis testing is a procedure. You make a sufficiently specific assumption (a "null hypothesis") to know the probability distribution of some statistic. You define an "acceptance region" for the statistic. If the statistic computed from the observed data falls within the "acceptance region" you "accept" the hypothesis. Otherwise you "reject" it. The quantitative behavior of the procedure is specified by the probability that the statistic would fall outside of the acceptance region if the null hypothesis were true (i.e. the probability that the procedure would make the wrong decision if the null hypothesis were true).
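As a rough sketch of the procedure, again assuming independent draws and a null hypothesis of a 1:1 population: the acceptance region of 52 to 73 red cards is a hypothetical choice of mine, picked to be symmetric about 62.5, and the function names are illustrative only.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of exactly k red cards in n independent draws."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def false_rejection_probability(n, low, high, p_null=0.5):
    """P(red count falls outside [low, high]) when the null hypothesis is true."""
    inside = sum(binom_pmf(n, k, p_null) for k in range(low, high + 1))
    return 1.0 - inside

def decide(red_count, low, high):
    """Apply the accept/reject rule to the observed statistic."""
    return "accept" if low <= red_count <= high else "reject"

n = 125
low, high = 52, 73  # hypothetical acceptance region for the red count
alpha = false_rejection_probability(n, low, high)
print(decide(50, low, high), f"(false rejection probability = {alpha:.3f})")
```

The number computed here describes the procedure, not the hypothesis: it is the chance of rejecting when the null hypothesis is true, nothing more.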
Hypothesis testing isn't a proof of something and it does not find the probability that the null hypothesis is true or the probability that it is false. Hypothesis testing is just a procedure that has been found to be empirically useful in many real life situations.
Estimation
Estimation refers to using some function of the sample data to estimate a parameter of the distribution of the population.
The technical definition of "confidence" refers to the scenario of "estimation", not to "hypothesis testing". The numerical calculations in computing confidence are often the same as those used in hypothesis testing, but the interpretation of the numbers is different.
An empirical version of "confidence" in estimation is illustrated by the following:
Imagine there is a lab, to which you send samples. The lab reports an estimate of some property of the sample (e.g. its mass). The report is given as an interval (e.g. 9.75 to 10.25 milligrams). If you have a way of doing more precise measurements on the same sample that determine its "true" mass, you can note whether the true mass is within the interval reported by the lab. By accumulating data on how often the lab was correct, you can quantify your "confidence" in the lab. If the interval reported by the lab contains the true mass of the sample 95% of the time, you can say that the lab gives a "95% confidence" interval for the true mass.
It's important to note that you cannot apply this "confidence" number to one particular lab report. For example, if the lab reports an interval of "100.5 to 102.0 grams", you cannot assert that there is a 0.95 probability that the true mass of that sample is in the interval 100.5 to 102.0 grams. To see why, suppose the lab uses different measuring instruments on small samples and large samples. One of their instruments might be more reliable than the other. The 0.95 probability is not based on analyzing the behavior of the lab in enough detail to account for such a situation. It is only based on data about how often the lab was correct or incorrect.
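A small arithmetic sketch of the two-instrument situation. All four numbers below are made up for illustration; they are chosen only so that the overall success rate works out to 95%.

```python
# Hypothetical breakdown of a lab whose intervals are correct 95% of the time overall.
share_small, share_large = 0.8, 0.2          # assumed fraction of reports per instrument
hit_rate_small, hit_rate_large = 0.98, 0.83  # assumed success rate of each instrument

# Overall success rate is the share-weighted average of the two instruments.
overall = share_small * hit_rate_small + share_large * hit_rate_large
print(f"overall hit rate: {overall:.2f}")
print(f"hit rate for large-sample reports only: {hit_rate_large:.2f}")
```

With these assumed numbers the lab is a "95% confidence" lab, yet a report on a large sample (such as one around 100 grams) comes from the instrument that is right only 83% of the time.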
A typical statistical version of "confidence" is analogous to the above example. You assume the population comes from a specific family of distributions (e.g. a binomial, or a gaussian). You pick a particular algorithm that computes an estimate of one of the parameters of the distribution in the form of an interval. You compute the probability that the algorithm produces an interval containing the true value of the parameter. This probability is the "confidence" associated with the estimate. (It is often possible to compute this probability only knowing the family of distributions that are involved. You don't need to assume specific numbers for the true values of the population's distribution parameters.)
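Here is a sketch of that calculation for one particular choice of family and algorithm, both of which are my assumptions rather than anything given in the thread: a binomial model for the number of red cards, and the textbook normal-approximation ("Wald") interval for the fraction of red cards. For this discrete model the nominal 95% is only approximate and the actual coverage depends somewhat on the true fraction, so the sketch evaluates it at a few hypothetical values.

```python
from math import comb, sqrt

def wald_interval(k, n, z=1.96):
    """Normal-approximation interval for the fraction of red cards."""
    p_hat = k / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def coverage(n, p_true, z=1.96):
    """Probability that the interval computed from a random sample contains p_true."""
    total = 0.0
    for k in range(n + 1):
        low, high = wald_interval(k, n, z)
        if low <= p_true <= high:
            # Add the probability of drawing exactly k red cards
            total += comb(n, k) * p_true**k * (1 - p_true)**(n - k)
    return total

for p in (0.3, 0.4, 0.5):  # hypothetical true fractions of red cards
    print(p, round(coverage(125, p), 4))
```

The probability is taken over the samples you might draw, before you draw one. It is a property of the algorithm, which is the point of the next paragraph.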
Just as in the empirical example, if you are using an algorithm that produces 95% confidence intervals, then you cannot claim that there is a 95% probability that the true value of the parameter is in one particular interval. For example, if you are using an algorithm that works with 95% confidence to estimate the ratio of red to blue cards and the algorithm produces the interval (0.47, 0.49) from your sample data, you can't claim that there is a 0.95 probability that the population ratio is in that interval.
Math problems involving the same formulas can be posed in various ways by changing which values are "given" and which are solved for. The common way to pose a "confidence interval" problem is to state the estimation algorithm and the desired confidence (e.g. 95%) as givens and to solve for the number of samples needed to produce intervals that give the estimate that level of confidence. That's the approach other posters have suggested.
Bayesian Statistics
The above methods are those of "frequentist" statistics, the type of statistics taught in most introductory courses. Essentially, frequentist statistics only tells you numbers that characterize the probability of the data given some assumption about the population distribution. It doesn't tell you the probability that some fact about the population is true given the observed data. (There is a difference in meaning between Pr(A|B) and Pr(B|A) and the two need not be numerically equal.)
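A tiny worked example of the Pr(A|B) versus Pr(B|A) distinction. The counts are hypothetical and the events A and B are abstract; nothing here is taken from the card data.

```python
from fractions import Fraction

# Hypothetical counts over 1000 equally likely outcomes, keyed by (A occurs, B occurs)
counts = {(True, True): 40, (True, False): 60,
          (False, True): 160, (False, False): 740}

def conditional(counts, target, given):
    """P(target | given), where target and given are 0 (event A) or 1 (event B)."""
    given_total = sum(c for key, c in counts.items() if key[given])
    both = sum(c for key, c in counts.items() if key[given] and key[target])
    return Fraction(both, given_total)

p_a_given_b = conditional(counts, target=0, given=1)  # 40 / 200
p_b_given_a = conditional(counts, target=1, given=0)  # 40 / 100
print(p_a_given_b, p_b_given_a)
```

The same 40 outcomes sit in the numerator both times; only the denominator changes, and that is enough to make the two numbers differ.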
If you want to solve for something like "The probability that the ratio of red to blue cards in the population is in the interval (0.47, 0.49) given the observed data" then you have to assume a scenario where there is something probabilistic about how the population ratio came into being. If you don't assume such a scenario, there isn't enough given information to solve for such a probability.
Bayesian Statistics involves making assumptions about how the population parameters were selected from some distribution, called the "prior distribution". If you want to compute the answer to the question in the previous paragraph, you'll have to use Bayesian statistics.
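A minimal sketch of such a calculation, under assumptions that are entirely mine: the fraction p of red cards in the population was drawn from a uniform prior on (0, 1), the cards in the sample are independent draws, and the integral is approximated with a simple midpoint rule. The interval on the ratio is also just an illustrative choice. A different prior would give a different answer.

```python
def posterior_probability(red, blue, low, high, steps=20_000):
    """P(low < p < high | data) under a uniform prior on p (midpoint rule)."""
    width = 1.0 / steps
    total = inside = 0.0
    for i in range(steps):
        p = (i + 0.5) * width
        weight = p**red * (1 - p)**blue  # binomial likelihood times flat prior
        total += weight
        if low < p < high:
            inside += weight
    return inside / total

def ratio_to_fraction(ratio):
    """Convert a red:blue ratio r into the fraction of red cards, r / (1 + r)."""
    return ratio / (1.0 + ratio)

# Hypothetical question: is the population's red:blue ratio between 0.5 and 0.8?
low, high = ratio_to_fraction(0.5), ratio_to_fraction(0.8)
prob = posterior_probability(50, 75, low, high)
print(f"P(0.5 < red:blue ratio < 0.8 | data) = {prob:.3f}")
```

Unlike a confidence level, this number is a statement about the population given the one sample you have, and it is only as good as the assumed prior.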