Statistics Definition and 181 Discussions

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments. When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other. Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
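As a minimal sketch of the descriptive measures named above, the following Python snippet computes one measure of central tendency (the mean) and two versions of a measure of dispersion (the standard deviation). The data values are made up for illustration, and the variable names are assumptions, not part of the original text.

```python
import statistics

# Hypothetical sample of eight measurements (made-up values for illustration).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Central tendency: the arithmetic mean.
mean = statistics.mean(sample)

# Dispersion: standard deviation treating the data as a whole population
# (divides the sum of squared deviations by n).
population_sd = statistics.pstdev(sample)

# Dispersion: standard deviation treating the data as a sample
# (divides by n - 1 instead of n).
sample_sd = statistics.stdev(sample)

print(mean, population_sd, sample_sd)
```

The two standard deviations differ only in the divisor, which is why the sample version is always slightly larger for the same data.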
A standard statistical procedure involves the collection of data leading to a test of the relationship between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (null hypothesis is falsely rejected giving a "false positive") and Type II errors (null hypothesis fails to be rejected and an actual relationship between populations is missed giving a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis. Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also occur. The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.
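The null-hypothesis procedure described above can be sketched with a small exact test. This is only an illustration under stated assumptions: the null hypothesis is that a coin is fair, the data (15 heads in 20 tosses) are hypothetical, the function name `two_sided_binomial_p` is invented here, and the 0.05 threshold is a conventional choice for the accepted Type I error rate rather than anything prescribed by the text.

```python
from math import comb


def two_sided_binomial_p(k: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided p-value for k successes in n trials under H0: P(success) = p0."""
    # Probability of every possible outcome 0..n under the null hypothesis.
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    # Add up all outcomes that are no more likely than the observed one
    # (the small tolerance guards against floating-point ties).
    cutoff = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(p for p in pmf if p <= cutoff))


ALPHA = 0.05  # accepted Type I error rate (an assumed, conventional choice)

# Hypothetical data: 15 heads in 20 tosses of a coin assumed fair under H0.
p_value = two_sided_binomial_p(15, 20)
reject_null = p_value < ALPHA
print(f"p = {p_value:.4f}, reject H0: {reject_null}")
```

With these made-up numbers the p-value is about 0.041, so the null hypothesis is rejected at the 0.05 level. If the coin were in fact fair, that rejection would be a Type I error; failing to reject for a genuinely biased coin would be a Type II error.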

  1. A

    I Modeling the concentration of gas constituents in a Force Field

    Say there is a gas made up of two kinds of molecules: Molecule A and Molecule B. Molecule A has mass ma and mole fraction na. Molecule B has mass mb and mole fraction nb. The gas is at thermal equilibrium and has a constant temperature T throughout. It is placed in a...
  2. A

    Calculus Advanced Calculus with Applications in Statistics

    Has anyone already heard about this book written by Andre I. Khuri (Professor emeritus in science at the University of Florida)? Judging by the table of contents, the book seems to cover a lot of calculus and multivariable calculus, and in a rigorous way according to the preface (they argue that...
  3. tixi

    Labwork Statistics help: Average of averages

    I have done the experiment, and have a lot of data. For each data point (we have five), we did ten repetitions, for which we need to do video analysis. The analysis works frame by frame and gives a velocity between each frame. So, to get the value of one repetition, we already need to calculate...
  4. Amitkumarr

    I Finding bias of the coin from noise corrupted signals

    Suppose there are two persons A and B such that both have a personal communication system which can transmit and receive bits. B has a biased coin whose bias is not known. A asks B to toss the coin 2000 times, send a 0 when a tail comes up and a 1 when a head comes up. It is known that whatever...
  5. V

    B Convince Covid-19 Vaccine Efficiency Through Statistics

    I have been trying to convince someone that it is wrong to compare the death percentages of two different populations (percentage of death of covid-19 cases per category: vaccinated vs unvaccinated) in an uncontrolled setting (i.e. real-world data), and conclude that the covid-19 vaccine does...
  6. ohwilleke

    I Why Do Physicists Use Gaussian Error Distributions?

    David C. Bailey. "Not Normal: the uncertainties of scientific measurements." Royal Society Open Science 4(1), 160600 (2017). How bad are the tails? According to Bailey in an interview, "The chance of large differences does not fall off exponentially as you'd expect in a normal bell curve," and...
  7. Falgun

    Prob/Stats Looking for a probability and statistics textbook

    I want to learn some probability & statistics on my own. I am well versed in Calc 1-3, elementary ODEs and very little linear algebra. I want a comprehensive, introductory textbook which is NOT COOKBOOK STYLE. I might be self studying AP statistics next term so if the book covers everything I...
  8. L

    Statistics: Verifying a Probability Proof

  9. L

    Statistics: Prove following theorem by expressing all the binomial coefficients in terms of factorials

    I really don't know what to do for this problem. I looked at similar threads but couldn't seem to grasp the idea of it. I would like help on how to start.
  10. Athenian

    Finding the Relative Uncertainty for the Standard Error of the Mean

    While I will not be showing the graph here, I am trying to dissect what the question even means. While I do understand that relative uncertainty can be found via the equation ##\frac{\sigma_A}{A}##, I do not understand how I can find the "relative uncertainty of SEM". Does anybody here have any...
  11. D

    Rotational partition function for CO2 molecule

    Hello fellow physicists, I need to calculate the rotational partition function for a CO2 molecule. I'm running into problems because I've found examples where they say this rotational partition function is: ##\zeta^r= \frac T {\sigma \theta_r} = \frac {2IkT} {\sigma \hbar^3}## Where...
  12. G

    Confidence Interval Question help please

    Here is the question I'm struggling with (Q1): I just... I just don't understand what my first step is. What's my barx1 and barx2? (bar x = mean, x1 = subscript 1) My thoughts on approaching this question: barX1 - barX2 ~ N(u1-u2, sd1^2/n1 + sd2^2/n2) Find Z value when p = 0.975, z = + or...
  13. T

    Calculate the average number of oscillators of an Einstein solid, in the grand canonical ensemble, when q>>N

    $$Q_{(\alpha, \beta)} = \sum_{N=0}^{\infty} e^{\alpha N} Z_{N}(\alpha, \beta) \hspace{1cm} (3.127)$$ Where ##Q## is the grand partition function, ##Z_N## is the canonical partition function and: $$\beta = \frac{1}{kT} \hspace{1cm} \alpha = \frac{\mu}{kT} \hspace{1cm} (3.128)$$ In the case of an...
  14. Schwann

    I 'Conservative' p-values adjusted

    Hello everyone! Could anybody recommend a strategy for p-value adjustment, as the distribution of my p-values indicates the presence of a large number of false negatives? Usually p-values are adjusted in order to control Type I errors (e.g. FDR or FWER estimation), but what I need to do is...
  15. SamRoss

    I Seeking better explanation of some quantum stats formulae

    In "Introduction to Quantum Mechanics", Griffiths derives the following formulae for counting the number of configurations for N particles. Distinguishable particles... $$ N!\prod_{n=1}^\infty \frac {d^{N_n}_n} {N_n !} $$ Fermions... $$ \prod_{n=1}^\infty \frac {d_n!} {N_n!(d_n-N_n)!}$$...
  16. michaelwright

    B Fun with (im)probabilities

    Hi folks - I need some help with a tricky probability. Here's the situation: Let's say there are 4M internet users in Age Group A. (The total set) Of those 4M, there are 1,000 users who play a specific sport. Those 1,000 are spread evenly over 125 teams, so 8 players each. 1. What's the...
  17. The Parker Machine: it's 80% accurate.

    The Parker Machine: it's 80% accurate.

    Check out the full lecture on the Royal Institution YouTube channel.
  18. Cesca Roma

    I Discriminant function analysis - stepwise or otherwise?

    I’m using discriminant function analysis to determine the potential accuracy of several biometric measurements being used in conjunction for binary classification purposes for my BSc Biomed research project. Overall I've only got 110 data points so it's a stretch but hey, that's anatomy! What...
  19. BlueKaiza

    Interpreting pie and bar charts

    Working 1. a (i) 1/4=25% (ii)120/360×100=33.3% (b) newspaper=£9000 Leaf=£6500 Transport=£19000 55000-34500=£20500 Tv=16.7% 0.167×55000=£9185 ( answer) (c) Tv=16.7% News=25% Leaf=15.3% Other=9.72% Trans=33.3%(answer) Note: they were rounded to (3sf)
  20. The Datasaurus Dozen

    The Datasaurus Dozen

    Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing
  21. JorgeM

    I Is this weighted mean and standard deviation correct?

    The expression I have found is this one. I have been looking for information because I could not work out what value "alpha" is supposed to have. If any of you know what this alpha value is supposed to represent or if you have seen it before I would be really...
  22. parazit

    I Comparing theoretical calculations with experimental data

    Dear users, The situation I have encountered is a simple statistical comparison of the experimental data, which are accepted as correct, with the results obtained via six theoretical models. In the experimental data, there exist y values corresponding to x values and also the measurement errors of...
  23. Biochemgirl2002

    How do I answer this permutation question?

    Question: A home security device with 10 buttons is disarmed when three different buttons are pushed in the proper sequence. (No button can be pushed twice.) If the correct code is forgotten, what is the probability of disarming this device? My attempt: 10!/(10-3)! =(...
  24. P-values Broke Scientific Statistics—Can We Fix Them?

    P-values Broke Scientific Statistics—Can We Fix Them?

  25. B

    I Finding CDF given boundary conditions (simple stats and calc)

    I'm not quite sure if my problem is considered a calculus problem or a statistics problem, but I believe it to be a statistics related problem. Below is a screenshot of what I'm dealing with. For a) I expressed f(t) in terms of parameters p and u, and I got: $$f(t)=\frac{-u \cdot a + u \cdot...
  26. mPlummers

    Relative error and measurement precision

    NOTE: this is a programming exercise (Python). I started adding to ##x_{true}## an error related to a (for example) 10% relative error, obtaining ##x_{measurement}##. Then I computed ##y_{measurement}##. To find the precision, I calculated ##(y_{true}-y_{measurement})/y_{measurement}##. If it is...
  27. olgerm

    I Standard deviation question -- population std vs sample std

    I know that the standard deviation of any data set is defined as the square root of the mean squared difference from the mean value: ##\sigma(data)=\sqrt{\frac{\sum_{x \in data}(x-x_{mean\ of\ data})^2}{|data|}}=\sqrt{\frac{\sum_{x \in data}(x-\sum_{y \in data}(y)/|data|)^2}{|data|}}## but sometimes formula...
  28. user366312

    Finding conditional and joint probabilities from a table of data

    Let, alpha <- c(1, 1) / 2 mat <- matrix(c(1 / 2, 0, 1 / 2, 1), nrow = 2, ncol = 2) chainSim <- function(alpha, mat, n) { out <- numeric(n) out[1] <- sample(1:2, 1, prob = alpha) for(i in 2:n) out[i] <- sample(1:2, 1, prob = mat[out[i - 1], ])...
  29. Badgun

    Impact of several variables on resulting projectile motion trajectory

    I was told to generate these variables (m, C, alpha, wind velocity) normally distributed and compare the random data with the result and then tell, which of the variables has the most impact. Here I am stuck, tried to compare variances, kurtosis and skewness of the data (the original variables...
  30. caters

    Are my calculations of the pregnancy ratio of the population correct?

    Monthly Cycle numbers Here is the cycle ratio: $$2_{early}:2_{fertile}:1_{late}$$ And the numbers: $$20,000_{early}:20,000_{fertile}:10,000_{late}$$ Now, let's divide the early into 2 groups, pre-fertile, and safe and assume there is a 50/50 split between those 2 groups. Let's also assume...
  31. J

    Chi-squared test for normality

    Homework Statement Hello, I was given 2 sets of data, showing 20 temperature values and 35 temperature values respectively. The data sets look like below: Data 1 Data 2 Temperature Temperature 30.9...
  32. iVenky

    Power of noise after passing through a system h(t)

    **Reposting this again, as I was asked to post this on a homework forum** 1. Homework Statement Hi, I am trying to solve this math equation (that I found in a paper) on finding the variance of a noise after passing through an LTI system whose impulse response is h(t). X(t) is the input noise...
  33. J

    Math/Statistics PhD Application

    I know this is probably the wrong site to post this but... Hello, I am a student at a low-ranked college in New York State actively pursuing a bachelor's (BA) in Math in my junior year. I have a 3.7 GPA overall and a 3.73 in Math. I am looking to apply to PhD programs next year in Statistics or in...
  34. iVenky

    I White noise & 1/f noise after a system h(t)

    Hi, I am trying to solve this math equation on finding the variance of a noise after passing through a system whose impulse response is h(t). X is the input noise of the system and Y is the output noise after system h(t). If, let's say, the variance of noise Y is ##\sigma_y^2=\iint R_{xx}(u,v)h(u)h(v)\,du\,dv## where...
  35. J

    B How to Present Statistical Data

    <Moderator's note: Moved from a homework forum.> Mass (g) +/- 0.01 grams Drop height (centimeters) +/- 3.00 Shell 53.47 45 No crack 56.78 45 Cracked...
  36. CivilSigma

    Characteristic Function Integrand Evaluation

    Homework Statement I am trying to determine the characteristic function of the function: $$ f(x)= ae^{-ax}$$ $$\therefore E(e^{itx}) =\int_0^\infty e^{itx}ae^{-ax} dx = a \cdot \left.\frac{e^{(it-a)x}}{it-a}\right|_0 ^ \infty $$ But I am not sure how to evaluate the integral. Wolfram alpha suggests this...
  37. JackLee

    How to read a joint discrete table?

    Homework Statement Given a group of 100 married couples, let X1 be the number of sons and X2 the number of daughters the couple has. P(X1 = 0, X2 = 2) = f(0, 2) = 8 /100 = 0.08 2. Homework Equations The Attempt at a Solution I tried to look for a similar example online, I found this...
  38. T

    Showing Rejection Region Equality with Fisher Distribution

    Homework Statement For reference: Book: Mathematical Statistics with Applications, 7th Ed., by Wackerly, Mendenhall, and Scheaffer. Problem: 10.81 From two normal populations with respective variances ##\sigma_1^2## and ##\sigma_2^2##, we observe independent sample variances ##S_1^2## and...
  39. R

    I Principal component analysis (PCA) coefficients

    I am trying to use PCA to classify various spectra. I measured several samples to get an estimate of the population standard deviation (here I've shown only 7 measurements): I combined all these data into a matrix where each measurement corresponded to a column. I then used the pca(...)...
  40. Dewgale

    Anti-commutation of Dirac Spinor and Gamma-5

    Homework Statement Given an interaction Lagrangian $$ \mathcal{L}_{int} = \lambda \phi \bar{\psi} \gamma^5 \psi,$$ where ##\psi## are Dirac spinors, and ##\phi## is a bosonic pseudoscalar, I've been asked to find the second order scattering amplitude for ##\psi\psi \to \psi\psi## scattering...
  41. Mathman2013

    Median vs. Second Quartile question

    Homework Statement Let's say I have a list of numbers. income=[17000, 11000, 23000, 19999, 21000, 10000] I sort them income_sorted=[10000, 11000, 17000, 19999, 21000, 23000] Calculate med 2nd Quartile. Homework Equations Median_formula = (n+1)/2 The Attempt at a Solution The second...
  42. T

    Conditional Expectations of 2 Variables

    Homework Statement Suppose that the number of eggs laid by a certain insect has a Poisson distribution with mean ##\lambda##. The probability that any one egg hatches is ##p##. Assume that the eggs hatch independently of one another. Find the expected value of ##Y##, the total number of eggs...
  43. T

    Expected bounds of a continuous bi-variate distribution

    Homework Statement ##-1\leq\alpha\leq 1## ##f(y_1,y_2)=[1-\alpha\{(1-2e^{-y_1})(1-2e^{-y_2})\}]e^{-y_1-y_2}, 0\leq y_1, 0\leq y_2## and ##0## otherwise. Find ##V(Y_1-Y_2)##. Within what limits would you expect ##Y_1-Y_2## to fall? Homework Equations N/A The Attempt at a Solution...
  44. E

    Calculations using Standard Deviation and Mean

    Homework Statement Homework Equations Chebyshev's Theorem: The percentage of observations that are within k standard deviations of the mean is at least 100(1 - 1/k^2)%. Chebyshev's Theorem is applicable to ANY data set, whether skewed or symmetrical. Empirical Rule: For a symmetrical...
  45. T

    Other Selecting a career with my BS in Applied Mathematics, BS in Physics and minor in Computer Science

    This may be better suited in the academic forum, or possibly not even the normal type of question asked, but I was just judging based on other similar posts. I just graduated from college this past spring with a BS in Applied Mathematics and a BS in Physics, as well as a minor in computer...
  46. S

    Fortran Getting my method of bisection to output the correct value

    I am trying to write a program that calculates the root of chi-square. I am not getting the correct answer and I honestly am at my wits' end trying to figure it out. I know my simp_p() method is returning the correct value, but for some reason my root_chisq() method is not giving me the correct...
  47. G

    Random walk - why is the STD equal to sqrt(n)

    Typical random walk: one step forward or backward with equal probability, and each step is independent. What are the expectation and variance? So I define an indicator variable xi = {1 or -1 with equal probability}. E(xi) = 0, Var(xi) = 1. Now define Sn as the sum of i=1,...,n each step is...
  48. maajdl

    I Any database of results for the Michelson-Morley Experiment

    Hello, I would be interested in a collection of experimental data for the Michelson-Morley Experiment. I would like to see whether much data is available, and whether a statistical analysis could be of some fun. Would you know of some compilation of data? Thanks, Michel
  49. P

    Probability density function

    Homework Statement Hello! I'm trying to understand how to solve the following type of problems. 1) Random variables x and y are independent and uniformly distributed on the interval [0; a]. Find probability density function of a random variable z=x-y. 2) Exponentially distributed (p=exp(-x)...
  50. H

    [Poisson Stats] Error on half-life for radioactive decay

    Hi there, not sure whether this is in the right section but: I've made two runs of a radioactive decay experiment where I've got log(N) vs. time plots. From these I've got the decay constants and hence the half-lives. I've averaged these two half-lives ( = 160 secs) and now I'm trying to work...