What is Statistics: Definition and 997 Discussions

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments. When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other. Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
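As a small illustration of the descriptive statistics mentioned above, the following sketch computes measures of central tendency (mean, median) and dispersion (standard deviation) with Python's standard `statistics` module. The data values and the `describe` helper are made up for illustration only.

```python
import statistics

# Hypothetical sample, chosen only to illustrate the calculations.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


def describe(data):
    """Return basic measures of central tendency and dispersion."""
    return {
        "mean": statistics.mean(data),              # central tendency
        "median": statistics.median(data),          # central tendency
        "sample_sd": statistics.stdev(data),        # dispersion, divides by n - 1
        "population_sd": statistics.pstdev(data),   # dispersion, divides by n
    }


summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The sample standard deviation divides by n - 1 and is the usual choice when the data are a sample from a larger population; the population version divides by n.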
A standard statistical procedure involves the collection of data leading to a test of the relationship between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (null hypothesis is falsely rejected giving a "false positive") and Type II errors (null hypothesis fails to be rejected and an actual relationship between populations is missed giving a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis. Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also occur. The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.
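The Type I and Type II errors described above can be estimated by simulation. The sketch below assumes a two-sided z-test with known standard deviation; the sample size, effect size, number of trials, and the helper names `z_test_rejects` and `rejection_rate` are all illustrative choices, not part of the text above.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mean == mu0, assuming sigma is known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=30, trials=2000, seed=0):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if z_test_rejects(sample, mu0, sigma):
            rejections += 1
    return rejections / trials


# H0 is true: every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_mu=0.0)

# H0 is false: every failure to reject is a false negative (Type II error).
power = rejection_rate(true_mu=0.5)
type_ii_rate = 1 - power

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

When the null hypothesis is true, the estimated Type I rate should land close to the chosen significance level of 0.05. The Type II rate depends on the assumed effect size and sample size, which is one reason obtaining a sufficient sample size matters.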

View More On Wikipedia.org
  1. Artemisa

    Error floors in this Bayesian analysis

    In this article (https://arxiv.org/pdf/2001.04581.pdf), the authors use a Bayesian analysis based on the positions of astrophysical bodies and their errors in the medians. This statistical analysis uses Markov chain Monte Carlo chains. The uncertainties in the positions are large, so what...
  2. hagopbul

    About meta trader platform

    TL;DR Summary: Asking about the meta trader platform and what mathematical theories I should read about. Hello: A claim recently caught my attention about the meta trader platform and how you can use it as a supplementary income source. What is this platform exactly? What should I read to be able to use...
  3. Graham87

    I Basic standard deviation calculation

    I don’t get how they got the equation for the standard deviation. Why do they only square with the time in the denominator? Thanks!
  4. H

    Good introductory book on statistical/data analysis?

    TL;DR Summary: I'm looking for a book on statistical/data analysis. Hey all. I've been doing statistical analysis in my research (such as using PCA and LDA), but I have never received a formal education on statistical analysis or data mining, and what I know about analysis is quite scattered...
  5. P

    Understanding the meaning of "expected fraction" (Statistics)

    The first part of the question asked me to calculate the mean and standard deviation for the number of remain votes in the simple binomial model consisting of total sample size of 2091 people. I believe this is fairly straightforward, it was simply ##E(X) = \mu = 2091(0.5) = 1045.5## votes and...
  6. S

    Probability of Hypokalemia w/ 1 or Multiple Measurements

    TL;DR Summary: Finding the probability with one measurement and multiple measurements on separate days. Question: Hypokalemia is diagnosed when blood potassium levels are low, below 3.5 mmol/L. Let’s assume we know a patient whose measured potassium levels vary daily according to N(µ = 3.8...
  7. A

    Break a Stick Example: Random Variables

    Hello, I would like to confirm my answers to the following random variables question. Would anyone be willing to provide feedback and see if I'm on the right track? Thank you in advance. My attempt:
  8. shahbaznihal

    A Computing the Fisher Matrix numerically

    Hi, I have been studying the Fisher matrix to apply in a project. I understand how to compute a Fisher matrix when you have a simple model, for example one which is linear in the model parameters (in that case the derivatives of the model with respect to the parameters are independent of the...
  9. P

    I Are Boltzmann's statistics compatible with a deterministic universe?

    Are Boltzmann's statistics compatible with a deterministic universe? Suppose that the gas molecules in a given container are perfectly elastic objects obeying Newton's laws. Suppose further that we select the initial conditions (impulse and position of each molecule) at random. Is it true that, if...
  10. A

    A How to derive the sampling distribution of some statistics

    Assume that ##T## has an Erlang distribution: $$\displaystyle f \left(t \, | \, k \right)=\frac{\lambda ^{k }~t ^{k -1}~e^{-\lambda ~t }}{\left(k -1\right)!}$$ and ##K## has a geometric distribution $$\displaystyle P \left( K=k \right) \, = \, \left( 1-p \right) ^{k-1}p$$ Then the compound...
  11. WMDhamnekar

    MHB Probability, Expected value, joint P.D.F. and order statistics

    I want to know how the author derived the red-underlined term in the example given below. Would any member of Math help board enlighten me in this regard? Any math help will be accepted.
  12. L

    [Statistics] Calculate the percentage

    My attempt: P(x>=90) = 85/90 = 17/18 Is my understanding of the equation correct? Thanks
  13. A

    I Modeling the concentration of gas constituents in a Force Field

    Say there is a gas made up of two kinds of gas molecules: Molecule A and Molecule B. Molecule A has a mass: ma and mole fraction: na. Molecule B has a mass: mb and mole fraction: nb. The gas is at thermal equilibrium and has a constant temperature (T) throughout. It is placed in a...
  14. D

    Ancillary statistics

    Ancillary statistics! You don't know what this means? I didn't know either, so I looked it up: http://utstat.toronto.edu/reid/research/A20n41.pdf As a non-native speaker, I didn't even know what "ancillary" means, so I had to look it up, too. The word has its root in Latin "ancilla" which is...
  15. A

    Calculus Advanced Calculus with Applications in Statistics

    Has anyone heard of this book written by Andre I. Khuri (Professor emeritus in science at the University of Florida)? From the table of contents, the book seems to cover a lot of things in calculus/multivariable calculus, and in a rigorous way according to the preface (they argue that...
  16. S

    I Bayesian statistics in science

    [Moderator's note: This thread has been split off from a previous thread since its topic is best addressed in a separate discussion. This post has been edited to focus on the topic for separate discussion.] Jaynes has used in the derivation of the rules of probability as the logic of plausible...
  17. chwala

    Solve the variance problem below - statistics

    The question is below. Below is my own working. The mark scheme for the question is below. I am looking for any other approach that may exist... I am now trying to refresh on stats... bingo!
  18. tixi

    Labwork Statistics help: Average of averages

    I have done the experiment, and have a lot of data. For each data point (we have five), we did ten repetitions, for which we need to do video analysis. The analysis works frame by frame and gives a velocity between each frame. So, to get the value of one repetition, we already need to calculate...
  19. Amitkumarr

    I Finding bias of the coin from noise corrupted signals

    Suppose there are two persons A and B such that both have a personal communication system which can transmit and receive bits. B has a biased coin whose bias is not known. A asks B to toss the coin 2000 times, send a 0 when a tail comes up and a 1 when a head comes up. It is known that whatever...
  20. V

    B Convince Covid-19 Vaccine Efficiency Through Statistics

    I have been trying to convince someone that it is wrong to compare the death percentages of two different populations (percentage of death of Covid-19 cases per category: vaccinated vs unvaccinated) in an uncontrolled setting (i.e. real-world data), and conclude that the Covid-19 vaccine does...
  21. ohwilleke

    I Why Do Physicists Use Gaussian Error Distributions?

    David C. Bailey. "Not Normal: the uncertainties of scientific measurements." Royal Society Open Science 4(1) 160600 (2017). How bad are the tails? According to Bailey in an interview, "The chance of large differences does not fall off exponentially as you'd expect in a normal bell curve," and...
  22. S

    Stock trading volume statistics

    Has the advent of computer trading greatly increased the size of statistics for trading volume? - or do those statistics (for individual stocks) somehow omit the flash trades done by computers? In the pre-computer days, there were people who had theories of stock trading based on both the...
  23. W

    A Using Statistics to Test for Normality of Pi

    Is there a "reasonable" way to test for the normality of ##\pi##, i.e., that every digit occurs with the same frequency? Someone suggested randomly sampling strings of size 20 and outputting the frequency. Then I guess we could average the frequencies among samples, use a chi-squared test...
  24. Falgun

    Prob/Stats Looking for a probability and statistics textbook

    I want to learn some probability & statistics on my own. I am well versed in Calc 1-3, elementary ODEs and very little linear algebra. I want a comprehensive, introductory textbook which is NOT COOKBOOK STYLE. I might be self studying AP statistics next term so if the book covers everything I...
  25. shahbaznihal

    A Galaxy statistics calculation in Saslaw's book

    I am trying to follow a calculation from the book of William C. Saslaw, The Distribution of the Galaxies: Gravitational Clustering in Cosmology. The calculation is shown on the pages following page 122 in chapter 14 where the author talks about the Correlation function. I am able to reproduce...
  26. chwala

    Discrete data vs continuous data in statistics

    I would like to seek your take on the two terms, discrete and continuous, in this context. In my understanding, when we look at the height of individuals (in cm), this measure in general or by definition implies continuous data. If we are to look at a specific math problem that involves the height of say...
  27. W

    I Bias in Linear Regression (x-intercept) vs Statistics

    Hi, in simple regression for machine learning, a model Y = mx + b is said, AFAIK, to have bias equal to b. Is there a relation between the use of bias here and the use of bias in terms of estimators for population parameters, i.e., the bias of an estimator P^ for a population parameter P is...
  28. L

    Statistics: Verifying a Probability Proof

  29. L

    How to Start a Problem I'm Struggling With

    I really don't know what to do for this problem. I looked at similar threads but couldn't seem to grasp the idea of it. I would like help on how to start.
  30. V

    MHB Creating An Awesome Statistics Course For Students

    Hi everybody, my name is Vaughny. I was once a statistics tutor, and I loved making statistics easier to understand for those who struggle learning the material. I've seen students in high school and college go through a painful experience of not having enough resources or just having horrible...
  31. S

    Question about hint given by the problem related to statistics

    I want to ask about the "problem-solving" box on the right. I don't understand why the class boundaries for the 16 - 25 group are 16 and 26. If I try to find it using my usual way, it will be 15.5 and 25.5 and the midpoint will be (15.5 + 25.5) / 2 = 20.5 Or if I use the hint "since age is...
  32. S

    MHB Statistics and probability

    The weight of goats at a farm is normally distributed with a mean of 60 kg and a standard deviation of 10 kg. A truck used to transport goats can accommodate no more than 650 kg. If 10 goats are selected at random from the population, what is the probability that the total weight exceeds...
  33. Athenian

    Finding the Relative Uncertainty for the Standard Error of the Mean

    While I will not be showing the graph here, I am trying to dissect what the question even means. While I do understand that relative uncertainty can be found via the equation ##\frac{\sigma_A}{A}##, I do not understand how I can find the "relative uncertainty of SEM". Does anybody here have any...
  34. CPW

    Encouraging fact from cancer statistics

    We study cancer to get better at killing it. And here is an encouraging detail: Since 1975, the cancer death rate in the United States has decreased by 21.9% with a 15% decrease from 2007 to 2017. (https://seer.cancer.gov/csr/1975_2017)
  35. Dale

    Insights Posterior Predictive Distributions in Bayesian Statistics

  36. C

    I Large Q^2 statistics at different colliders

    How tractable is it experimentally to measure deeply virtual Compton scattering in bins of large Q^2, where Q^2 is the virtuality of the incoming photon, at e.g. Jefferson Lab, which collides electrons and protons? I know at LHC, colliding proton-proton, such processes would instead be statistics...
  37. dx

    A Photon Statistics

    In a given mode with an average number of photons ##\bar{n}##, the photons are distributed around their average according to the formula $$p_n = e^{-\bar{n}} \frac{\bar{n}^n}{n!}$$ The justification of this formula in quantum field theory involves considering field operators acting on a...
  38. D

    Rotational partition function for CO2 molecule

    Hello fellow physicists, I need to calculate the rotational partition function for a CO2 molecule. I'm running into problems because I've found examples where they say this rotational partition function is: ##\zeta^r= \frac T {\sigma \theta_r} = \frac {2IkT} {\sigma \hbar^3}## Where...
  39. iVenky

    I What are the statistics of probability of dying today vs age?

    I don't intend to sound macabre, but I was having this thought if I have to quantify the probability of someone dying given his age (in days) how would I go about quantifying that with a minimal accuracy (ok if it's not accurate but I just need some number with days). Has anyone ever worked out...
  40. Dale

    Insights How to Get Started with Bayesian Statistics

  41. R

    What statistics services/platforms do you use that help you in life?

    I encountered a website with statistics for a large number of video games, specifically regarding their availability across various platforms, their sales over time and some other things, and methods to visualize them. I found this really helpful. Might there be other services like this that...
  42. E

    Online course for probability and statistics with emphasis on python

    I have been looking for a way to learn probability and statistics online and have searched but found nothing yet. I am looking for a course on probability and statistics that will not only teach me the basics but all there is to know about the subject. I would love it if the visualizations are...
  43. I

    B Different sample methods in statistics

    What is the difference between stratified and quota sampling of a population? For example, you can choose 200 males and 200 females from a state by quota sampling; or collect raw data first, stratify it, and then choose 200 males from one subsample and 200 females from the other subsample...
  44. G

    Confidence Interval Question help please

    Here is the question I'm struggling with (Q1): I just... I just don't understand what my first step is. What's my barx1 and barx2? (bar x = mean, x1 = subscript 1) My thoughts on approaching this question: barX1 - barX2 ~ N(u1-u2, sd1^2/n1 + sd2^2/n2) Find Z value when p = 0.975, z = + or...
  45. G

    B Statistics Help : Hypothesis Testing

    Answer: I understand why x (< or =) 2, but I do not understand why we use 16 instead of 17 for the second range, when P(X>=16) > 0.005 (which is the level of significance). Thank you for all the help given :)
  46. archaic

    Prob/Stats Introductory textbook for Probability and Statistics

    Hello! I'll be taking a probability and statistics course this semester. Does anyone know of any good textbook? I have access to an extensive catalogue of books on Springer, so it would be extremely preferable for me if you could recommend something from there. Thanks.
  47. T

    Exploring the Grand Partition Function for an Einstein Solid

    $$Q_{(\alpha, \beta)} = \sum_{N=0}^{\infty} e^{\alpha N} Z_{N}(\alpha, \beta) \hspace{1cm} (3.127)$$ Where ##Q## is the grand partition function, ##Z_N## is the canonical partition function and: $$\beta = \frac{1}{kT} \hspace{1cm} \alpha = \frac{\mu}{kT} \hspace{1cm} (3.128)$$ In the case of an...
  48. Schwann

    I 'Conservative' p-values adjusted

    Hello everyone! Could anybody recommend some strategy for p-value adjustment, as the distribution of my p-values indicates the presence of a large number of false negatives? Usually p-values are adjusted in order to overcome Type I errors (e.g., FDR or FWER estimation), but what I need to do is...
  49. SamRoss

    I Seeking better explanation of some quantum stats formulae

    In "Introduction to Quantum Mechanics", Griffiths derives the following formulae for counting the number of configurations for N particles. Distinguishable particles... $$ N!\prod_{n=1}^\infty \frac {d^{N_n}_n} {N_n !} $$ Fermions... $$ \prod_{n=1}^\infty \frac {d_n!} {N_n!(d_n-N_n)!}$$...
  50. michaelwright

    B Fun with (im)probabilities

    Hi folks - I need some help with a tricky probability. Here's the situation: Let's say there are 4M internet users in Age Group A. (The total set) Of those 4M, there are 1,000 users who play a specific sport. Those 1,000 are spread evenly over 125 teams, so 8 players each. 1. What's the...