What is Statistics: Definition and 998 Discussions
Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments. When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other. Inferences in mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
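As a minimal sketch of the two descriptive measures named above, the following Python snippet computes a mean (central tendency) and a standard deviation (dispersion) using only the standard library. The data values are made up for illustration.

```python
import statistics

# Hypothetical sample, made up for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(sample)              # central tendency (location)
sample_sd = statistics.stdev(sample)        # dispersion, n - 1 in the denominator
population_sd = statistics.pstdev(sample)   # dispersion, n in the denominator

print(mean, sample_sd, population_sd)
```

The two standard deviations differ only in the denominator: `stdev` treats the data as a sample from a larger population, while `pstdev` treats the data as the whole population.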
A standard statistical procedure involves the collection of data leading to a test of the relationship between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (null hypothesis is falsely rejected giving a "false positive") and Type II errors (null hypothesis fails to be rejected and an actual relationship between populations is missed giving a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis. Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also occur. The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.
Hi,
I have completed an experiment at university as part of my internship and now have several measurement results that I would like to analyze statistically and plot as a normal distribution in Mathematica. Is this even possible with Mathematica? Unfortunately, I haven't...
Hello.
I have a question about heredity.
Let's say the probability of inheriting schizophrenia is 6% if one parent is affected.
With a 6% probability per child, the expected number of affected children is 0.06 × 5 = 0.3 out of 5 (equivalently, 1.2 out of 20).
So is it better not to have kids in this case?
This is a homework question in my daughter’s maths class. When I did stats I always had examples where there were two variables: an example being swallow wingspan and sex. The statement that males have a larger wingspan would therefore be the H_1 and the H_0 would be that there is no impact of...
A study on strength properties of high-performance concrete obtained by using super-plasticizers and certain binders recorded the following data on flexural strength (in mega-pascals, MPa) from 28 tests:
6.1, 5.6, 7.1, 7.3, 6.6, 8.0, 6.8, 6.6, 7.6, 6.8, 6.7, 6.6, 6.8, 7.6, 9.3, 8.2, 8.7, 7.7...
I studied foundations of mathematics from mathematical and philosophical angles in grad school but then went on to a career of building and testing statistical risk models. The guiding philosophy there, which I call Boxian Skepticism, derives from a quote of George Box: "All models are wrong but...
Does anyone know of a bivariate smoothing spline package that lets you set your own loss function? All of the public domain software I've been able to find (e.g., SciPy) appears to minimize the sum of squared errors. For example, I'd like to set the spline coefficients to maximize the...
Looking at stats today,
In my working I have:
Let
##H_0: μ_1=μ_2##
vs.
##H_1: μ_1-μ_2≠ 0##
then,
##\bar x = \dfrac{134+83+...+123}{12}=120##
##\bar y = \dfrac{70+118+...+94}{7}=101##
##t=\dfrac{\bar x- \bar y}{S_p ⋅\sqrt {\dfrac{1}{n_1}+\dfrac{1}{n_2}}}##
##t=\dfrac{120-101}{21.21...
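The pooled two-sample statistic in the excerpt above can be sketched in Python as follows. This is only an illustration: the function name `pooled_t` and the data are hypothetical, and the thread's own samples are not reproduced because the excerpt elides them.

```python
import math
import statistics


def pooled_t(x, y):
    """Two-sample t statistic using a pooled standard deviation S_p."""
    n1, n2 = len(x), len(y)
    # The pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((n1 - 1) * statistics.variance(x)
           + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    sp = math.sqrt(sp2)
    t = (statistics.mean(x) - statistics.mean(y)) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # statistic and its degrees of freedom


# Hypothetical data, NOT the values from the thread.
x = [10.0, 12.0, 14.0]
y = [7.0, 9.0, 11.0]
t, df = pooled_t(x, y)
print(t, df)
```

The statistic is then compared with the critical value of Student's t distribution with `df` degrees of freedom. The pooled form assumes the two populations share a common variance.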
One possible end to the Universe is called vacuum decay, where a Higgs boson could transition from a false vacuum to a true vacuum state. This would create a vacuum decay bubble (known as bubble nucleation) that would expand at light speed, destroying everything in its path.
According to Anders...
A variation of the Liar's Paradox occurred to me: "Statistics are wrong 90% of the time". This statement seems to refute itself, but does so in a less straightforward way. I would appreciate any insights! And what about, "Statistics are wrong 50% of the time"? (Even odds.)
I know two programs that claim to be able to detect whether a text has been written by a machine or by a human.
A (ZeroGPT): https://www.zerogpt.com/
B (OpenAI): https://openai-openai-detector.hf.space/
Character Count: https://www.lettercount.com/
If you have time and examples, please test...
In a line of reasoning that involves measurement outcomes in quantum mechanics, such as spins, photons hitting a detection screen (with discrete positions, like in a CCD), atomic decays (like in a Geiger detector counting at discrete time intervals, etc.), I would like to define rigorously the...
I want to compare performance on written work under different conditions, for example with and without the use of AI, according to some specified criteria. Assume the written work is a critical analysis of specific content.
The written work will be scored on a number of dimensions, such as...
Mentor note: Thread moved from technical section to here, so is missing the homework template.
TL;DR Summary: The weight of DYL 3-blood hybrid pigs after correction of a farm is a random quantity with a normal distribution. Knowing that the probability of a pig weighing over 20 kg is 0.1587 and...
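A first step for this kind of problem can be sketched as below: a tail probability of 0.1587 above 20 kg fixes how many standard deviations 20 kg lies above the mean. The remainder of the problem statement is cut off in the excerpt, so the sketch stops at that step. The variable names are illustrative.

```python
from statistics import NormalDist

tail_probability = 0.1587  # P(weight > 20 kg), as stated in the excerpt

# z-score such that P(Z > z) = 0.1587 for a standard normal Z.
z = NormalDist().inv_cdf(1 - tail_probability)
print(round(z, 3))
```

Since `z` comes out very close to 1, the threshold of 20 kg sits about one standard deviation above the mean, i.e. 20 ≈ μ + σ. A second condition from the full problem would be needed to solve for μ and σ separately.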
Hello, I've been working with MCNP on and off for a few years now, but just recently realized that I don't entirely understand how tallies are actually calculated in MCNP, and what they signify.
Taking the example of the F2 tally, the user manual (Section 3.3.5.1) states that F2 is the "flux...
For concreteness I'll use atoms and photons but this problem is actually just about probabilities.
There's an atom A whose probability to emit a photon between times t and t+dt is given by a Gaussian probability distribution P_A centered around time T_A with variance V_A. There's a similar atom...
Post-grad, my background is in mathematical physics, probability/statistics, and information theory. I am here for discussion and collaboration on things I find interesting from time to time.
In this article (https://arxiv.org/pdf/2001.04581.pdf), the authors use a Bayesian analysis based on the positions of astrophysical bodies and their errors in the medians. This statistical analysis uses Markov chain Monte Carlo (MCMC) chains.
The uncertainties in the positions are large, so what...
TL;DR Summary: Asking about the MetaTrader platform and what mathematical theories I should read about
Hello:
A claim about the MetaTrader platform recently caught my attention: that you can use it as a supplementary source of income.
What is this platform exactly?
What should I read to be able to use...
TL;DR Summary: I'm looking for a book on statistical/data analysis.
Hey all. I've been doing statistical analysis in my research (such as using PCA and LDA), but I have never received a formal education on statistical analysis or data mining, and what I know about analysis is quite scattered...
The first part of the question asked me to calculate the mean and standard deviation for the number of remain votes in the simple binomial model with a total sample size of 2091 people. I believe this is fairly straightforward: it was simply ##E(X) = \mu = 2091(0.5) = 1045.5## votes and...
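The binomial mean and standard deviation referred to above follow from ##\mu = np## and ##\sigma = \sqrt{np(1-p)}##. A short Python check, using the n = 2091 and p = 0.5 stated in the excerpt:

```python
import math

n, p = 2091, 0.5  # sample size and success probability from the excerpt

mean = n * p                      # E(X) = n p
sd = math.sqrt(n * p * (1 - p))   # sqrt(Var(X)) = sqrt(n p (1 - p))
print(mean, round(sd, 2))
```

The variance is 2091 × 0.25 = 522.75, so the standard deviation is a little under 23 votes.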
TL;DR Summary: Finding the probability with one measurement and multiple measurements on separate days.
Question: Hypokalemia is diagnosed when blood potassium levels are low, below 3.5 mmol/L. Let’s assume we know a patient whose measured potassium levels vary daily according to N(µ = 3.8...
Hello, I would like to confirm my answers to the following random variables question. Would anyone be willing to provide feedback and see if I'm on the right track? Thank you in advance.
My attempt:
Hi,
I have been studying the Fisher matrix to apply in a project. I understand how to compute a Fisher matrix when you have a simple model, for example one that is linear in the model parameters (in that case the derivatives of the model with respect to the parameters are independent of the...
Are Boltzmann statistics compatible with a deterministic universe? Suppose that the gas molecules in a given container are perfectly elastic objects obeying Newton's laws. Suppose further that we select the initial conditions (momentum and position of each molecule) at random. Is it true that, if...
Assume that ##T## has an Erlang distribution:
$$\displaystyle f \left(t \, | \, k \right)=\frac{\lambda ^{k }~t ^{k -1}~e^{-\lambda ~t }}{\left(k -1\right)!}$$
and ##K## has a geometric distribution
$$\displaystyle P \left( K=k \right) \, = \, \left( 1-p \right) ^{k-1}p$$
Then the compound...
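Assuming "compound" here means the marginal density of ##T## obtained by summing the Erlang density over the geometric weights (the excerpt is cut off before saying so), the sum can be worked out directly from the two formulas above:

$$\displaystyle f(t) = \sum_{k=1}^{\infty} P(K=k)\, f(t \,|\, k) = \sum_{k=1}^{\infty} (1-p)^{k-1} p \, \frac{\lambda^{k}\, t^{k-1}\, e^{-\lambda t}}{(k-1)!} = p \lambda e^{-\lambda t} \sum_{k=1}^{\infty} \frac{\left[ (1-p) \lambda t \right]^{k-1}}{(k-1)!} = p \lambda e^{-\lambda t}\, e^{(1-p) \lambda t} = p \lambda\, e^{-p \lambda t}$$

Under that assumption, ##T## is exponentially distributed with rate ##p\lambda##.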
I want to know how the author derived the term underlined in red in the example given below.
Would any member of the Math help board enlighten me in this regard?
Any math help will be accepted.
Say there is a gas made up of two gas molecules: Molecule A and Molecule B.
Molecule A has a mass: ma and mole fraction: na.
Molecule B has a mass: mb and mole fraction: nb.
The gas is at thermal equilibrium and has a uniform temperature T throughout. It is placed in a...
Ancillary statistics! You don't know what this means? I didn't know either, so I looked it up:
http://utstat.toronto.edu/reid/research/A20n41.pdf
As a non-native speaker, I didn't even know what "ancillary" means, so I had to look it up, too. The word has its root in Latin "ancilla" which is...
Has anyone heard of this book written by Andre I. Khuri (Professor Emeritus in science at the University of Florida)?
Judging by the table of contents, the book seems to cover a lot of things in calculus/multivariable calculus, and in a rigorous way according to the preface (they argue that...
[Moderator's note: This thread has been split off from a previous thread since its topic is best addressed in a separate discussion. This post has been edited to focus on the topic for separate discussion.]
Jaynes has used in the derivation of the rules of probability as the logic of plausible...
The question is below:
Below is my own working:
The mark scheme for the question is below:
I am looking for any other approach that there may be... I am now trying to refresh on stats... bingo!
I have done the experiment, and have a lot of data. For each data point (we have five), we did ten repetitions, for which we need to do video analysis. The analysis works frame by frame and gives a velocity between each frame. So, to get the value of one repetition, we already need to calculate...
Suppose there are two persons A and B such that both have a personal communication system which can transmit and receive bits. B has a biased coin whose bias is not known. A asks B to toss the coin 2000 times, send a 0 when a tail comes up and a 1 when a head comes up. It is known that whatever...
I have been trying to convince someone that it is wrong to compare the death percentages of two different populations (percentage of death of Covid-19 cases per category: vaccinated vs unvaccinated) in an uncontrolled setting (i.e. real-world data), and conclude that the Covid-19 vaccine does...
David C. Bailey. "Not Normal: the uncertainties of scientific measurements." Royal Society Open Science 4(1), 160600 (2017).
How bad are the tails? According to Bailey in an interview, "The chance of large differences does not fall off exponentially as you'd expect in a normal bell curve," and...
Has the advent of computer trading greatly increased the size of statistics for trading volume? - or do those statistics (for individual stocks) somehow omit the flash trades done by computers?
In the pre-computer days, there were people who had theories of stock trading based on both the...
Is there a "reasonable" way to test for the normality of ##\pi##, i.e., that every digit occurs with the same frequency? Someone suggested randomly sampling strings of size 20 and outputting the frequency. Then I guess we could average the frequencies among samples, use a chi-squared test...
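One simple version of the digit-frequency check can be sketched in Python as follows. This is an illustration, not the approach proposed in the thread: it uses the first 1000 decimals rather than random strings of size 20, generates the digits with Machin's formula in integer arithmetic, and the function names are hypothetical. Equal single-digit frequencies are only a necessary condition for normality, which concerns digit blocks of every length.

```python
from collections import Counter


def arctan_inv(x, unity):
    """Return arctan(1/x) * unity via the alternating Taylor series, in integers."""
    power = unity // x          # unity / x^(2j+1), starting at j = 0
    total = power
    x2 = x * x
    n = 3
    sign = -1
    while power:
        power //= x2
        total += sign * (power // n)
        sign = -sign
        n += 2
    return total


def pi_decimals(n, guard=10):
    """First n decimal digits of pi (after the leading 3), as a string."""
    unity = 10 ** (n + guard)   # guard digits absorb integer truncation error
    # Machin's formula: pi = 4 * (4 * arctan(1/5) - arctan(1/239))
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return str(pi_scaled // 10 ** guard)[1:]


def chi_squared_uniform(digits):
    """Pearson chi-squared statistic against equal frequencies for digits 0-9."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts.get(str(d), 0) - expected) ** 2 / expected for d in range(10))


decimals = pi_decimals(1000)
chi2 = chi_squared_uniform(decimals)
print(chi2)  # compare with 16.92, the 5% critical value for 9 degrees of freedom
```

A statistic below the critical value means the sample gives no evidence against equal digit frequencies. It does not prove normality, which remains an open question for ##\pi##.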
I want to learn some probability & statistics on my own. I am well versed in Calc 1-3, elementary ODEs and very little linear algebra. I want a comprehensive, introductory textbook which is NOT COOKBOOK STYLE. I might be self studying AP statistics next term so if the book covers everything I...
I am trying to follow a calculation from the book of William C. Saslaw, The Distribution of the Galaxies: Gravitational Clustering in Cosmology. The calculation is shown on the pages following page 122 in chapter 14 where the author talks about the Correlation function.
I am able to reproduce...
I would like to seek your take on the two terms, discrete and continuous, in this context.
In my understanding, when we look at the height of individuals (in cm), this measure in general, or by definition, implies continuous data. If we are to look at a specific math problem that involves the height of say...
Hi,
In simple regression for machine learning, a model
Y = mx + b
is said, AFAIK, to have bias equal to b. Is there a relation between the use of bias here and the use of bias in terms of estimators for population parameters, i.e., the bias of an estimator P^ for a population parameter P is...
I really don't know what to do for this problem. I looked at similar threads but couldn't seem to grasp the idea of it. I would like help on how to start.
Hi everybody, my name is Vaughny. I was once a statistics tutor, and I loved making statistics easier to understand for those who struggle learning the material. I've seen students in high school and college go through a painful experience of not having enough resources or just having horrible...
I want to ask about the "problem-solving" box on the right.
I don't understand why the class boundaries for the 16 - 25 group are 16 and 26. If I try to find it using my usual way, it will be 15.5 and 25.5 and the midpoint will be (15.5 + 25.5) / 2 = 20.5
Or if I use the hint "since age is...
The weight of goats at a farm is normally distributed with a mean of 60 kg and a standard deviation of 10 kg. A truck used to transport goats can carry no more than 650 kg. If 10 goats are selected at random from the population, what is the probability that the total weight exceeds...
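A sketch of the usual approach, under two assumptions: the goats' weights are independent, and the threshold that is cut off in the excerpt is the truck's 650 kg capacity. The sum of 10 independent N(60, 10²) weights is then normal with mean 10 × 60 = 600 kg and standard deviation 10√10 kg.

```python
import math
from statistics import NormalDist

mu, sigma, n = 60.0, 10.0, 10   # per-goat mean, per-goat sd, number of goats
limit = 650.0                   # ASSUMPTION: threshold is the truck capacity

# Sum of n independent normals: the mean scales by n, the sd by sqrt(n).
total = NormalDist(mu=n * mu, sigma=sigma * math.sqrt(n))
p_exceed = 1 - total.cdf(limit)
print(round(p_exceed, 4))
```

With these numbers the threshold lies about 1.58 standard deviations above the mean total, so the probability is a little under 6%.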
While I will not be showing the graph here, I am trying to dissect what the question even means.
While I do understand that relative uncertainty can be found via the equation ##\frac{\sigma_A}{A}##, I do not understand how I can find the "relative uncertainty of SEM". Does anybody here have any...