Statistics Linearity Question

In summary, the thread discusses how to measure the linearity of two experimental curves of stepper motor step size, recorded at two different speeds. The original poster is unsure which statistical method to use: some sources recommend calculating ##R^2##, while others recommend fitting a regression line. One reply suggests that ##R^2## is a correlation-type coefficient that measures how well the linear regression fits, so the two approaches go together. Another points out that ##R^2## alone cannot distinguish noisy but linear data from data that are genuinely nonlinear, and suggests fitting a line and then either plotting the residuals against the independent variable or making a histogram of the residuals to check that they look normally distributed. Ultimately, the best method depends on how rigorously the linearity needs to be demonstrated.
  • #1
roam

Homework Statement



I have two different experimental curves, and I would like to measure how closely a straight line fits each data set, and which curve is more crooked. In statistics, how can I measure this "linearity"?

By the way, this is about stepper motor step linearity (ideally the plot should be a straight line, i.e. uniform step sizes). I am comparing the two plots made for two different speeds:

[Attached image: 1582o3.jpg]

Homework Equations

The Attempt at a Solution



I'm new to stats and I'm not sure what method to use. I'm very confused because some websites say I have to calculate the ##R^2## value, while others say I need some kind of regression line.

So, if the linearity could somehow be determined from the equation of the regression line, what kind of regression do I need to use (linear, quadratic, cubic, etc.)? And how exactly do I determine linearity from that equation?

Any explanation is greatly appreciated.

P.S. I am using Matlab.
 
  • #2
roam said:
I'm very confused because some websites say I have to calculate the ##R^2## value, while others say I need some kind of regression line.

I am not sure of the nomenclature, but I assume R2 is just a correlation coefficient, which in this case is a measure of how good the linear regression is. Two sides of the same coin.
 
  • #3
Borek said:
I am not sure of the nomenclature, but I assume R2 is just a correlation coefficient, which in this case is a measure of how good the linear regression is. Two sides of the same coin.

Thank you for the clarification. A high ##R^2## is what I think I will need to show good linearity.
 
  • #4
##R^2## may be one way to do it, but remember that it is just a measure of how "far" your linear fit is from the data (the deviations are squared to get rid of negative numbers, then normalized so that a perfect fit gets a value of 1). You can get a lower ##R^2## from noisy data that are still linear, or from data that are not described well by a linear equation, and the number alone does not tell you which.

What I would do is fit a line, calculate the residuals, and then either plot the residuals against the independent variable to show that they are just noise (no pattern), or make a histogram/frequency plot of the residuals and show that they follow a Gaussian/normal type of distribution.

It all depends on how far you want to go to show the linearity of your data (sometimes just plotting your line and data on the same graph is enough).
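
A minimal sketch of that fit-then-inspect-residuals procedure, in plain Python (the same steps should map onto Matlab's polyfit and polyval). The data are made up for illustration, and counting runs of same-sign residuals is only one simple stand-in for eyeballing the residual plot; it is an assumption of this sketch, not something proposed in the thread.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit y ~ m*x + b; returns (m, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, y_bar - m * x_bar

def residuals(x, y, m, b):
    """Vertical distance of each point from the fitted line."""
    return [yi - (m * xi + b) for xi, yi in zip(x, y)]

def sign_runs(res):
    """Number of runs of consecutive residuals with the same sign.
    Few long runs suggest a systematic pattern (e.g. curvature);
    many short runs look more like plain noise."""
    signs = [r > 0 for r in res if r != 0]
    return 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)

# Hypothetical data: a noisy straight line and a smooth but bowed curve.
x = list(range(10))
noise = [0.1 if i % 2 == 0 else -0.1 for i in x]
y_noisy = [xi + e for xi, e in zip(x, noise)]
y_curved = [xi + 0.02 * xi ** 2 for xi in x]

res_noisy = residuals(x, y_noisy, *fit_line(x, y_noisy))
res_curved = residuals(x, y_curved, *fit_line(x, y_curved))

print("runs, noisy line :", sign_runs(res_noisy))
print("runs, curved data:", sign_runs(res_curved))
```

With these made-up numbers the noisy line gives residuals that change sign at every point, while the curved data give only three runs (positive, negative, positive), which is the "pattern" a residual plot would reveal.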
 
  • #5


In order to measure the linearity of your experimental curves, you can use the coefficient of determination (R^2). This value measures the proportion of the variation in the data that is explained by the regression line. In this case, a higher R^2 value would indicate a better fit to a straight line and therefore, a higher linearity.
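
For reference, the usual definition for a least-squares line, written here with symbols that are not in the thread (##\hat{y}_i = m x_i + b## for the fitted value and ##\bar{y}## for the mean of the measured values), is

$$R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}$$

The numerator is the sum of squared residuals and the denominator is the total spread of the data about its mean, so ##R^2 = 1## only when every point lies exactly on the line. For a straight-line fit with an intercept, this equals the square of the ordinary correlation coefficient between x and y.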

You can use linear regression to determine the equation of the regression line for each of your curves. The equation will have the form y = mx + b, where m is the slope of the line and b is the y-intercept. The slope (m) represents the change in y for every unit change in x; for your data it is the average step size. The slope is not itself a measure of linearity: a steep line and a shallow line can fit their data equally well. What indicates linearity is how small, and how free of pattern, the deviations of the points from the line are.

You can also plot each data set together with its own regression line and compare them visually. The curve whose points lie closer to its line, with no systematic bowing to one side, is the more linear of the two.

In Matlab, you can use the "fitlm" function to perform linear regression and obtain the R^2 value and the equation of the regression line. You can also plot the regression line on top of your data points to visually assess the linearity.
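
As a rough illustration of comparing two curves numerically, here is a sketch in plain Python rather than Matlab. The function name linearity_report, the position values, and the choice of extra figures (RMS and maximum deviation from the line) are all assumptions made for this example, not part of the thread.

```python
from math import sqrt
from statistics import mean

def linearity_report(x, y):
    """Fit y ~ m*x + b by least squares and summarize the fit quality."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b = y_bar - m * x_bar
    res = [yi - (m * xi + b) for xi, yi in zip(x, y)]
    ss_res = sum(r * r for r in res)               # squared residuals
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)    # spread about the mean
    return {
        "slope": m,
        "intercept": b,
        "r2": 1 - ss_res / ss_tot,
        "rms": sqrt(ss_res / len(x)),
        "max_dev": max(abs(r) for r in res),
    }

# Hypothetical positions (degrees) against step number for two speeds.
steps = list(range(10))
slow = [1.8 * s + (0.05 if s % 2 == 0 else -0.05) for s in steps]  # small jitter
fast = [1.8 * s + 0.05 * s ** 2 for s in steps]                    # bowed curve

slow_report = linearity_report(steps, slow)
fast_report = linearity_report(steps, fast)

for name, rep in (("slow", slow_report), ("fast", fast_report)):
    print(name, {k: round(v, 4) for k, v in rep.items()})
```

In this made-up case both curves come out with ##R^2## above 0.99, yet the bowed one deviates from its line by more than ten times as much as the jittery one, so it can be worth reporting the residual size alongside ##R^2##.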

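Ultimately, the ##R^2## value and the regression line are not competing methods, since ##R^2## is computed from the fitted line. How much further you go (residual plots, histograms of residuals) depends on your specific research question and how rigorously you need to demonstrate linearity.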
 

1. What is linearity in statistics?

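Linearity in statistics refers to the relationship between two variables being well described by a straight line: each unit change in one variable corresponds to a constant change in the other. That change can be an increase or a decrease, and the line need not pass through the origin, so a linear relationship is more general than direct proportionality.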

2. Why is linearity important in statistics?

Linearity is important in statistics because it allows us to make accurate predictions and interpretations about the relationship between variables. It also helps us to determine the strength and direction of the relationship between variables.

3. How do you test for linearity in statistics?

There are several ways to test for linearity in statistics, but the most common method is to create a scatter plot of the data and visually assess whether the points follow a linear pattern. Other methods include plotting the residuals from a straight-line fit, using correlation coefficients, or applying statistical tests such as a lack-of-fit F-test, or a chi-square goodness-of-fit test when the measurement uncertainties are known.
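
One way to turn the F-test idea into numbers is to ask whether adding a quadratic term improves on the straight-line fit. The sketch below, in plain Python with made-up data, computes that extra-sum-of-squares F statistic; the function names are hypothetical, and the data must contain some scatter, because an exactly quadratic data set leaves zero residual and the ratio is undefined.

```python
from statistics import mean

def line_residuals(x, y):
    """Residuals of y after a least-squares straight-line fit on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b = y_bar - m * x_bar
    return [yi - (m * xi + b) for xi, yi in zip(x, y)]

def curvature_f_statistic(x, y):
    """F statistic (1 and n-3 degrees of freedom) for adding an x**2 term.

    The x**2 column is first stripped of its own straight-line part, so
    the drop in residual sum of squares comes from one projection.
    """
    n = len(x)
    y_res = line_residuals(x, y)
    q_res = line_residuals(x, [xi * xi for xi in x])
    ss_line = sum(r * r for r in y_res)
    gain = sum(q * r for q, r in zip(q_res, y_res)) ** 2 / sum(q * q for q in q_res)
    ss_quad = ss_line - gain          # residual SS of the quadratic fit
    return gain / (ss_quad / (n - 3))

# Hypothetical data: identical jitter on a straight and on a bowed curve.
x = list(range(10))
jitter = [0.1 if i % 2 == 0 else -0.1 for i in x]
y_straight = [2 * xi + e for xi, e in zip(x, jitter)]
y_bowed = [2 * xi + 0.05 * xi ** 2 + e for xi, e in zip(x, jitter)]

f_straight = curvature_f_statistic(x, y_straight)
f_bowed = curvature_f_statistic(x, y_bowed)
print("F, straight:", round(f_straight, 3))
print("F, bowed   :", round(f_bowed, 3))
```

A large F means the quadratic term explains much more than the leftover scatter, which argues against linearity. To make it a formal test, the value would be compared against the critical value of the F distribution with 1 and n - 3 degrees of freedom.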

4. What are some common misconceptions about linearity in statistics?

One common misconception is that all relationships between variables must be linear. In reality, relationships can be nonlinear and still be statistically significant. Another misconception is that a linear relationship implies causation, when in fact a good linear fit only describes an association between the variables.

5. How can nonlinearity affect the results of statistical analysis?

Nonlinearity can lead to incorrect conclusions and inaccurate predictions. This is because models and assumptions made based on a linear relationship may not hold true for nonlinear data. It is important to properly identify and address nonlinearity in statistical analysis to ensure the validity of the results.
