Statistics - linearity and best-fit in 3 dimensions


Discussion Overview

The discussion revolves around the analysis of a set of three-dimensional data points (X, Y, Z) to determine linearity, find a line of best fit, and assess the orientation of that line relative to the coordinate axes. The context includes statistical methods applicable to gesture detection, with a focus on the use of Pearson's correlation and linear regression techniques.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant suggests using Pearson's R to test for linearity, but others question how this applies to three-dimensional data, particularly regarding which variables to drop.
  • Another participant proposes transforming coordinates to fit a line in a plane and then applying Pearson's R in that context.
  • There is a distinction made between fitting a line versus fitting a plane to the data, with some participants emphasizing the need for clarity on this point.
  • One participant expresses that they are not trying to predict values but rather to determine if data points fall on a line for gesture detection.
  • There is a discussion about whether two linear regressions can be used to find the intersection of two planes, which would form a line.
  • Another participant mentions the concept of total least squares regression as a potentially more suitable method for minimizing distances from data points to a line.
  • Concerns are raised about the empirical nature of statistical methods and the importance of consulting existing literature or evaluators when applying these methods to real-world problems.
  • One participant clarifies that the gesture detection system discards the Z-plane data when determining gestures, focusing instead on the X and Y data.

Areas of Agreement / Disagreement

Participants express differing opinions on the appropriateness of using Pearson's R in this context, with no consensus reached on the best approach to analyze the data. The discussion remains unresolved regarding the most effective statistical methods to apply.

Contextual Notes

Participants highlight the ambiguity in applying Pearson's R to three-dimensional data without specifying the method of analysis. There are also unresolved questions about the definitions and assumptions underlying the statistical approaches discussed.

  • #31
Let's discuss an utterly simple method. Perhaps objections to it will clarify the problem further.

Let the data points be (x_i,y_i,z_i,t_i) for i = 1 to N. If this data represented a line perfectly parallel to the y-axis then the x and z values would remain constant while the y value varied.

If we have data from an imperfect line, we could estimate the line that the data is "trying to" follow in various ways. The simplest seems to be to take \hat{x} = the average of the x values in the data and \hat{z} = the average of the z values.

We can quantify the "error" that the imperfect data has in various ways. The one that comes to my mind first is

\hat{\sigma^2} = \frac{1}{N} \sum_{i=1}^N \left( (\hat{x} - x_i)^2 + (\hat{z} - z_i)^2 \right)

Of course the idea is to classify the data as a gesture parallel to the y-axis when \hat{\sigma^2} is "small". What constitutes a small or large error would have to be determined empirically. You'd have to do a different test for each axis, but this is not an elaborate computation.
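To show how light the computation is, here is a minimal Python sketch of the per-axis test. The function name `axis_parallel_error` and the sample points are made up for illustration, the t_i component is not used by the formula and is left out, and the threshold for "small" is still something you would have to pick empirically.

```python
from statistics import fmean

def axis_parallel_error(points, axis=1):
    """Mean squared distance of the points from the line parallel to `axis`
    that passes through the averages of the other two coordinates.

    points: sequence of (x, y, z) tuples; axis: 0 for x, 1 for y, 2 for z.
    """
    others = [k for k in range(3) if k != axis]
    # Estimated line position: the average of each off-axis coordinate.
    means = {k: fmean(p[k] for p in points) for k in others}
    # Average over the points of the squared off-axis deviations.
    return fmean(sum((means[k] - p[k]) ** 2 for k in others) for p in points)

# Hypothetical gesture samples: y varies, x wobbles slightly, z is constant.
samples = [(1.0, 0.0, 2.0), (1.1, 1.0, 2.0), (0.9, 2.0, 2.0), (1.0, 3.0, 2.0)]

# One test per axis; the smallest error suggests the closest axis.
errors = [axis_parallel_error(samples, axis=a) for a in range(3)]
best_axis = min(range(3), key=lambda a: errors[a])
```

With these samples the y-axis test gives by far the smallest error, so `best_axis` comes out as 1. Whether that error is small enough to call the motion a gesture is a separate, empirical decision.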

One weakness of this method is that it doesn't give any more credit to data that lies on a perfect straight line slightly out of parallel than to data that is scattered. Is this weakness the reason that you are considering sophisticated statistical methods?
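The weakness can be seen numerically. In the sketch below (the data sets and the name `y_parallel_error` are invented for illustration), one set is a perfectly straight line tilted slightly away from the y-axis, and the other uses exactly the same x values in a shuffled order, so it zigzags instead of forming a line. The error measure cannot tell them apart.

```python
from statistics import fmean

def y_parallel_error(points):
    """Mean squared distance from the y-parallel line through (x_hat, z_hat)."""
    x_hat = fmean(p[0] for p in points)
    z_hat = fmean(p[2] for p in points)
    return fmean((x_hat - x) ** 2 + (z_hat - z) ** 2 for x, _, z in points)

xs = [0.1 * k for k in range(11)]

# Perfectly straight line, slightly out of parallel with the y-axis (x = 0.1 * y).
tilted = [(xs[k], float(k), 0.0) for k in range(11)]

# Same x values assigned to the y positions in a shuffled order: not a line at all.
order = [5, 0, 10, 2, 8, 1, 9, 3, 7, 4, 6]
scattered = [(xs[k], float(y), 0.0) for y, k in enumerate(order)]

err_tilted = y_parallel_error(tilted)
err_scattered = y_parallel_error(scattered)
```

Both errors come out to about 0.1, because the measure only looks at the spread of x and z and ignores how they vary with y.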
 
