Identifying the largest determining factor

  • Thread starter: SanDiegoMike
To determine the largest factor influencing coin value, a statistical analysis approach is recommended, focusing on the characteristics of year, color, size, shape, and weight. While scatter plots can help visualize relationships, the discussion emphasizes the need for a more mathematical methodology due to the large data sets involved. Categorical variables, such as shape, present challenges in analysis, suggesting the use of box-and-whisker plots for better insights. Data exploration is crucial, and visual tools should not be overlooked despite the preference for mathematical methods. The conversation highlights the importance of combining visual and statistical techniques for effective analysis.
SanDiegoMike

Hello,

This is not strictly a statistics forum, but I'm hoping you guys may have sufficient background to help me out. I'm an engineer by trade, so my stats background is poor, and I have not had much luck searching for the answer or asking colleagues.

I have a database similar to the following example, in which we list a collection of 'coins' of various values. These coins all have different characteristics, i.e. year, color, size, shape, and weight. I would like to determine to what degree each of those characteristics determines the coin's value, or at the very least, which of the characteristics is the most dominant. My searching has led me to analyses that require some form of functional relationship between, say, 'shape' and 'value' so that a correlation can be computed, but I don't know how I would convert shape (e.g. circle, square, octagonal) into a variable. My colleagues have suggested scatter plots to identify relationships, but my data sets are huge, and I would prefer something with a mathematical foundation.

If anyone could point me in the right direction with regard to an appropriate analysis methodology, that would be fantastic.

thanks,
-mike.
 


SanDiegoMike said:
My colleagues have suggested scatter plots to identify relationships, but my data sets are huge, and I would prefer something with a mathematical foundation.

Data exploration is arguably the most important step of any data analysis methodology so I wouldn't discount visual tools just yet. Maybe start with a scatter plot matrix for the continuous variables and box-and-whisker plots for the categorical variables.
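
If you want numbers to go with those plots, here is a minimal sketch of what a box-and-whisker plot summarises for a categorical variable such as shape: the five-number summary of value within each category. As an addition that was not suggested in this thread, it also computes the correlation ratio (eta squared), the fraction of the variance in value that is accounted for by differences between category means. The sample rows and every function name below are hypothetical and only for illustration; only the Python standard library is used.

```python
from collections import defaultdict
from statistics import mean, quantiles

# Hypothetical sample rows (shape, value); NOT data from the thread.
coins = [
    ("circle", 10.0), ("circle", 12.0), ("circle", 11.0), ("circle", 13.0),
    ("square", 20.0), ("square", 22.0), ("square", 21.0), ("square", 23.0),
    ("octagonal", 15.0), ("octagonal", 15.5), ("octagonal", 14.5), ("octagonal", 16.0),
]


def group_by_category(rows):
    """Collect the numeric values observed for each category label."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return dict(groups)


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max): the numbers a box plot draws."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def correlation_ratio(groups):
    """Eta squared: between-group sum of squares over total sum of squares.

    0 means the category means are all equal; 1 means the category
    fully determines the value (no spread within any category).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total if ss_total else 0.0


if __name__ == "__main__":
    groups = group_by_category(coins)
    for shape, values in groups.items():
        print(shape, five_number_summary(values))
    print("eta squared for shape:", correlation_ratio(groups))
```

Running the same calculation for each categorical characteristic gives one comparable number per characteristic, but treat it as a screening aid rather than a final answer: it looks at one variable at a time and ignores interactions between characteristics.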
 
