Statistical significance of a ML model...

  • Thread starter: fog37
  • Tags: Linear regression
SUMMARY

Determining the statistical significance of machine learning (ML) models, such as decision trees, support vector machines (SVM), and neural networks, involves using traditional statistical tests like t-tests and F-tests. While these tests are commonly applied to linear and logistic regression models, their application to ML models requires careful consideration of data partitioning to avoid overfitting. The emerging field of uncertainty quantification (UQ) is actively developing methods to assess the statistical significance of ML models. Key practices include setting aside a portion of the dataset for testing to ensure valid significance testing.

PREREQUISITES
  • Understanding of t-tests and F-tests in statistical analysis
  • Familiarity with machine learning models such as decision trees, SVM, and neural networks
  • Knowledge of data partitioning techniques for model validation
  • Basic concepts of uncertainty quantification (UQ)
NEXT STEPS
  • Research the application of t-tests and F-tests to machine learning models
  • Explore uncertainty quantification (UQ) methodologies in ML
  • Learn about data partitioning strategies for model validation
  • Investigate alternative statistical tests suitable for ML model significance
USEFUL FOR

Data scientists, machine learning practitioners, and statisticians interested in validating the performance and significance of predictive models.

fog37
TL;DR
Determining if a ML model is statistically significant...
Hello,

How do we check if a ML model is statistically significant? For models like linear regression, logistic regression, etc. there are tests (t-tests, F-tests, etc.) that will tell us if the model, trained on some dataset, is statistically significant or not.

But in the case of ML models, like decision trees, SVM, or neural nets, how do we determine if the model is statistically significant? I have not seen any specific test to do that...

Thank you!
 
There is a whole subfield on this called UQ (uncertainty quantification). It is an area of active development.
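To give a flavor of what quantifying uncertainty can look like in practice, here is a minimal sketch of one simple, widely used idea: a percentile bootstrap confidence interval for a model's accuracy on held-out data. This is only a small corner of UQ, not a summary of the field. The function name `bootstrap_accuracy_ci` and the 0/1 correctness data are hypothetical, made up for illustration.

```python
import random

def bootstrap_accuracy_ci(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy.

    `correct` is a list of 0/1 flags, one per held-out example
    (1 = the model predicted that example correctly).
    """
    rng = random.Random(seed)
    n = len(correct)
    # Resample the test set with replacement and recompute accuracy each time.
    accs = sorted(
        sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples)
    )
    lo = accs[int((alpha / 2) * n_resamples)]
    hi = accs[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical results: 150 of 200 held-out examples predicted correctly.
correct = [1] * 150 + [0] * 50
lo, hi = bootstrap_accuracy_ci(correct)
print(f"accuracy = {sum(correct) / len(correct):.3f}, "
      f"95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```

The interval says how much the reported accuracy could move if a different test sample of the same size had been drawn; it says nothing about how the model behaves on data unlike the test set.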
 
fog37 said:
TL;DR Summary: Determining if a ML model is statistically significant...

But in the case of ML models, like decision trees, SVM, or neural nets, how do we determine if the model is statistically significant? I have not seen any specific test to do that...
The t-test will work with any predictive model. You are supposed to set aside part of your data, keep it out of training, and use it only for testing later (predicting the data a model was trained on is cheating). For a yes/no model, you can score 1 for each correct prediction and 0 for each wrong one, then compare those scores against the scores from other ways of predicting the outcomes (or from random guessing).
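A rough sketch of that procedure, under some loud assumptions: the "model" here is a stand-in that is right about 80% of the time on made-up labels, the baseline is random guessing, and the p-value uses a normal approximation to the t distribution, which is only reasonable for a large held-out set. The helper `paired_t_test` is a hypothetical name, not a library function.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def paired_t_test(model_scores, baseline_scores):
    """Paired t statistic on per-example score differences.

    Returns (t, p). The two-sided p-value uses a normal approximation
    to the t distribution (assumption: the held-out set is large).
    """
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical held-out labels that the model never saw during training.
rng = random.Random(0)
y_test = [rng.randint(0, 1) for _ in range(500)]

# Stand-in for a trained model's predictions: correct about 80% of the time.
model_pred = [y if rng.random() < 0.8 else 1 - y for y in y_test]
# Baseline: random guessing.
baseline_pred = [rng.randint(0, 1) for _ in y_test]

# Score 1 for a correct prediction, 0 for a wrong one.
model_scores = [int(p == y) for p, y in zip(model_pred, y_test)]
baseline_scores = [int(p == y) for p, y in zip(baseline_pred, y_test)]

t_stat, p_value = paired_t_test(model_scores, baseline_scores)
print(f"model accuracy    = {mean(model_scores):.3f}")
print(f"baseline accuracy = {mean(baseline_scores):.3f}")
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.3g}")
```

The test is paired because both predictors are scored on the same held-out examples. With a real model, `model_pred` would come from predicting on the set-aside data, and the baseline could be any competing predictor rather than coin flips.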
 