- #1

fog37

- TL;DR Summary
- Forecasting vs Inference

Hello,

Many people in machine learning/data science seem to care primarily about prediction rather than inference, while many in the social sciences care mainly about inference and much less about forecasting.

In the case of inference, we consider a population that we want to study and learn about. We collect a random sample, try to estimate the underlying parameters that describe the population, and look for causal effects between the variables of interest. Social scientists focus on this kind of inference and are usually not worried about the forecasting performance of their models.

On the other hand, many people in the machine learning community focus on forecasting and don't worry about checking model assumptions, running tests of statistical significance, etc. Why not? Is it because the assumptions can be relaxed and we don't run into issues when we have lots of data (standard errors become small automatically, etc.)?
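To make the parenthetical about standard errors concrete, here is a minimal sketch of what I mean, assuming a simple simulated linear model with one predictor. The function name `slope_se`, the true slope of 2.0, and the sample sizes are just illustrative choices of mine. It only shows the standard error of the slope shrinking as the sample grows; it says nothing about whether the other assumptions hold.

```python
import numpy as np

rng = np.random.default_rng(1)

def slope_se(n):
    """Fit y = b0 + b1*x by ordinary least squares on simulated data
    and return the classical standard error of the slope b1."""
    x = rng.normal(size=n)
    y = 2.0 * x + rng.normal(size=n)           # assumed true slope: 2.0
    X = np.column_stack([np.ones(n), x])       # intercept + predictor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 2)           # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)      # classical OLS covariance
    return np.sqrt(cov[1, 1])

se_small = slope_se(100)
se_large = slope_se(10_000)
ratio = se_small / se_large                    # roughly sqrt(10_000 / 100)

print(f"SE with n=100:    {se_small:.4f}")
print(f"SE with n=10000:  {se_large:.4f}")
print(f"ratio:            {ratio:.1f}")
```

With 100 times more data the standard error should drop by roughly a factor of 10, since it scales like 1/sqrt(n).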

I would think that a model that does good inference would also be good at forecasting, since forecasting is about the future state of the same population. If the inference is right, then the forecasts should tend to be right too, so good inference and good forecasting don't appear mutually exclusive to me.

Or is it possible to have a model with great predictive performance but very poor inference? And, conversely, a model that does great inference but is terrible at forecasting?
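To make the first case concrete, here is a rough sketch of the kind of situation I have in mind. It assumes two nearly identical (highly collinear) predictors in a simulated linear model; the function `fit_once`, the true coefficients of 1.0, and the noise levels are all hypothetical choices for illustration. The idea is to refit the model on many fresh samples and compare how stable the individual coefficients are with how well the model predicts on new data.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n):
    """Draw n observations with two almost identical predictors."""
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)        # x2 is nearly a copy of x1
    y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])  # intercept, x1, x2
    return X, y

def fit_once(n=200):
    """Fit OLS on a training sample, score it on a fresh test sample."""
    X_train, y_train = simulate(n)
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    X_test, y_test = simulate(n)
    mse = np.mean((y_test - X_test @ beta) ** 2)
    return beta, mse

results = [fit_once() for _ in range(200)]
betas = np.array([b for b, _ in results])
mses = np.array([m for _, m in results])

coef_sd = betas[:, 1].std()                    # spread of the x1 coefficient
sum_sd = (betas[:, 1] + betas[:, 2]).std()     # spread of b1 + b2
mean_test_mse = mses.mean()                    # noise variance is 1.0

print(f"sd of x1 coefficient across refits: {coef_sd:.2f}")
print(f"sd of (b1 + b2) across refits:      {sum_sd:.2f}")
print(f"mean test MSE:                      {mean_test_mse:.2f}")
```

If I understand it correctly, the individual coefficient on `x1` swings wildly from sample to sample (so any statement about its value is unreliable), while the sum of the two coefficients is stable and the test error stays close to the noise variance of 1. That would be a model that predicts well but is poor for inference about each variable separately.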

Thank you
