- #1

- 2

- 0

Rayms

- Thread starter rayms

- #2

- 530

- 7

- #3

- 2

- 0

Thanks for the reply, bpet. I was already assuming that nobody cares about or reads my thread, but I exaggerate. The problem is that I cannot reduce the dimension of my data any further; it has already been reduced from the original. I guess I have to be more specific about what I'm using the data for. The data sets will be used to build regression models. One of my hypotheses is that the validity or predictive accuracy of the models depends on the internal structure of the data sets used. This internal structure can be described by their distributions and other mathematical properties. In other words, I'm trying to look at differences in the data structure of the two sets and relate these differences to the resulting models' performance.

- #4

- 530

- 7

The marginals (i.e. the individual variables) can be compared using the usual univariate nonparametric two-sample tests: Kolmogorov-Smirnov (KS), Anderson-Darling (AD), Cramér-von Mises (CvM), Mann-Whitney (MW), etc.
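As a rough illustration of the marginal comparison, here is a minimal Python sketch that computes the two-sample KS statistic for each variable and attaches a permutation p-value. Note that the permutation p-value is swapped in here for the usual asymptotic KS p-value to keep the code dependency-free. The function names (`ks_statistic`, `permutation_pvalue`, `compare_marginals`) and the row-list data layout are assumptions made for this example, not something from your setup.

```python
import random

def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past every observation equal to v.
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Return (D, p): the KS statistic and a permutation p-value.

    Assumption: observations are exchangeable (IID) under the null.
    """
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled = list(x) + list(y)
    n = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:n], pooled[n:]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

def compare_marginals(a, b, **kwargs):
    """a, b: lists of rows with the same columns. Returns one (D, p) per column."""
    return [permutation_pvalue(col_a, col_b, **kwargs)
            for col_a, col_b in zip(zip(*a), zip(*b))]
```

With many variables you would probably also want to adjust the per-variable p-values for multiple comparisons.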

If no significant differences are found in the marginals, and if the marginals are continuous, you could descale the data by converting the rank order of each variable to percentiles. Then try some graphical tools (such as parallel coordinates, Andrews plots, scatter plot matrices, etc.). There are also several multivariate distribution-free two-sample tests, but I don't know a lot about that area.
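The rank-to-percentile step could look something like the sketch below. It assumes each data set is a list of rows, ranks each variable within its own data set, and divides by n + 1 so the percentiles stay strictly between 0 and 1. That divisor is one common convention, not the only one, and the names `to_percentiles` and `descale` are made up for this example.

```python
def to_percentiles(column):
    """Map each value to rank / (n + 1); tied values share their average rank."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    pct = [0.0] * n
    pos = 0
    while pos < n:
        # Find the run of tied values starting at sorted position pos.
        end = pos
        while end + 1 < n and column[order[end + 1]] == column[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the run
        for k in range(pos, end + 1):
            pct[order[k]] = avg_rank / (n + 1)
        pos = end + 1
    return pct

def descale(rows):
    """Apply the percentile transform to every column of a list of rows."""
    columns = [to_percentiles(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

After this transform every variable is roughly uniform on (0, 1), so whatever differences remain between the two data sets in the plots come from the dependence structure rather than from location or scale.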

Also, the above assumes that your data sets consist of IID observations. For time series data, for example, other methods might be more suitable.

HTH
