sheelbe999
Here's the problem I'm trying to solve:
We have an N-dimensional surface. We do not know the form of this surface, but we have datapoints that are likely to be close to it (though not guaranteed to be on it), along with some outliers. What I want to do is determine where in N-dimensional space the maxima and minima are, both global and local. I only want to find the statistically significant maxima and minima.
Here's what I'm currently doing: I generate k nodes at random in N-dimensional space, then use the k-means algorithm to cluster the datapoints into k populations around these nodes, and then look at the average value in each cluster. This seems to work quite well for now; however, there is still the issue of determining whether a maximum/minimum is statistically significant (far enough from the average value, with enough data points for confidence). In addition, I would eventually like to come up with a smart way of discarding dimensions that are uninformative for locating maxima/minima.
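To make the approach above concrete, here is a minimal sketch of it in Python with NumPy. Several things in it are assumptions rather than part of my actual setup: the initial nodes are drawn from the datapoints instead of from arbitrary positions in space, "significant" is taken to mean that a cluster's mean value differs from the mean of all other points under a Welch-style statistic with a normal approximation, and a Bonferroni correction is applied across the k clusters. The names `kmeans`, `significant_clusters`, `alpha` and `min_size` are made up for illustration. Note that it only flags clusters that are significantly high or low; it does not prove that a flagged cluster is a true local extremum.

```python
import math
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm. Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct datapoints rather than random positions.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    # Final assignment so that labels match the returned centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)


def significant_clusters(X, y, k, alpha=0.05, min_size=10, seed=0):
    """Flag clusters whose mean of y differs significantly from the rest."""
    centers, labels = kmeans(X, k, seed=seed)
    flagged = []
    for j in range(k):
        inside, outside = y[labels == j], y[labels != j]
        if len(inside) < min_size or len(outside) < 2:
            continue  # too few points to say anything with confidence
        # Welch-style standard error of the difference in means.
        se = math.sqrt(inside.var(ddof=1) / len(inside)
                       + outside.var(ddof=1) / len(outside))
        if se == 0.0:
            continue
        z = (inside.mean() - outside.mean()) / se
        # Two-sided p-value under a normal approximation (an assumption that
        # is only reasonable when clusters are fairly large).
        p = math.erfc(abs(z) / math.sqrt(2.0))
        if p < alpha / k:  # Bonferroni correction over the k clusters
            flagged.append({"center": centers[j], "n": len(inside),
                            "kind": "high" if z > 0 else "low",
                            "z": z, "p": p})
    return flagged


# Synthetic check: one bump at (1.5, 1.5), one dip at (-1.5, -1.5), plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(2000, 2))
peak, dip = np.array([1.5, 1.5]), np.array([-1.5, -1.5])
y = (np.exp(-((X - peak) ** 2).sum(axis=1))
     - np.exp(-((X - dip) ** 2).sum(axis=1))
     + rng.normal(0.0, 0.1, size=len(X)))
found = significant_clusters(X, y, k=8)
for c in found:
    print(c["kind"], np.round(c["center"], 2), c["n"])
```

Using the mean makes the statistic sensitive to outliers; swapping in a median or trimmed mean per cluster would be one way to reduce that, at the cost of needing a different test.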
I know this is a complex problem to solve, so any help would be appreciated. I wouldn't mind if I were only able to find the global maximum/minimum with statistical significance; that would still put me further along than I am now.