Find the best function grouping with an iterative method

Click For Summary

Discussion Overview

The discussion revolves around the challenge of grouping 300 functions into 10 subgroups of 30 functions each, with the objective of minimizing the sum of the areas subtended by these functions. Participants explore iterative methods and algorithms suitable for this optimization problem, considering constraints and computational feasibility.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant suggests calculating the area for each function and proposes a cost function to minimize the sum of these areas across groups.
  • Another participant mentions the potential use of evolutionary algorithms or simulated annealing to tackle the problem, noting the vast search space involved.
  • A later reply introduces an additional constraint where functions should be oppositely oriented with respect to the X-axis, aiming for a total area of zero within each subgroup.
  • Participants discuss the implications of adding this constraint, highlighting the increased complexity due to the need to compute pairwise areas between functions.
  • Alternative formulations for the cost function are proposed, including adjustments to account for the new constraint on function orientation.

Areas of Agreement / Disagreement

Participants express varying opinions on the best approach to the problem, with no consensus on a single method or solution. The discussion reflects multiple competing views on how to incorporate additional constraints and the associated computational challenges.

Contextual Notes

The discussion acknowledges the limitations of computational resources and the complexity introduced by additional constraints, such as the need to calculate numerous pairwise areas.

MathBoard001
Messages
2
Reaction score
0
Hello to everyone!
I'm having a problem, banal in form, but perhaps complex in practical application.
Let's suppose we have 300 functions $$y = f (x)$$ of random trend and we want to group these functions into 10 subgroups (each group consisting of 30 functions).
However, this grouping must not be random, but such that the sum of the areas subtended by these functions is the 'minimum' possible in each group (there should be a sort of balance between the 10 soubgroups, preferably the best balance).
The ideal solution is to have $$Area = 0 $$ for each subgroup!

According to you which iterative method, it would be preferable for a problem like this?

Comparing the functions one by one is definitely unthinkable but of course I would like to use a robust algorithm that programatically would be able to subgroup in a good (not necessary perfect) way.

Do you have any precious suggestions?

ps. Sorry for my English, maybe not perfect!
 
Last edited:
Mathematics news on Phys.org
That's a very interesting problem! Thanks for posting! Here are a few ideas:

1. Calculate the area for each function, call it $A_i$, for $1\le i\le 300$. We'll denote each function by $f_i$.
2. You have ten groups, and your goal is to assign thirty functions to each group so as to get the sum of the areas as close to zero as possible. This would suggest a cost function of the form
$$C=\sum_{\text{groups}}\left[\,\sum_{f_i\in\;\text{group}}A_i\right]^2.$$
You have to square the inner sum, or take its magnitude, or else the algorithm will happily try to make the sums as negative as possible. This cost function is what you want to minimize. Now, your search space is absolutely gargantuan:
$$\frac{300!}{(30!)^{10}}\approx 1.78\times 10^{290}.$$
An exhaustive search is not possible, as you've already mentioned.

I would probably use an evolutionary algorithm, maybe with simulated annealing, to solve this one. But there's certainly no guarantee you'll find the absolute best possible solution.

Thankfully, this is not a 2SAT or 3SAT kind of problem, where whether one assignment of T/F to the variables gives you zero feedback on whether your next guess is better or worse.
 
That's really interesting. I've deepened the 'simulated annealing' and I think that it could work if we start with a reasonable starting condition. I really appreciated your idea.
I would like to deepen the problem in this way:
Let's suppose we need to add a further constraint in this method. The constraint is that the functions should be as much opposite to X axis as possible between each other.
For example if we consider 4 functions (instead of 30) in the subgroup, one ideal combination is this one:

View attachment 8041
In this case, as you can see:

1. Total Area = 0 [positive area - negative Area = 0]
2. Function 1 is opposite to Function 2 and Function 3 is opposite to Function 4.

Then there is a perfect match between the 4 functions!
I would like something similar to that (not necessary a perfect opposition!). Just to be clear, if we have group of two functions; the positive maximum of the first function should correspond to the negative minimum of the second function and so on!

I was thinking to split the domain in more parts and perform a check in the single domains, but maybe this is a massive operation.

Do you have any suggestion?
 

Attachments

  • figurae_example.png
    figurae_example.png
    5.2 KB · Views: 137
Last edited:
What's the bigger context of this problem? I think what you're asking is going to increase the complexity of the problem a great deal, because now you have a lot more areas to calculate in advance. You'd have to something like define the $i,j$ area:
$$A_{ij}=\int_{x\,\in\,\mathcal{D}(f_i)\cap\mathcal{D}(f_j)}[f_i(x)-f_j(x)]\,dx,$$
so instead of three hundred areas, you now have $44700$ areas to compute! (You can divide the $300\times 300$ by two because $A_{ij}=-A_{ji}$, and then subtract $300$ because $A_{ii}=0$.) Maybe that's ok, but that's a lot of computation. You could re-cast your cost function like this:
$$C=\sum_{\text{groups}}\;\sum_{\begin{array}{c}f_i,\, f_j\;\in\;\text{group}\\ i<j\end{array}}A_{ij}^2.$$
I'm moving the square inside, because we don't want one of the $A_{ij}$ to cancel with another one. Another option is to define
$$A_{ij}=\int_{x\,\in\,\mathcal{D}(f_i)\cap\mathcal{D}(f_j)}[f_i(x)-f_j(x)]^2\,dx,$$
and then not bother squaring in the cost function. This approach will more closely line up your functions across the $x$ axis, but it will be more computationally expensive, as I've pointed out.
 

Similar threads

Replies
4
Views
2K
  • · Replies 3 ·
Replies
3
Views
3K
Replies
5
Views
1K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 15 ·
Replies
15
Views
3K
  • · Replies 8 ·
Replies
8
Views
3K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 1 ·
Replies
1
Views
2K