MHB Find the best function grouping with an iterative method

AI Thread Summary
The discussion revolves around grouping 300 functions into 10 subgroups of 30 functions each, aiming to minimize the sum of the areas under the curves in each group. An evolutionary algorithm, particularly simulated annealing, is suggested as a viable iterative method due to the vast search space involved. The complexity increases with an additional constraint requiring functions in each group to be oppositely oriented relative to the x-axis, necessitating the calculation of pairwise areas between functions. The proposed cost function adjustments aim to ensure that the functions are balanced and oppositely aligned, although this approach significantly raises computational demands. The conversation highlights the challenges of achieving an optimal grouping while maintaining computational feasibility.
MathBoard001
Messages
2
Reaction score
0
Hello to everyone!
I'm having a problem, banal in form, but perhaps complex in practical application.
Let's suppose we have 300 functions $$y = f (x)$$ of random trend and we want to group these functions into 10 subgroups (each group consisting of 30 functions).
However, this grouping must not be random, but such that the sum of the areas subtended by these functions is the 'minimum' possible in each group (there should be a sort of balance between the 10 soubgroups, preferably the best balance).
The ideal solution is to have $$Area = 0 $$ for each subgroup!

According to you which iterative method, it would be preferable for a problem like this?

Comparing the functions one by one is definitely unthinkable but of course I would like to use a robust algorithm that programatically would be able to subgroup in a good (not necessary perfect) way.

Do you have any precious suggestions?

ps. Sorry for my English, maybe not perfect!
 
Last edited:
Mathematics news on Phys.org
That's a very interesting problem! Thanks for posting! Here are a few ideas:

1. Calculate the area for each function, call it $A_i$, for $1\le i\le 300$. We'll denote each function by $f_i$.
2. You have ten groups, and your goal is to assign thirty functions to each group so as to get the sum of the areas as close to zero as possible. This would suggest a cost function of the form
$$C=\sum_{\text{groups}}\left[\,\sum_{f_i\in\;\text{group}}A_i\right]^2.$$
You have to square the inner sum, or take its magnitude, or else the algorithm will happily try to make the sums as negative as possible. This cost function is what you want to minimize. Now, your search space is absolutely gargantuan:
$$\frac{300!}{(30!)^{10}}\approx 1.78\times 10^{290}.$$
An exhaustive search is not possible, as you've already mentioned.

I would probably use an evolutionary algorithm, maybe with simulated annealing, to solve this one. But there's certainly no guarantee you'll find the absolute best possible solution.

Thankfully, this is not a 2SAT or 3SAT kind of problem, where whether one assignment of T/F to the variables gives you zero feedback on whether your next guess is better or worse.
 
That's really interesting. I've deepened the 'simulated annealing' and I think that it could work if we start with a reasonable starting condition. I really appreciated your idea.
I would like to deepen the problem in this way:
Let's suppose we need to add a further constraint in this method. The constraint is that the functions should be as much opposite to X axis as possible between each other.
For example if we consider 4 functions (instead of 30) in the subgroup, one ideal combination is this one:

View attachment 8041
In this case, as you can see:

1. Total Area = 0 [positive area - negative Area = 0]
2. Function 1 is opposite to Function 2 and Function 3 is opposite to Function 4.

Then there is a perfect match between the 4 functions!
I would like something similar to that (not necessary a perfect opposition!). Just to be clear, if we have group of two functions; the positive maximum of the first function should correspond to the negative minimum of the second function and so on!

I was thinking to split the domain in more parts and perform a check in the single domains, but maybe this is a massive operation.

Do you have any suggestion?
 

Attachments

  • figurae_example.png
    figurae_example.png
    5.2 KB · Views: 119
Last edited:
What's the bigger context of this problem? I think what you're asking is going to increase the complexity of the problem a great deal, because now you have a lot more areas to calculate in advance. You'd have to something like define the $i,j$ area:
$$A_{ij}=\int_{x\,\in\,\mathcal{D}(f_i)\cap\mathcal{D}(f_j)}[f_i(x)-f_j(x)]\,dx,$$
so instead of three hundred areas, you now have $44700$ areas to compute! (You can divide the $300\times 300$ by two because $A_{ij}=-A_{ji}$, and then subtract $300$ because $A_{ii}=0$.) Maybe that's ok, but that's a lot of computation. You could re-cast your cost function like this:
$$C=\sum_{\text{groups}}\;\sum_{\begin{array}{c}f_i,\, f_j\;\in\;\text{group}\\ i<j\end{array}}A_{ij}^2.$$
I'm moving the square inside, because we don't want one of the $A_{ij}$ to cancel with another one. Another option is to define
$$A_{ij}=\int_{x\,\in\,\mathcal{D}(f_i)\cap\mathcal{D}(f_j)}[f_i(x)-f_j(x)]^2\,dx,$$
and then not bother squaring in the cost function. This approach will more closely line up your functions across the $x$ axis, but it will be more computationally expensive, as I've pointed out.
 
Suppose ,instead of the usual x,y coordinate system with an I basis vector along the x -axis and a corresponding j basis vector along the y-axis we instead have a different pair of basis vectors ,call them e and f along their respective axes. I have seen that this is an important subject in maths My question is what physical applications does such a model apply to? I am asking here because I have devoted quite a lot of time in the past to understanding convectors and the dual...
Insights auto threads is broken atm, so I'm manually creating these for new Insight articles. In Dirac’s Principles of Quantum Mechanics published in 1930 he introduced a “convenient notation” he referred to as a “delta function” which he treated as a continuum analog to the discrete Kronecker delta. The Kronecker delta is simply the indexed components of the identity operator in matrix algebra Source: https://www.physicsforums.com/insights/what-exactly-is-diracs-delta-function/ by...
Back
Top