Optimization problem with multiple outputs: impossible?

  • #1
FranzS
54
12
TL;DR Summary
Optimization problem, multiple outputs desired (best data subdivisions): is there a way to address it?
Hello,

I'm facing a practical optimization problem for which I don't know whether a standard approach exists or not.
I would have liked to rephrase the problem in a more general way, for the sake of "good math", but I'm afraid I would leave out some details that might be relevant. So, I'm going to explain the specific real-life application and then I'll pose the related problem. Hopefully this won't be too tedious!1. Application

I need to machine (on a CNC lathe) a number ##N## of different part types (beware, ##N## is not the batch size, it's the number of the different part drawings), which are rings with a rectangular cross-section (i.e. solids of revolution [for a full 360°] where the rectangular cross-section is not adjacent to the axis of revolution), each identified by the triplet ##(d_i, D_i, h_i)##, where I'm calling ##d_i## the internal diameter, ##D_i## the external diameter, ##h_i## the height/thickness of the cross-section (the longitudinal one, as the radial thickness is already defined by ##d_i## and ##D_i##) and finally ##i=1, 2, ..., N## is just the index that identifies the specific part drawing.
Actually, ##d_i, D_i, h_i## aren't really the final dimensions: their values already include the machining allowance (i.e. they are the dimensions of the raw material, as if each of the part types had its own individual one – which is not what I want, see "2. Scope and Examples").
Also, I'm not considering additional parameters such as yearly sales forecast for each size (for simplicity and also because it's actually pretty close for all sizes).2. Scope and Examples

I have to select the proper dimensions (internal diameter and external diameter) for the raw material (steel tubes) in order to minimize scrap material during CNC machining.
The point is that – as anticipated – I don't want to select a different raw material size for each individual ##i##-th part type, but I want to group several part types "under certain raw material sizes". This will of course increase scrap material, but pricewise it's still more convenient than buying ##N## individual raw material sizes (as they're going to be custom-made by the supplier: the fewer the chosen sizes and the higher the quantity [kilograms or meters] for each size, the lower the price).

I will call ##M## the number of chosen raw material sizes (theoretically, I'll have ##1 \leq M \leq N##). In other words, ##M## is the number of disjoint subsets I want to define.
So, I'll start by listing the ##N## triplets ##(d_i, D_i, h_i)## in a precise order (for example, in a column) so that ##d_i \leq d_{i+1}## and ##D_i \leq D_{i+1}## ##\forall i##. It just so happens that the case ##d_i=d_{i+1} \wedge D_i=D_{i+1}## never occurs. Thus, no ordering needs to be specified for ##h_i##. The list will then be:

$$(d_1, D_1, h_1)$$
$$(d_2, D_2, h_2)$$
$$...$$
$$(d_i, D_i, h_i)$$
$$...$$
$$(d_N, D_N, h_N)$$

As said, this list will be divided in a number ##M## of disjoint subsets and each subset will contain adjacent triplets only.
I'll write a couple of simple examples for better clarity.

Example 1 (trivial, unwanted solution)
Let's say ##N=20## and ##M=1##. This is like saying a single raw material size will be chosen for machining all part types.
Thus, the raw material (steel tube) will have to have the smallest internal diameter (which is ##d_1##) and the largest external diameter (which is ##D_{20}##). Please remember that material allowance for proper CNC machining is already accounted for.

Example 2 (trivial, unwanted solution)
On the opposite side of the spectrum, let's still consider ##N=20## but now with ##M=20## as well. This is like saying I want to select a different raw material size for each individual ##i##-th part type (that's exactly what I wrote I do not want to do in the real-life application).
In this case, I will have ##N## different raw material sizes, one for each of the ##N## different part types and with the very same dimensions as well.

Example 3 (possible acceptable solution)
Let's get closer to what I'd like to achieve by considering ##N=20## and ##M=4##. This is like saying four raw material sizes will be properly chosen, each for machining the corresponding subset of part types.
For example, the best solution for scrap material minimization could be:
– Raw material size #1 (smallest): internal diameter ##d_1##, external diameter ##D_4##
– Raw material size #2: internal diameter ##d_5##, external diameter ##D_{11}##
– Raw material size #3: internal diameter ##d_{12}##, external diameter ##D_{16}##
– Raw material size #4 (largest): internal diameter ##d_{17}##, external diameter ##D_{20}##

Please note!
It is always the case that the internal diameter of the smallest raw material size is ##d_1##.
It is always the case that the external diameter of the largest raw material size is ##D_N##.
It is always the case that, if a certain raw material size has an external diameter ##D_i##, the next larger raw material size will have an internal diameter ##d_{i+1}## (i.e. the subsets are disjoint and "adjacent to each other").
All this holds true for any value of ##M##.3. Problem

Given ##1<M<N## (given/decided in advance a value for ##M## that is strictly greater than ##1## and strictly less than ##N##), how can I find the exact boundaries of the ##M## subsets that minimize scrap material?
For example, going back to the specific Example 3 above (with ##N=20## and ##M=4##), is there a way to find the optimal values ##j##, ##k## and ##l## (there are ##M-1## of them) that will allow me to select the following raw material optimized sizes?
– Raw material size #1 (smallest): internal diameter ##d_1##, external diameter ##D_j##
– Raw material size #2: internal diameter ##d_{j+1}##, external diameter ##D_k##
– Raw material size #3: internal diameter ##d_{k+1}##, external diameter ##D_l##
– Raw material size #4 (largest): internal diameter ##d_{l+1}##, external diameter ##D_{20}##

By the way, according to the geometry of the problem, and considering a specific example with ##M=4##, the function ##S## to be minimized (total volume of scrap material for all part types) should be, if I'm not mistaken:

$$
S(j,k,l) =
\left(
\sum_{i=1}^j
\frac{\pi}{4}
h_i
\left(
d_i^2 - d_1^2 + D_j^2 - D_i^2
\right)
\right)
+
\left(
\sum_{i=j+1}^k
\frac{\pi}{4}
h_i
\left(
d_i^2 - d_{j+1}^2 + D_k^2 - D_i^2
\right)
\right)
$$
$$
+
\left(
\sum_{i=k+1}^l
\frac{\pi}{4}
h_i
\left(
d_i^2 - d_{k+1}^2 + D_l^2 - D_i^2
\right)
\right)
+
\left(
\sum_{i=l+1}^N
\frac{\pi}{4}
h_i
\left(
d_i^2 - d_{l+1}^2 + D_N^2 - D_i^2
\right)
\right)
$$

How would you go on from here?
Of course I'm not interested in a trial & error / brute force method.
I sincerely don't know if an analytic method exists and how it could possibly give ##M-1## results "at once" (via a system of equations? How to get there???).
What I'm sure about is I don't know how to approach this problem. I've not in my life studied/explored any advanced math environments/theories that might apply to this case.
Digression: this also reminds me of similar (?) optimization problems that I've faced in the past (and not solved!). For example, I was trying to find the best "two-piece" polynomial approximation of a certain function ##f(x)## on an interval ##x \in [a,b]## (linear approximation for ##a<x<c## and then quadratic approximation for ##c<x<b##), searching for the optimized value of ##c##.4. My Attempt at Solving a Simplified Case with ##M=2##
(and Why I Think I Can't Solve the Cases with ##M>2##)


NOTE: this is only for the brave (you already have my full appreciation if you stayed with me till here!)

Well, for this simplified example, I will assume that all the ##N## part types have the same height/thickness ##h##.
Thus, ##h## becomes irrelevant and the total amount of scrap material for all part types does not need to be intended as "scrap volume" anymore, but rather as "scrap area" only (I hope the concept is clear).
By the way, this is actually a good approximation in the real-life application, as all ##h_i## are very similar to each other.
With this assumption, and considering ##M=2## (i.e. I want two raw material sizes to be used for the production of all the ##N## part types), I now have to minimize the following function ##S## (total "scrap area"):

$$
S(j) =
\left(
\sum_{i=1}^j
\frac{\pi}{4}
\left(
d_i^2 - d_1^2 + D_j^2 - D_i^2
\right)
\right)
+
\left(
\sum_{i=j+1}^N
\frac{\pi}{4}
\left(
d_i^2 - d_{j+1}^2 + D_N^2 - D_i^2
\right)
\right)
$$

Then, I can simplify the expression above by ditching the common factor ## \pi / 4 ## that appears everywhere (I will call this new function ## \sigma ## instead of ##S##, as it's not the same anymore despite being still equivalent to it for my optimization purposes) and by rearranging the sums in order to isolate the constant terms:

$$

\sigma (j) =

\left(
\sum_{i=1}^j
D_j^2
\right)

-

\left(
\sum_{i=1}^j
d_1^2
\right)

+

\left(
\sum_{i=j+1}^N
D_N^2
\right)

-

\left(
\sum_{i=j+1}^N
d_{j+1}^2
\right)

-

\left(
\sum_{i=1}^N
D_i^2 - d_i^2
\right)

$$

The last sum can be calculated promptly, as it depends only on the dimensions of all the ##N## part types, which are known values since the beginning. For a shorter notation, I will call this sum ## \Delta_0 ##.
The second and third sums only involve the constants ##d_1## and ##D_N## respectively, hence we can rewrite them in a more compact form.
The same applies to the remaining two sums, which involve ##D_j## and ##d_{j+1}##, which are independent from the index of summation ##i##.
So, the expression becomes:

$$

\sigma (j) = j D_j^2 - (N-j) d_{j+1}^2 - j d_1^2 + (N-j) D_N^2 - \Delta_0

$$

Now, for the crucial part of the solution.
I wrote the function ##\sigma## as ##\sigma(j)##, but it seems like it depends also on ##D_j## and ##d_{j+1}##.
However, despite ##j## being just an index, I should manage to find a relation between ##j## and ##D_j##, and another relation between ##j## and ##d_{j+1}##.
In the real-life case, it turns out (luckily) that both those relations can be well approximated by linear functions. Let's call ##u(j)=u_0+u_1 j## the linear function of ##j## that well approximates ##D_j##, and ##v(j)=v_0+v_1 j## the linear function of ##j## that well approximates ##d_{j}## (we'll then input ##j+1## to obtain ##v(j+1) \approx d_{j+1}##). All coefficients ##u_0, u_1, v_0, v_1## are to be found with linear regression from the initial data (dimensions of all part types).
With these approximations, ##\sigma## becomes an explicit function of ##j## alone. Differentiate ##\sigma(j)## with respect to ##j##, cross your fingers and hope for the best.

Well, actually, I'm not really sure about this. For the linear approximations, I've got a vague feeling that we are applying the two linear regressions in the "##i## world" instead of the "##j## world". Sloppy way of expressing myself, I know, but I hope you get the idea.
My doubts also arise from the fact that – if I apply the same reasoning in the case with ##M>2## (so that I will have more than one "boundary index", i.e. ##j##, ##k##, etc.) – the ##j##, ##k##, etc. indexes themselves will be indistinguishable using my method.
Please enlighten me!THE END
My sincere apologies for the lengthy post, and my biggest "thank you" to all of you who took the time to read it.
 
Last edited:
  • Like
Likes WWGD and berkeman
Mathematics news on Phys.org
  • #2
Say we have constant ##h##, and ##M## different supplied block shapes and ##N## different shapes to produce, and a vector ##\vec a\in \mathbb N^{M+1}## such that ##0=a_0<a_1<...< a_M=N##, which specifies that the ##(k+1)##-th block shape has inner and outer radii ##d_{a_k+1}, D_{a_{k+1}}##.

Then wastage will be proportional to:
$$\sum_{k=0}^{M-1} \sum_{i=a_k + 1}^{a_{k+1}}
\left({d_i}^2 - {d_{a_k+1}}^2 + {D_{a_{k+1}}}^2 - {D_{i}}^2\right)
=
\sum_{i=1}^N\left({d_i}^2 - {D_{i}}^2\right)
+
\sum_{k=0}^{M-1} \sum_{i=a_k + 1}^{a_{k+1}}
\left({D_{a_{k+1}}}^2 - {d_{a_k+1}}^2\right)
$$
The first term is independent of ##\vec a## so we minimise wastage by minimising
$$\sum_{k=0}^{M-1} \sum_{i=a_k + 1}^{a_{k+1}}
\left({D_{a_{k+1}}}^2 - {d_{a_k+1}}^2\right)
=
\sum_{k=0}^{M-1} \left(a_{k+1} - a_k - 1\right)
\left({D_{a_{k+1}}}^2 - {d_{a_k+1}}^2\right)
$$
You have further narrowed it down by constraining the ##d_i,D_i## to be linear functions of ##i## so that ##d_i=v_0 + v_1\, i, D_i = u_0+u_1\,i##. Then we need to choose ##\vec a## to minimise:
$$\sum_{k=0}^{M-1}
\left(a_{k+1} - a_k - 1\right)
\left(
\left(u_0+u_1 a_{k+1} \right)^2 -
\left(v_0+v_1 (a_k+1) \right)^2
\right)
$$
It's not a solution, but it covers the ##M##-dimensional case with a reasonably simple formula. I think it should be possible to proceed further from there.
I intuit that the presence of the squares will make the optimal answer have gaps between consecutive components of ##\vec a## that narrow as ##i## increases. So the increments of block radii will decrease as the blocks get larger - the opposite of how it works for cogs on a bike cassette. I think we should be able to get a rough formula to implement that.

But it's late here so I won't try right now.
 
  • Like
Likes FranzS
  • #3
I realised we can use the last formula above directly in a non-linear optimiser to find a result.
I used the R package nlm, with the following code:
R code to optimise selection of input blocks:
library(nlm)

# input parameters ----------

M <- 3 # number of different input shapes
N <-  20 # number of different output shapes

# parameters for output shapes
v0 <- 0
u0 <- 1
v1 <- u1 <- 1

# express guess as increments of index, not as absolute indices.
startguess <- c(10, 6)

if (any(startguess < 0) | sum(startguess) >= N)
  stop('capt says startguess is not valid')

# wastage function ---------
wastage <- function(avec){
  # avec is a vector of length M-1, with avec[k] = a_k
  x <- cumsum(avec)
  a <- c(0, x)
  a1 <- c(x, N)
  if (any(avec < 0) | any(x >= N) )
    # input outside of constraints, so return huge wastage value to disqualify this input
    return(10^8) else
      # calc the wastage properly
      return( sum((a1 - a - 1) * ( (u0 + u1 * a1)^2 - (v0 + v1 * (a + 1) )^2 ) ) )
}
# perform nonlinear optimisation 
x <- nlm(f = wastage, p = startguess )
#  print results
print(x)

This has N=20 output shapes, having inner and outer diameters of 1, 2, ..., 20 and 2, 3, ..., 21 respectively, and allows M = 3 different input shapes. Following my logic above, I assumed the diameter increments between input shapes should reduce for larger shapes, and so guessed indices of 10, 16, 20. We ignore the 20 because that always has to be the biggest index, so that leaves our guess as 10, 16. For the program we express these as increments rather than absolute indices, so the starting guess for the optimiser (variable 'startguess') has two elements: 10 = 10 - 0 and 6 = 16 - 10.

It gave the following output:
Optimiser results:
$minimum
[1] 2311.83

$estimate
[1] 8.935301 6.015709

$gradient
[1] -1.763963e-04 -1.776443e-05

$code
[1] 1

$iterations
[1] 8
The result is in $estimate and to the nearest integer is 9 and 6, which gives indices of 9, 15, 20. So our initial guess of 10, 16, 20 was not bad.

I haven't tried it but the code should work for any valid combination of N, M, u0, u1, v0, v1.
 
  • Like
Likes FranzS
  • #4
First of all, thank you so much @andrewkirk for taking your time to study this "obsession" of mine.

I'd like to update this thread with some further research I've done lately, in case anyone is curious about it.
I'm analyzing a simpler case, for which I can get a correct solution when ##M=2## but not when ##M=3## (or greater).Introduction

In order to get some insights about this optimization problem, I've been analyzing a "much simpler" case: instead of finished rings machined from raw tubes, I considered finished rods machined from raw solid (round) bars. Thus, the machining now consists in "peeling off" material from the raw bar to get the finished diameter of the rod (in other words, it's about reducing the raw bar diameter and considering the scrap material formed during this operation).

So, I want to machine rods with ##N## different diameters ##D##:
$$ D_1, D_2, ..., D_i, ..., D_N $$
I don't want to purchase raw bars with ##N## different diameters (one for each of the finished rods), but I want to restrict the number of raw sizes to ##M## (with ##M<N##), and they will be used for machining all the ##N## finished rods.

Please note that the largest raw size (diameter) will always be equal to ##D_N##, because it must allow me to machine (or better, to have it ready as it is***) the finished rod size ##D_N## itself.
*** Not really true, since all diameters ##D_i## actually include the machining allowance.

The diameters are listed/indexed so that ##D_i < D_{i+i} \; \forall i##.
Also, let's suppose that an explicit relation can be found between the index ##i## and the corresponding finished rod diameter ##D_i##. In other words, there exists a linear function ## f(i) = \beta_0 + \beta_1 \cdot i \; ## such that ##\; D_i \approx f(i)##.
Better still, let's suppose that there exists a quadratic function ## g(i) = \alpha_0 + \alpha_1 \cdot i + \alpha_2 \cdot i^2 \; ## such that ##\; {D_i}^2 \approx g(i)##.

Now, given a certain raw size ##D_j## and a certain finished size ##D_i## to be machined (of course with ##D_i \leq D_j##), the amount of scrap material (only considering the radial area removed, and neglecting the common constant factors ##\pi / 4##) is simply going to be:
$$ S_i = {D_j}^2 - {D_i}^2 \approx g(j)-g(i) $$

Case with ##M=2##

With ##M=2## raw sizes (namely, diameters ##D_j## and ##D_N##), I will be able to machine:
- all finished diameters ##D_i \leq D_j## using raw size ##D_j##
- all finished diameters ##D_{j+1} \leq D_i \leq D_N ## using raw size ##D_N##


The total amount of scrap material is then:
$$ S \; = \; \sum S_i \; = \; \sum_{i=1}^j {D_j}^2 - {D_i}^2 \; + \; \sum_{i=j+1}^N {D_N}^2 - {D_i}^2 \; = $$
$$ = \; j \cdot {D_j}^2 \; + (N-j) \cdot {D_N}^2 \; - \sum_{i=1}^N {D_i}^2 \; =$$
$$ = \; j \cdot {D_j}^2 \; - j \cdot {D_N}^2 \; + N \cdot {D_N}^2 \; - \sum_{i=1}^N {D_i}^2 $$
So, with our quadratic approximation ##g(j)## we have:
$$ S \; \approx \; j \cdot g(j) \; - j \cdot {D_N}^2 \; + N \cdot {D_N}^2 \; - \sum_{i=1}^N {D_i}^2
\; = j \cdot \left( \alpha_0 + \alpha_1 \cdot j + \alpha_2 \cdot j^2 \right) \; - j \cdot {D_N}^2 \; + \Delta_0
$$
where all constant terms (i.e. not related to ##j##) were grouped into the quantity ##\Delta_0##.

In order to minimize the total amount of scrap material ##S##, it's now time to differentiate the expression for ##S## with respect to ##j## and set it equal to zero:
$$
\frac{dS}{dj} = 0
\;
\;
\iff
\;
\;
3 \alpha_2 \cdot j^2 \; + \; 2 \alpha_1 \cdot j \; + \; \alpha_0 - {D_N}^2 \; = \; 0
$$
This is a standard quadratic equation in the variable ##j##, which we can solve with the quadratic formula.
The acceptable solution is the non-negative one, which is:
$$
j \; = \;
\frac{- \alpha_1+\sqrt{{\alpha_1}^2 - 3 \alpha_2 \cdot \left( \alpha_0 - {D_N}^2 \right)}}{3 \alpha_2}
$$

Specific example with ##M=2##

Let's suppose my ##N=19## finished diameters are:
$$D_i = 5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23$$
The units of measurement are not relevant, of course.
I chose this data set with "trivial" intervals (##1## between each) so that I have the exact equality ##D_j=g(j)=j^2+8j+16##.
If I solve the quadratic equation as per the procedure explained above, I get the solution ##j \approx 10.68##.
If I calculate by brute force the amount of scrap material for any scenario, I indeed get the minimum amount for ##j = 11## (I run the brute-force method on all integer raw sizes ##D_j## from ##5## to ##22##).

This was pretty simple and – apparently – it will always work flawlessly.Case with ##M=3##

With ##M=3## raw sizes (namely, diameters ##D_j##, ##D_k## and ##D_N##), I will be able to machine:
- all finished diameters ##D_i \leq D_j## using raw size ##D_j##
- all finished diameters ##D_{j+1} \leq D_i \leq D_k ## using raw size ##D_k##
- all finished diameters ##D_{k+1} < D_i \leq D_N ## using raw size ##D_N##


The total amount of scrap material is then:
$$
S
\;
=
\;
\sum_{i=1}^j {D_j}^2 - {D_i}^2
\;
+
\;
\sum_{i=j+1}^k {D_k}^2 - {D_i}^2
\;
+
\;
\sum_{i=k+1}^N {D_N}^2 - {D_i}^2
\;
=
$$
$$
=
\;
j \cdot {D_j}^2
\;
+
\;
(k-j) \cdot {D_k}^2
\;
+
\;
(N-k) \cdot {D_N}^2
\;
-
\;
\sum_{i=1}^N {D_i}^2
\;
=
$$
$$
=
\;
j \cdot {D_j}^2
\;
+
\;
k \cdot {D_j}^2
\;
-
\;
j \cdot {D_k}^2
\;
-
\;
k \cdot {D_N}^2
\;
+
\;
\Delta_0
$$
where all constant terms (i.e. not related to ##j## or ##k##) were grouped into the quantity ##\Delta_0##.
So, with our quadratic approximations ##g(j)## and ##g(k)## we have (after a few expansions and simplifications):
$$
S
\;
=
\;
\alpha_1 \cdot j^2
+
\alpha_2 \cdot j^3
+
\alpha_0 \cdot k
+
\alpha_1 \cdot k^2
+
\alpha_2 \cdot k^3
-
\alpha_1 \cdot j \cdot k
-
\alpha_2 \cdot j \cdot k^2
-
{D_N}^2 \cdot k
$$
By the way, I started sensing something wrong here. Namely, how is the requirement ##j<k## respected with such formulation? Also, the form of ##g(j)## and ##g(k)## is exactly the same but I feel something must be wrong.
However, I went on with the calculation, also taking comfort from some cancellations in the final expression for ##S## above (two opposite ##\alpha_0 \cdot j## terms had cancelled out), which – in my mind – "broke the symmetry" between ##j## and ##k## and could tame my doubts.

So, in order to minimize the total amount of scrap material ##S##, it's now time to differentiate (partial differentiation) the expression for ##S## with respect to both ##j## and ##k## and set the two resulting expressions equal to zero, and finally solve the system:
$$
\frac{\partial S}{\partial j}=0
\;
\iff
\;
3 \alpha_2 \cdot j^2
+
2 \alpha_1 \cdot j
-
\alpha_2 \cdot k^2
-
\alpha_1 \cdot k
\;
=
\;
0
$$
$$
\frac{\partial S}{\partial k}=0
\;
\iff
\;
3 \alpha_2 \cdot k^2
+
2 \alpha_1 \cdot k
-
2 \alpha_2 \cdot j \cdot k
-
\alpha_1 \cdot j
+
\alpha_o
-
{D_N}^2
\;
=
\;
0
$$
Now, this system leads to a quartic equation, which is still solvable in exact form. But I'm not stubborn or patient enough to use the quartic formula.
I think a viable solution would be to plot the curves of the two implicit equations in a "##j##-##k## diagram" and check for the intersection point where both ##j## and ##k## are positive.Specific example with ##M=3##

Again, let's suppose my ##N=19## finished diameters are:
$$D_i = 5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23$$
Again, I have the exact equality ##D_i=g(i)=i^2+8i+16##.

If I plot (courtesy of Desmos.com) the the two implicit equations that I get following the procedure above, I get the following graph:

sistema j k.png

So, apparently, ##j \approx 7## and ##k \approx 13## are the solutions.

The correct solution that minimizes the amount of scrap material, calculated by brute force, is instead ##j=8## and ##k=14##. Close enough but not the same.
Hereafter is a graph of ##S## (total amount of scrap material, calculated by brute-force) vs. all combinations ##j##-##k##-##N## (from left to right on the "##x##" axis, combinations are "1-2-19", "1-3-19", ... , "1-18-19", "2-3-19", "2-4-19", ... , etc. etc. , ... , "16-17-19", "16-18-19", "17-18-19").

Senza titolo.png
CONCLUSIONS:

Were was the mistake in the approach used for ##M=3##?
As already said, I guess the formulation of the problem does not correctly take into account the requirement ##j<k##. However, the theoretical (wrong) solution respects this requirement.
Also, for the optimization problems I'm used to study (polynomial regression), I always am in the situation where taking partial derivatives and setting them equal to zero is legit. Is it legit here?

Again, thanks A LOT to everyone who's willing to chime in.
Have a nice day!
 
Last edited:
  • #5
Your formulas look correct, excluding the typo in the one containing ##\Delta_0##.
I think the error lies in your solving of the equations. Perhaps you coded something incorrectly.
When I solve for the intersection of the two curving lines I get j = 7.65, k = 13.72. The nearest grid point to that is (8, 14), which matches your brute force solution.

Here's the R code that generated my results:
Code:
N = 19
alpha2 = 1
alpha1 = 8
alpha0 = 16

g <- function(i)
  alpha2 * i^2 + alpha1 * i + alpha0
DNsq <- g(N)
S <- function(j, k)
  alpha1 * j^2 + alpha2 * j^3 +
  alpha1 * k^2 + alpha2 * k^3  -
  alpha1 * j * k -
  alpha2 * j * k^2 + alpha0 * k -
  k * DNsq 

while(length(dev.list()) > 0)
  dev.off()

# have a look at the scrap matrix near the expected optimal point
j = 6:9
k = 12:15
x <- outer(j, k, S)
dimnames(x) = list(j = j, k = k)
print(x)

# have another look, more zoomed in
j0 = 7.65
k0 = 13.72
j = j0 + 0.01 * seq(-2, 2)
k = k0 + 0.01 * seq(-2, 2)
x <- outer(j, k, S)
dimnames(x) = list(j = j, k = k)
print(x)

# Generate data for a 3D plot
j <- seq(6, 9, length.out = 150)
k <- seq(12, 15, length.out = 150)
s <- outer(j, k, S)

# Create a 3D plot
persp(j, k, s, theta = 30, phi = 30, expand = 0.5, col = "lightblue") ###########################

# Now create a contour plot
dev.new()
contour(j, k, s) ###########################
grid(nx = 32)

# Now plot the lines where partial derivatives are zero
dev.new()
j1 <- k1 <- seq(0, 25, length.out = 20)
Sj <- function(j, k)
  2 * alpha1 * j + 3 * alpha2 * j^2 - k * alpha1 - k^2 * alpha2
Sk <- function(j, k)
  2 * alpha1 * k + 3 * alpha2 * k^2 - j * alpha1 - 2 * k * j * alpha2 + alpha0 - DNsq
sk <- outer(j1, k1, Sk)
# draw first line
contour(j1, k1, sk, level = 0, col = 'blue')
sj <- outer(j1, k1, Sj)
grid(nx = 27) ###########################
# draw 2nd line
contour(j1, k1, sj, level = 0, add = T, col = 'red')

Here are scrap values at integer grid points near the solution. Note that (8, 14) is best.
Code:
   k
j      12    13    14    15
  6 -4212 -4254 -4214 -4086
  7 -4221 -4296 -4291 -4200
  8 -4172 -4280 -4310 -4256
  9 -4059 -4200 -4265 -4248
Here are scrap values zoomed in near the intersection point:
Code:
   k
j           13.7     13.71     13.72     13.73     13.74
  7.63 -4313.620 -4313.624 -4313.620 -4313.607 -4313.586
  7.64 -4313.622 -4313.630 -4313.629 -4313.620 -4313.603
  7.65 -4313.618 -4313.630 -4313.633 -4313.627 -4313.614
  7.66 -4313.609 -4313.623 -4313.630 -4313.628 -4313.618
  7.67 -4313.592 -4313.611 -4313.621 -4313.623 -4313.616
Here is a contour graph of the scrap function:
1698307514453.png


and here is a graph of the intersection point:
1698307556736.png
 
  • #6
Dear @andrewkirk , thanks a lot. I'll double check everything later.
 
  • #7
I've checked and I think I had Indeed written something wrong into Desmos, the equations are correct.
Again, thank you @andrewkirk for you super valuable review
 

Similar threads

Replies
1
Views
489
  • General Math
Replies
9
Views
1K
  • Computing and Technology
Replies
4
Views
845
Replies
1
Views
1K
  • General Math
Replies
6
Views
1K
  • General Math
Replies
8
Views
1K
Replies
4
Views
1K
Replies
3
Views
819
  • Electrical Engineering
Replies
0
Views
403
  • Calculus and Beyond Homework Help
Replies
1
Views
578
Back
Top