How to optimise something when multiple parameters change the output


Discussion Overview

The discussion revolves around the challenges of optimizing systems with multiple parameters in engineering contexts, particularly focusing on neural networks and antenna design. Participants explore the complexities of parameter interactions and the difficulties in achieving optimal solutions when adjusting multiple variables simultaneously.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant describes the inherent difficulties in optimizing multiple parameters due to the vast number of combinations and the interdependencies between parameters, using neural networks and antenna design as examples.
  • Another participant suggests that for large-scale optimization, the goal is often to find a "good enough" solution rather than the global optimum, highlighting the existence of local optima.
  • Several optimization algorithms are mentioned, including BFGS, DFP, Nelder-Mead, and simulated annealing, with a focus on their ability to handle interactions between variables more effectively than optimizing one at a time.
  • A participant expresses confusion about using heuristic algorithms, such as Genetic Algorithms, for optimizing neural network topology, noting the complexity of applying such methods recursively.
  • There is mention of differing opinions on the optimal number of layers in neural networks, with one participant citing a recommendation for three hidden layers from a PhD student, while others suggest that a single layer can suffice as a universal approximator.
  • Concerns are raised about the lack of clear procedures in literature for optimizing neural network topologies beyond general rules of thumb, leading to frustration over potentially suboptimal designs in practical projects.

Areas of Agreement / Disagreement

Participants express a range of views on optimization strategies, with no consensus on a single best approach or solution. The discussion reflects differing opinions on neural network design and the effectiveness of various optimization algorithms.

Contextual Notes

Participants note limitations in existing literature regarding specific procedures for optimizing neural network topologies and the challenges of applying theoretical knowledge to practical design problems.

Who May Find This Useful

Readers interested in optimization techniques in engineering, particularly in neural networks and antenna design, may find the discussion relevant.

CraigH
This dilemma seems to occur all the time in so many different engineering problems. It seems impossible to optimise something with multiple parameters, for three reasons:

  • There are too many possible combinations of these parameters to be able to simulate them all
  • You optimise one parameter at a time, but then you don't know if your final result is the best possible one. For example: you start with a neural network with 5 layers and 4 neurons in each layer. You sweep the number of layers, plotting the accuracy of the network against the number of layers, and find that the optimum is 3 layers. You then optimise the number of neurons in the first layer and find that the optimum is 10 neurons, then the second layer (7 is best), then the third (6 is best). However, this might not be the best overall solution: the accuracy might be higher if you instead start with 3 neurons in the first layer, which is not the single-parameter optimum, and then optimise the second and third layers, ending up with different neuron counts but a much better accuracy than the first method gave.
  • You optimise parameter 1 that governs property X, and then you optimise parameter 2 that governs property Y, but then this has changed property X, so you go back and optimise the parameter 1, but then this changes property Y.
This seems like a very fundamental problem when designing any system in any area of engineering, and I thought that there may be a standard method of approaching this problem, or at least a few known methods that do a pretty good job. If there is a solution to this dilemma, can somebody please tell me?

Additional Details

The neural network example is the problem I am currently having, but I'll give another example of this problem I have had in the past. I was trying to design a patch antenna with a resonant frequency of 2GHz. The resonant frequency is mainly dependent on the width of the patch, and the gain of the antenna is mainly dependent on the insertion depth. In CST, I performed a sweep on the width and picked the value that gave the lowest S parameter at 2GHz. I then performed a sweep on the insertion depth and picked the value that reduced the S parameter to the lowest value (I decided that -40dB was acceptable). But this then changed the frequency at which this gain happens, so I did a sweep on the width again and picked the value that gave the lowest S parameter, but now the S parameter is too high again, so I optimise the insertion depth... and so on.
 
I don't know much about neural networks, but I would guess selecting a good topology for a network is a practical problem that would be covered in books on the subject.

For "large scale" optimization problems, the objective is usually to get a solution that is "good enough" rather than try to fund the global optimum solution. There can be many "local" optimum solutions with not much to choose between them.

If you have n variables to optimize, you can consider each set of n values as a geometrical point in n-dimensional space. Visualizing how algorithms work is easy when n = 2 (e.g. draw something that looks like a contour map, and the optimum is the highest or lowest point on the map). When n > 3, drawing pictures is hard, but the math works the same way for any value of n.

In general "optimizing one variable at a time" is very inefficient for the reason you discovered: the variables interact with each other. When n = 2, this is like trying to get to the top of a mountain by only moving north/south or east/west. If the mountain is a long "ridge" running from northeast to southwest, for example, that is obviously a bad plan.

Better methods attempt to find "search directions" which are linear combinations of the variables, such that the interaction between the directions is small (and ideally zero).

There are several well-known algorithms for this. My personal favorite is BFGS (named after the four people who invented it). Another popular one is DFP (the "F" is the same person in each).

A different approach is to try to find a region that contains the optimum, and then subdivide it into smaller regions. One version of this is the Nelder-Mead algorithm.

Yet another way is simply to try points "at random", and keep track of the best solutions. Then try new points "close" to the best solutions you have found so far. One version of this is called "simulated annealing".
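All three families of methods mentioned above (quasi-Newton, simplex subdivision, and annealing) have off-the-shelf implementations in SciPy. A minimal sketch, assuming SciPy and NumPy are installed; the Rosenbrock function and start point are a standard benchmark, not something from this thread.

```python
import numpy as np
from scipy.optimize import minimize, dual_annealing

def rosen(v):
    """Rosenbrock 'banana' function: a curved ridge with minimum 0 at (1, 1)."""
    x, y = v
    return (1 - x)**2 + 100 * (y - x**2)**2

x0 = np.array([-1.2, 1.0])

# Quasi-Newton (BFGS) and simplex (Nelder-Mead) local searches...
res_bfgs = minimize(rosen, x0, method="BFGS")
res_nm = minimize(rosen, x0, method="Nelder-Mead")
# ...and a stochastic global search over a bounded box.
res_sa = dual_annealing(rosen, bounds=[(-2.0, 2.0), (-2.0, 2.0)])

for res in (res_bfgs, res_nm, res_sa):
    print(res.x, res.fun)
```

All three follow the curved ridge that defeats one-variable-at-a-time sweeps. For noisy or simulation-based objectives, the derivative-free options (Nelder-Mead, annealing) are usually the safer choice.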

If you can define the function you want to optimize mathematically, all these algorithms are in systems like Matlab. If you need another software package like CST to find how "good" a particular design is, you can usually automate the process by making the optimization algorithm create the input file to run CST, run the model, and then extract the relevant data from the CST output file.
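As a sketch of that wrapping idea: the solver below is a stand-in Python function, since CST's actual batch interface and file formats are not given in the thread, and the patch dimensions and response are made-up numbers chosen only so the loop runs.

```python
import numpy as np
from scipy.optimize import minimize

def run_solver(width, depth):
    """Stand-in for one external field-solver run.

    In practice this function would write an input deck, launch the solver
    in batch mode (e.g. with subprocess), and parse |S11| at 2 GHz out of
    the output file.  Here a toy analytic response plays that role: the
    best match sits at width = 46, depth = 9 (made-up numbers).
    """
    return -40.0 * np.exp(-((width - 46.0)**2 / 4.0 + (depth - 9.0)**2))

def objective(params):
    width, depth = params
    return run_solver(width, depth)  # minimise S11 in dB (more negative = better match)

# Both parameters move together, so there is no width/depth ping-pong.
res = minimize(objective, x0=[44.0, 8.0], method="Nelder-Mead")
print(res.x, res.fun)
```

Because the optimizer adjusts width and insertion depth simultaneously, it sidesteps the sweep-width-then-sweep-depth cycle described in the opening post.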

Finding some course notes or a textbook on optimization is probably a better way to learn more than googling for the individual methods.
 
Thank you for your answer! After a bit of googling and asking around I have read about the algorithms you mentioned a few times. I believe they are called heuristic algorithms? I was actually told about these at the start of my project, but I assumed that I was supposed to use these as an alternative method to train the neural network. I have implemented the Genetic Algorithm to train the neural network, and I'm now trying to optimise the network topology so that it trains better. It seems strange using the Genetic Algorithm to optimise something that will be using the Genetic Algorithm to optimise something else, but I suppose it makes sense.
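A topology search with a Genetic Algorithm can be sketched in a few lines. The fitness function below is a hypothetical stand-in for "train the network and measure its accuracy" (it simply peaks at the 10-7-6 topology from the example in the opening post); the rest is a minimal GA with truncation selection, one-point crossover, and random mutation.

```python
import random

random.seed(0)

# Hypothetical fitness: stands in for training the network with a given
# hidden-layer topology and returning its accuracy.  Peaks at (10, 7, 6).
def fitness(layers):
    return -sum((n - t)**2 for n, t in zip(layers, (10, 7, 6)))

def evolve_topology(pop_size=20, generations=30, n_layers=3, max_neurons=20):
    # Each individual is a list of neuron counts, one per hidden layer.
    pop = [[random.randint(1, max_neurons) for _ in range(n_layers)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # keep the fitter half
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randint(1, n_layers - 1)
            child = a[:cut] + b[cut:]            # one-point crossover
            if random.random() < 0.3:            # occasional mutation
                i = random.randrange(n_layers)
                child[i] = random.randint(1, max_neurons)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve_topology()
print(best)
```

In the real version each fitness evaluation is itself a full GA training run of the network, which is exactly the nested "GA optimising a GA" situation described above; it is expensive but not contradictory, since the two GAs work on different search spaces.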
As for books on neural network topology, most resources I have seen suggest using only one layer, as it has been proved that a single-hidden-layer neural network is a universal approximator. This makes it easier to optimise the number of neurons, as you now only have one parameter to optimise. However, I had a meeting with a PhD student a few days ago who specialises in neural networks, and he told me to stick with 3 hidden layers. He says he guarantees 3 hidden layers is the best for the particular problem I'm working on. (which reminds me, I need to email him and ask why he said that...)
 
CraigH said:
(which reminds me, I need to email him and ask why he said that...)

One of my early (and cynical) mentors in industry explained it like this: When you start work in a new field, you don't know anything. So you ask advice from three people and you probably get three different answers, and you don't know which is right.

But after a while, you realize that most people have personal prejudices about the "best" way to do things.

Eventually, you get some prejudices of your own, and then it is obvious that people who share your prejudices are right and the others are wrong.
 
Other books do mention optimising the topology for networks with more than one layer, but I have yet to find one that gives an exact procedure for finding the optimum topology. They just mention rules of thumb and tips (such as using more neurons in the first few layers to create more features for subsequent layers to work with). That's what this question was all about: methods to find the best possible solution to a problem that can't be solved by optimising one variable at a time.

I also just wanted to know that I wasn't alone in having this problem. In class we are given projects such as "design a narrow band high gain 2GHz patch antenna", and we are taught all the science relating to the project, but never taught exactly how to go about designing it. It just frustrated me when I had to submit a design that might not have been optimal.
 
