How to optimise something when multiple parameters change the output

In summary, the conversation discusses the dilemma of optimizing multiple interacting parameters in engineering problems, which makes it difficult to find the best overall solution. The original poster gives three reasons for the dilemma and asks whether there is a standard method or solution, using the design of a neural network and of a patch antenna as examples. The replies explain how optimization is approached in large-scale problems and mention several algorithms, including BFGS, DFP, Nelder-Mead, and simulated annealing, as well as heuristic approaches. The poster also mentions already using a Genetic Algorithm to train the neural network, which raises the question of using the same algorithm to optimize its topology.
  • #1
CraigH
This dilemma seems to occur all the time in so many different engineering problems. It seems impossible to optimise something with multiple parameters, for three reasons:

  • There are too many possible combinations of these parameters to be able to simulate them all
  • You optimise one parameter at a time, but then you don't know if your final result is the best possible one. For example: you start with a neural network with 5 layers and 4 neurons in each layer. You perform a sweep on the number of layers, plotting the accuracy of the network against the number of layers, and find that the optimum is 3 layers. You then optimise the number of neurons in the first layer and find that the optimum is 10 neurons, then optimise the second layer and find that 7 is best, and then the third and find that 6 is best. However, this might not be the best overall solution: the accuracy might be better if you start with 3 neurons in the first layer, which is not the single-parameter optimum, and then optimise the second and third layers, ending up with different neuron counts but a much better accuracy than the first method gives.
  • You optimise parameter 1 that governs property X, and then you optimise parameter 2 that governs property Y, but then this has changed property X, so you go back and optimise the parameter 1, but then this changes property Y.
This seems like a very fundamental problem when designing any system in any area of engineering, and I thought that there may be a standard method of approaching this problem, or at least a few known methods that do a pretty good job. If there is a solution to this dilemma, can somebody please tell me?

Additional Details

The neural network example is the problem I am currently having, but I'll give another example of this problem I have had in the past. I was trying to design a patch antenna with a resonant frequency of 2 GHz. The resonant frequency is mainly dependent on the width of the patch, and the gain of the antenna was mainly dependent on the insertion depth. In CST, I performed a sweep on the width and picked the value that gave the lowest S-parameter at 2 GHz. I then performed a sweep on the insertion depth and picked the value that reduced the S-parameter to the lowest value (I decided that -40 dB was acceptable). But this then changed the frequency at which this gain happens, so I did a sweep on the width again and picked the value that gave the lowest S-parameter, but now the S-parameter is too high again, so I optimise the insertion depth... and so on.
 
  • #2
I don't know much about neural networks, but I would guess selecting a good topology for a network is a practical problem that would be covered in books on the subject.

For "large scale" optimization problems, the objective is usually to get a solution that is "good enough" rather than to find the global optimum. There can be many "local" optima with not much to choose between them.

If you have n variables to optimize, you can consider each set of n values as a geometrical point in n-dimensional space. Visualizing how algorithms work is easy when n = 2 (e.g. draw something that looks like a contour map, and the optimum is the highest or lowest point on the map). When n > 3, drawing pictures is hard, but the math works the same way for any value of n.

In general "optimizing one variable at a time" is very inefficient for the reason you discovered: the variables interact with each other. When n = 2, this is like trying to get to the top of a mountain by only moving north/south or east/west. If the mountain is a long "ridge" running from northeast to southwest, for example, that is obviously a bad plan.
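As a minimal sketch of that ridge problem (the quadratic test function and all the numbers are invented purely for illustration), here is what one-variable-at-a-time search does when the valley runs diagonally:

```python
# Minimal sketch (invented test function) of one-variable-at-a-time search
# on a diagonal "ridge": steep across the valley, shallow along it, so
# axis-aligned moves make very little progress per sweep.
from scipy.optimize import minimize_scalar

def f(x, y):
    return (x + y) ** 2 + 0.001 * (x - y) ** 2   # valley runs along x = -y

x, y = 5.0, -5.0                                  # start on the valley floor
print("start:          ", f(x, y))
for sweep in range(20):
    x = minimize_scalar(lambda x_: f(x_, y)).x    # best x with y held fixed
    y = minimize_scalar(lambda y_: f(x, y_)).x    # best y with x held fixed
print("after 20 sweeps:", f(x, y))                # still close to the start
```

On this particular ridge each full sweep improves the objective by less than a percent, which is exactly the northeast-to-southwest ridge situation described above; a method that chooses its own search directions (next paragraph) gets to the bottom almost immediately.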

Better methods attempt to find "search directions" which are linear combinations of the variables, such that the interaction between the directions is small (and ideally zero).

There are several well-known algorithms for this. My personal favorite is BFGS (named after the four people who invented it). Another popular one is DFP (the "F" is the same person in both).

A different approach is to try to find a region that contains the optimum, and then subdivide it into smaller regions. One version of this is the Nelder-Mead algorithm.

Yet another way is simply to try points "at random", and keep track of the best solutions. Then try new points "close" to the best solutions you have found so far. One version of this is called "simulated annealing".
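As a rough sketch of how these off-the-shelf routines are called (shown here in Python/SciPy purely for illustration; the Rosenbrock test function stands in for whatever measure of "goodness" your own problem produces):

```python
# Rough sketch: the same objective handed to a quasi-Newton method (BFGS),
# a simplex method (Nelder-Mead), and a simulated-annealing variant.
# The Rosenbrock function is only a stand-in for a real design objective.
import numpy as np
from scipy.optimize import minimize, dual_annealing, rosen

x0 = np.array([-1.2, 1.0, 0.8])                  # arbitrary starting point

bfgs = minimize(rosen, x0, method="BFGS")        # gradient-based quasi-Newton
nm = minimize(rosen, x0, method="Nelder-Mead")   # derivative-free simplex
sa = dual_annealing(rosen, bounds=[(-2.0, 2.0)] * len(x0))  # stochastic global

for name, res in [("BFGS", bfgs), ("Nelder-Mead", nm), ("annealing", sa)]:
    print(f"{name:12s} f = {res.fun:.3e} at x = {np.round(res.x, 3)}")
```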

If you can define the function you want to optimize mathematically, all these algorithms are in systems like Matlab. If you need another software package like CST to find how "good" a particular design is, you can usually automate the process by making the optimization algorithm create the input file to run CST, run the model, and then extract the relevant data from the CST output file.
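A hedged sketch of that wrapper pattern follows; the command line, file names, and output parsing are invented placeholders (not real CST syntax), but the write-run-extract loop is the general idea:

```python
# Hedged sketch of wrapping an external simulator as an objective function.
# "my_solver", the file names, and the output format are placeholders only;
# a real CST (or other tool) workflow would use its own batch interface.
import subprocess
from scipy.optimize import minimize

def simulate(params):
    width, depth = params
    # 1. Write the candidate design to an input file (tool-specific format).
    with open("design.txt", "w") as fh:
        fh.write(f"patch_width = {width}\ninsertion_depth = {depth}\n")
    # 2. Run the simulator in batch mode (hypothetical command).
    subprocess.run(["my_solver", "--input", "design.txt",
                    "--output", "result.txt"], check=True)
    # 3. Read back the single number to minimise, e.g. S11 at 2 GHz.
    with open("result.txt") as fh:
        return float(fh.read().strip())

# Let the optimiser vary both parameters together rather than one at a time;
# Nelder-Mead is a reasonable default when no gradients are available.
result = minimize(simulate, x0=[45.0, 10.0], method="Nelder-Mead")
```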

Finding some course notes or a textbook on optimization is probably a better way to learn more than googling for the individual methods.
 
  • #3
Thank you for your answer! After a bit of googling and asking around I have read about the algorithms you mentioned a few times. I believe they are called heuristic algorithms? I was actually told about these at the start of my project, but I assumed that I was supposed to use these as an alternative method to train the neural network. I have implemented the Genetic Algorithm to train the neural network, and I'm now trying to optimise the network topology so that it trains better. It seems strange using the Genetic Algorithm to optimise something that will be using the Genetic Algorithm to optimise something else, but I suppose it makes sense.
As for books on neural network topology, most resources I have seen suggest using only one hidden layer, as it has been proved that a neural network with a single hidden layer is a universal approximator. This makes it easier to optimise the number of neurons, as you now only have one parameter to optimise. However, I had a meeting a few days ago with a PhD student who specialises in neural networks, and he told me to stick with 3 hidden layers. He says he guarantees 3 hidden layers is the best for the particular problem I'm working on. (which reminds me, I need to email him and ask why he said that...)
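For what it's worth, here is a bare-bones sketch of the outer genetic loop over topologies; train_and_score is only a dummy stand-in for a real training-plus-validation run (e.g. the inner GA training), and the population size, mutation rate, and ranges are arbitrary illustrative values:

```python
# Bare-bones sketch of a genetic algorithm searching over network topologies.
# train_and_score() is a dummy stand-in for a real training + validation run;
# all GA settings below are arbitrary illustrative values.
import random

LAYERS, MIN_N, MAX_N = 3, 2, 20        # three hidden layers, 2-20 neurons each

def train_and_score(topology):
    # Placeholder so the sketch runs on its own: pretends the best topology
    # is [10, 7, 6]. Replace with: build network, train it, return accuracy.
    return -sum((n - t) ** 2 for n, t in zip(topology, [10, 7, 6]))

def mutate(topology, rate=0.3):
    return [max(MIN_N, min(MAX_N, n + random.choice([-2, -1, 1, 2])))
            if random.random() < rate else n
            for n in topology]

def crossover(a, b):
    return [random.choice(pair) for pair in zip(a, b)]

population = [[random.randint(MIN_N, MAX_N) for _ in range(LAYERS)]
              for _ in range(12)]
for generation in range(30):
    ranked = sorted(population, key=train_and_score, reverse=True)
    parents = ranked[:4]                               # keep the fittest few
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(8)]
    population = parents + children

print("best topology found:", max(population, key=train_and_score))
```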
 
  • #4
CraigH said:
(which reminds me, I need to email him and ask why he said that...)

One of my early (and cynical) mentors in industry explained it like this: When you start work in a new field, you don't know anything. So you ask advice from three people and you probably get three different answers, and you don't know which is right.

But after a while, you realize that most people have personal prejudices about the "best" way to do things.

Eventually, you get some prejudices of your own, and then it is obvious that people who share your prejudices are right and the others are wrong :biggrin:
 
  • #5
*Other books do mention optimising the topology for networks with more than one layer, but I have yet to find one that gives an exact procedure for finding the optimum topology. They just mention rules of thumb and tips (such as using more neurons in the first few layers to create more features for subsequent layers to work with). That's what this question was all about: methods to find the best possible solution to a problem that can't be solved by optimising one variable at a time.

I also just wanted to know that I wasn't alone in having this problem. In class we are given projects such as "design a narrow-band, high-gain 2 GHz patch antenna", and we are taught all the science relating to the project, but never exactly how to go about designing it. It just frustrated me when I had to submit a design that might not have been the optimal one.
 

1. How do I identify the most important parameters to optimize?

The first step in optimizing something with multiple parameters is to identify which parameters have the most significant impact on the output. This can be done through sensitivity analysis or by using statistical methods such as regression analysis.
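As a minimal, hedged sketch of one-at-a-time sensitivity screening (the model function, parameter names, and numbers are invented placeholders):

```python
# Minimal sketch of one-at-a-time sensitivity screening: perturb each
# parameter by a small amount and rank them by how much the output moves.
# The model, parameter names, and values are invented placeholders.
def model(width, depth, thickness):
    return 2.0 * width - 0.5 * depth + 0.01 * thickness   # dummy response

baseline = {"width": 45.0, "depth": 10.0, "thickness": 1.6}
base_out = model(**baseline)

sensitivities = {}
for name, value in baseline.items():
    perturbed = dict(baseline, **{name: value * 1.01})     # +1% in one parameter
    sensitivities[name] = (model(**perturbed) - base_out) / (0.01 * value)

# Parameters with the largest |sensitivity| are the ones worth optimising first.
print(sorted(sensitivities.items(), key=lambda kv: abs(kv[1]), reverse=True))
```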

2. How do I determine the optimal values for each parameter?

Once the important parameters have been identified, the next step is to determine the optimal values for each of them. This can be achieved through techniques such as experimentation, simulation, or optimization algorithms.

3. What is the trade-off between optimizing multiple parameters?

In many cases, optimizing one parameter may lead to a decrease in the output of another parameter. This is known as a trade-off. It is important to consider these trade-offs when optimizing multiple parameters to ensure the overall optimization is beneficial.

4. How do I handle conflicting objectives when optimizing multiple parameters?

In some cases, there may be conflicting objectives when optimizing multiple parameters. This means that improving one parameter may have a negative impact on another. To handle this, it is important to establish clear priorities and trade-offs between the different objectives.

5. How do I validate the optimized values for each parameter?

After determining the optimal values for each parameter, it is important to validate these values to ensure they are accurate. This can be done through testing, sensitivity analysis, or by comparing the optimized values to existing data or literature values.
