Multilayer perceptron questions

  • Thread starter: SotirisD
SUMMARY

This discussion focuses on the configuration of multilayer perceptrons (MLPs) in Matlab, specifically using adaptive learning with momentum backpropagation (traingdx). The user reports that lower initial learning rates yield better accuracy, while higher rates lead to a dramatic drop in model performance. Additionally, the choice of output activation functions significantly impacts model results, with combinations like {'tansig', 'tansig', 'purelin'} performing well, while {'tansig', 'tansig', 'logsig'} fails, likely because logsig's (0, 1) output range cannot reproduce negative target values. Insights from Sutskever's guest post on learning rates are also referenced.

PREREQUISITES
  • Understanding of multilayer perceptrons (MLPs)
  • Familiarity with Matlab and its neural network toolbox
  • Knowledge of adaptive learning techniques, specifically momentum backpropagation (traingdx)
  • Comprehension of activation functions, particularly {'tansig', 'purelin', 'logsig'}
NEXT STEPS
  • Research the effects of different learning rates on MLP performance
  • Explore various output activation functions and their impact on neural network accuracy
  • Study Sutskever's insights on learning rates in deep learning
  • Experiment with different configurations of multilayer perceptrons in Matlab
USEFUL FOR

Machine learning practitioners, data scientists, and researchers experimenting with neural networks, particularly those using Matlab for image classification tasks.

SotirisD
I am experimenting with different configurations of multilayer perceptrons in Matlab. My training data are extracted from images that I want to classify.

- I am currently using adaptive learning with momentum backpropagation (traingdx) and trying different initial learning rates. For low values I get pretty good results, but when the initial rate gets bigger the accuracy of my model drops dramatically. How can this be explained?
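The accuracy drop at higher learning rates is the classic overshoot behavior of gradient descent: once the step size exceeds a curvature-dependent stability threshold, each update jumps past the minimum and the loss grows instead of shrinking, so training effectively diverges. A minimal sketch in plain Python (NumPy-free, not the Matlab toolbox) on a toy 1-D quadratic loss illustrates this; the 2/a stability threshold below is a property of this toy quadratic, not of traingdx specifically:

```python
def gradient_descent(lr, steps=50, w0=5.0, a=1.0):
    """Minimize f(w) = 0.5 * a * w**2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        w -= lr * a * w  # gradient of f is a*w
    return w

# Each update multiplies w by (1 - lr*a), so iterates contract
# toward the minimum only when lr < 2/a.
small = gradient_descent(lr=0.1)  # |w| shrinks each step: converges
large = gradient_descent(lr=2.5)  # |w| grows each step: diverges
print(abs(small), abs(large))
```

The same mechanism applies in higher dimensions, where the sharpest curvature direction of the loss surface sets the threshold; traingdx's adaptive schedule can recover from mild overshoot, but a large initial rate can still wreck the weights before the rate is reduced.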

- Another question I have is how different output activation functions affect the model. Are there heuristics for this, or is it just trial and error? For example, I get good results with {'tansig', 'tansig', 'purelin'} and {'tansig', 'tansig', 'tansig'}, but {'tansig', 'tansig', 'logsig'} fails; I suspect it has to do with negative values being squashed toward zero by logsig.
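That suspicion is consistent with the functions' output ranges: logsig (the logistic sigmoid) outputs values in (0, 1), so an output layer using it can never match negative targets, while tansig covers (-1, 1) and purelin the whole real line. The usual heuristic is therefore to match the output activation's range to the target range. A quick check in plain Python, assuming the standard definitions of the three Matlab transfer functions:

```python
import math

def logsig(x):
    """Logistic sigmoid: output range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Hyperbolic tangent: output range (-1, 1)."""
    return math.tanh(x)

def purelin(x):
    """Identity (linear): unbounded output."""
    return x

# A negative target (e.g. -1) is reachable by tansig and purelin,
# but logsig can only approach 0 from above for negative inputs.
for f in (logsig, tansig, purelin):
    outs = [round(f(x), 3) for x in (-5.0, 0.0, 5.0)]
    print(f.__name__, outs)
```

So if the training targets are encoded in [-1, 1] (the natural encoding for a tansig network), a logsig output layer is asked to produce values outside its range, and training stalls with large irreducible error on the negative-target examples.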
 
There is a discussion of learning rates in Sutskever's guest post on Yisong Yue's blog: http://yyue.blogspot.sg/2015/01/a-brief-overview-of-deep-learning.html
 
