Why is L1 norm harder to optimize than L2 norm?

  • Context: Undergrad 
  • Thread starter: pamparana
  • Tags: L2 Norm
SUMMARY

The discussion centers on why optimizing the L1 norm is considered harder than the L2 norm. The squared L2 norm has a derivative everywhere, so its minimization typically admits a closed-form solution. The L1 norm, by contrast, is non-differentiable at zero; although it is convex and continuous, that single kink complicates gradient-based optimization, in particular because minimizers often sit exactly at the kink. A reply notes that when the underlying problem is a linear program, the L1 norm yields a single linear program, while the L2 norm leads to a sequence of linear programs that traces out the efficient frontier.

PREREQUISITES
  • Understanding of convex optimization principles
  • Familiarity with L1 and L2 norms in mathematical optimization
  • Knowledge of linear programming concepts
  • Basic calculus, particularly derivatives and their significance in optimization
NEXT STEPS
  • Study the properties and applications of L1 and L2 norms in optimization
  • Learn about convex optimization techniques and their implementations
  • Explore linear programming methods and their relation to norm minimization
  • Investigate gradient descent algorithms and their adaptations for non-differentiable functions
USEFUL FOR

Mathematicians, data scientists, and optimization engineers seeking to deepen their understanding of norm optimization challenges and techniques.
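The closed-form/no-closed-form contrast described above can be sketched in code. This is a minimal illustration, not from the thread itself: the data, the weight `lam`, and the iteration count are arbitrary choices. Ridge regression (L2 penalty) has a one-line solution, while the L1-penalised problem needs an iterative scheme such as proximal gradient descent (ISTA) with soft-thresholding.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))   # toy design matrix (illustrative data)
b = rng.normal(size=20)
lam = 0.1                      # regularisation weight (arbitrary choice)

# L2 (ridge): minimise 0.5*||Ax - b||^2 + 0.5*lam*||x||^2.
# The objective is differentiable everywhere, so setting the gradient
# to zero gives the closed form x = (A^T A + lam*I)^{-1} A^T b.
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ b)

# L1 (lasso): minimise 0.5*||Ax - b||^2 + lam*||x||_1.
# No closed form exists; a standard iterative scheme is proximal
# gradient descent (ISTA), whose proximal step is soft-thresholding.
def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

L = np.linalg.norm(A.T @ A, 2)     # Lipschitz constant of the smooth gradient
x = np.zeros(5)
for _ in range(500):
    grad = A.T @ (A @ x - b)       # gradient of the smooth part only
    x = soft_threshold(x - grad / L, lam / L)
```

Note that the soft-thresholding step can set coordinates of `x` exactly to zero, which is precisely where the L1 norm is non-differentiable.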

pamparana
Hi all,

I have a basic optimisation question. I keep reading that the L2 norm is easier to optimise than the L1 norm. I can see why the L2 norm is easy: it has a derivative everywhere, so minimisation admits a closed-form solution.

For the L1 norm, there is a derivative everywhere except at 0, right? Why is this such a problem for optimisation? I mean, there is a valid gradient everywhere else.

I am really having trouble convincing myself why L1 norm minimisation is so much harder than L2 norm minimisation. The L1 norm is convex and continuous as well, and has only a single point without a derivative.

Any explanation would be greatly appreciated!

Thanks,

Luca
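The worry in the question can be made concrete with a toy one-dimensional demo (the step size and starting point are arbitrary choices for illustration): a fixed-step subgradient method on |x| keeps overshooting the kink at 0 and never settles, while the same fixed step on x² converges, because there the gradient itself shrinks as the minimum is approached.

```python
# Minimise f(x) = |x| with a fixed-step (sub)gradient method.
# The subgradient sign(x) has magnitude 1 everywhere except at 0,
# so with a fixed step of 0.1 the iterate overshoots the minimiser
# and ends up bouncing between roughly +0.05 and -0.05.
step = 0.1
x = 0.35
for _ in range(50):
    x -= step * (1 if x > 0 else -1)

# Minimise g(x) = x^2 with the same fixed step: the gradient 2x
# shrinks near the minimum, so the iterate converges geometrically.
y = 0.35
for _ in range(50):
    y -= step * 2 * y
```

This is why practical L1 methods replace the plain gradient step with something that handles the kink explicitly (diminishing step sizes, subgradient averaging, or proximal/soft-thresholding steps).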
 
If the problem is a linear program, then with the L1 norm it remains a single linear program, while the L2 norm gives a sequence of linear programs that traces out the efficient frontier.
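To make the first point above concrete, here is a sketch (with made-up data) of minimising ||Ax − b||₁ as a single linear program, using the standard trick of auxiliary variables t with −t ≤ Ax − b ≤ t, solved with SciPy's `linprog`:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m, n = 12, 3
A = rng.normal(size=(m, n))   # toy overdetermined system (illustrative)
b = rng.normal(size=m)

# min_x ||Ax - b||_1 rewritten as a single LP over z = [x, t]:
#   minimise   sum(t)
#   subject to   A x - b <= t   and   -(A x - b) <= t
c = np.concatenate([np.zeros(n), np.ones(m)])
A_ub = np.block([[A, -np.eye(m)],     #  A x - t <=  b
                 [-A, -np.eye(m)]])   # -A x - t <= -b
b_ub = np.concatenate([b, -b])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
x_l1 = res.x[:n]
```

At the optimum each t_i equals |(Ax − b)_i|, so the LP objective is exactly the L1 residual norm.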
 
