Why is the l1 norm non-differentiable?

  • Thread starter: venki1130
  • Tags: Norm
SUMMARY

The l1 norm, defined as the sum of the absolute values of a vector's components, is non-differentiable at points where any component x_i equals zero, due to the kink in the absolute value function. This non-differentiability is particularly relevant in matrix calculus, where the l1 norm is often used in optimization problems. The distinction between the l1 norm (little ell one, on vectors) and the L1 norm (its integral version, on functions) matters, as they serve different purposes. The geometric interpretation of the l1 norm as a taxicab metric further clarifies its properties and its implications in optimization.

PREREQUISITES
  • Understanding of matrix calculus
  • Familiarity with l1 and L1 norms
  • Knowledge of optimization techniques
  • Basic concepts of differentiability
NEXT STEPS
  • Study the geometric interpretation of l1 and L1 norms
  • Learn about differentiability in the context of convex functions
  • Explore matrix calculus applications in optimization problems
  • Investigate the implications of non-differentiability in machine learning algorithms
USEFUL FOR

Mathematicians, data scientists, and machine learning practitioners who are working with optimization problems involving l1 norms and require a deeper understanding of their properties and applications in matrix calculus.

venki1130
Can anyone explain why the l1 norm is non-differentiable, in terms of matrix calculus?
 
Because the l1 norm is a sum of absolute values, and the absolute value function |x| has a kink at x = 0. It is not differentiable at the kink.
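For concreteness, the two one-sided derivatives of ##|x|## at ##x=0## disagree:
$$\lim_{h\to 0^+}\frac{|0+h|-|0|}{h}=+1,\qquad \lim_{h\to 0^-}\frac{|0+h|-|0|}{h}=-1.$$
Since the limits differ, ##|x|## has no derivative at 0, and the same kink appears in any sum of absolute values.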
 
venki1130 said:
Can anyone explain why the l1 norm is non-differentiable, in terms of matrix calculus?

I believe the previous reply may have answered your question, but I am personally not sure. When you say l1 norm, do you mean that the norm of ##(x_1,\dots,x_n)## is ##|x_1|+\cdots+|x_n|##? That is the first definition I found on Wikipedia. I believe this is also called the taxicab metric.

If I try to recall my education, ##\ell1## and ##L1## are different; the first one is called little ell one. The second, I believe, is the integral version, ##\|f\|_1=\int|f(x)|\,dx##. Compare to ##L2##, ##\|f\|_2=\left(\int|f(x)|^2\,dx\right)^{1/2}##. Little ell two is ##\|(x_1,\dots,x_n)\|_2=\sqrt{x_1^2+\cdots+x_n^2}##. This is sort of a distance as the crow flies, as opposed to how a taxi drives.
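For a concrete example, take ##x=(3,4)##: the taxicab norm is ##\|x\|_1=|3|+|4|=7##, while the crow-flies norm is ##\|x\|_2=\sqrt{3^2+4^2}=5##.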

I believe the ##\ell2##-norm has a familiar representation as a matrix, so that is what is confusing me. You asked for a matrix definition of ##\ell1##-norm, when I only know of one for ##\ell2##-norm.

Further, I could not tell you quickly how to use the matrix representation to show you the norm is not differentiable. I would guess that the reply above pointed you in the right direction. In general, you could show it is not differentiable along any ##x_i=0## face. It would be easiest to check for ##x_2=\cdots=x_n=0## and ##x_1## near 0. In other words, show that ##|x_1|## is not differentiable at zero. Simply compare the slopes from the left and right of 0.
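If it helps, here is a minimal numerical sketch of that left/right slope comparison, assuming the sum-of-absolute-values definition above (NumPy just for convenience):

```python
import numpy as np

def l1_norm(x):
    """Sum of absolute values of the components (taxicab / little ell one)."""
    return np.sum(np.abs(x))

# A point on the x_2 = ... = x_n = 0 face, with x_1 = 0.
x0 = np.zeros(3)
e1 = np.array([1.0, 0.0, 0.0])
h = 1e-6

slope_right = (l1_norm(x0 + h * e1) - l1_norm(x0)) / h     # ~ +1
slope_left  = (l1_norm(x0 - h * e1) - l1_norm(x0)) / (-h)  # ~ -1

print(slope_right, slope_left)  # the one-sided slopes disagree, so no derivative at x_1 = 0
```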
 
algebrat said:
You asked for a matrix definition of ##\ell1##-norm, when I only know of one for ##\ell2##-norm.

I think he means this: http://en.wikipedia.org/wiki/Matrix_norm
 
Thank you very much micromass, algebrat and Chogg for your responses. I am using the L1 norm in an optimization problem.

For example, for least-squares optimization with L2-norm regularization, the equation I am using is

$$\min_x \;\|Ax-b\|_2^2 + \lambda\|x\|_2^2$$

Calculating the first derivative (using matrix calculus) and setting it equal to zero gives

$$x=(A^TA+\lambda I)^{-1}A^Tb$$
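Writing out that matrix-calculus step explicitly, using the standard identities ##\nabla_x\|Ax-b\|_2^2=2A^T(Ax-b)## and ##\nabla_x\|x\|_2^2=2x##:
$$2A^T(Ax-b)+2\lambda x=0\;\Longrightarrow\;(A^TA+\lambda I)\,x=A^Tb\;\Longrightarrow\;x=(A^TA+\lambda I)^{-1}A^Tb.$$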

Similarly, for the L1 norm:

$$\min_x \;\|Ax-b\|_2^2 + \lambda\|x\|_1$$

But people always say it is non-differentiable. I understand the concept intuitively (the l1 unit ball has sharp corners, so the norm has kinks where there is no derivative), but I want to learn it step by step using matrix derivatives.
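As far as I can get with matrix derivatives: away from zero coordinates the penalty is smooth,
$$\frac{\partial}{\partial x_i}\,\lambda\|x\|_1=\lambda\,\operatorname{sign}(x_i)\qquad(x_i\neq 0),$$
but at ##x_i=0## the one-sided slopes are ##-\lambda## from the left and ##+\lambda## from the right, so no gradient exists there, and I presumably cannot set a single derivative to zero the way I did above; I gather one has to work with the subgradient ##\lambda\,\partial\|x\|_1## instead.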

Again I thank you very much for your help.

Venki
 
