Why l1 Norm is non-differentiable?

  • Context: Graduate 
  • Thread starter Thread starter venki1130
  • Start date Start date
  • Tags Tags
    Norm
Click For Summary

Discussion Overview

The discussion centers on the non-differentiability of the l1 norm in the context of matrix calculus, exploring its implications for optimization problems, particularly in relation to least squares optimization.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants explain that the l1 norm is based on the absolute value function, which has a kink at zero, making it non-differentiable at that point.
  • There is a discussion about the distinction between l1 and L1 norms, with some participants noting that l1 refers to the sum of absolute values while L1 may refer to an integral version.
  • One participant expresses uncertainty about how to represent the l1 norm in matrix form and how to demonstrate its non-differentiability using matrix calculus.
  • A participant mentions using the l1 norm in an optimization problem and seeks a step-by-step understanding of its non-differentiability in that context.
  • Another participant suggests that the non-differentiability can be shown along any face where one of the variables equals zero, specifically noting the case of checking slopes around zero.

Areas of Agreement / Disagreement

Participants generally agree on the non-differentiability of the l1 norm at certain points, particularly at zero, but there is no consensus on the best way to demonstrate this using matrix calculus. Multiple perspectives on the definitions and implications of l1 and L1 norms are present.

Contextual Notes

Some participants express uncertainty about the definitions and representations of the l1 and L1 norms, indicating potential limitations in understanding their mathematical properties and applications.

venki1130
Messages
3
Reaction score
0
Can anyone explain Why l1 Norm is non-differentiable in terms of matrix calculus ?
 
Physics news on Phys.org
Because L1 norm is based on the absolute value of the difference, and absolute value |x| has a kink at x=0. It is not differentiable at the kink.
 
venki1130 said:
Can anyone explain Why l1 Norm is non-differentiable in terms of matrix calculus ?

I believe venki1130 may have answered your question, but I am personally not sure. When you say l1 norm, do you mean norm of ##(x_1,\dots,x_n)## is ##|x_1|+\cdots+|x_n|##? That is the first definition I found on wikipedia. I believe this is also called the taxicab metric.

If I try to recall my education, ##\ell1## and ##L1## are different, the first one is called little ell one. The second I believe is the integral version, ##|f(x)|_1=\int|f(x)|dx##. Compare to ##L2##, ##|f(x)|_2=(\int|f(x)|^2dx)^{1/2}##. Little ell two, is ##|(x_1,\dots,x_n)|_2=\sqrt{x_1^1+\cdots+x_n^2}##. This is sort of a distance as the crow flies, as opposed to how a taxi drives.

I believe the ##\ell2##-norm has a familiar representation as a matrix, so that is what is confusing me. You asked for a matrix definition of ##\ell1##-norm, when I only know of one for ##\ell2##-norm.

Further, I could not tell you quickly how to use the matrix representation to show you the norm is not differentiable. I would guess that venki1130 pointed you in the right direction. In general, you could show it is not differentiable along any ##x_i=0## face. It would be easiest to check for ##x_2=\cdots=x_n=0##, and ##x_1## near 0. In other words, show ##|x_1|## is not differentiable near zero. Simply care the slopes from the left and right of 0.
 
algebrat said:
I believe venki1130 may have answered your question, but I am personally not sure. When you say l1 norm, do you mean norm of ##(x_1,\dots,x_n)## is ##|x_1|+\cdots+|x_n|##? That is the first definition I found on wikipedia. I believe this is also called the taxicab metric.

If I try to recall my education, ##\ell1## and ##L1## are different, the first one is called little ell one. The second I believe is the integral version, ##|f(x)|_1=\int|f(x)|dx##. Compare to ##L2##, ##|f(x)|_2=(\int|f(x)|^2dx)^{1/2}##. Little ell two, is ##|(x_1,\dots,x_n)|_2=\sqrt{x_1^1+\cdots+x_n^2}##. This is sort of a distance as the crow flies, as opposed to how a taxi drives.

I believe the ##\ell2##-norm has a familiar representation as a matrix, so that is what is confusing me. You asked for a matrix definition of ##\ell1##-norm, when I only know of one for ##\ell2##-norm.

Further, I could not tell you quickly how to use the matrix representation to show you the norm is not differentiable. I would guess that venki1130 pointed you in the right direction. In general, you could show it is not differentiable along any ##x_i=0## face. It would be easiest to check for ##x_2=\cdots=x_n=0##, and ##x_1## near 0. In other words, show ##|x_1|## is not differentiable near zero. Simply care the slopes from the left and right of 0.

I think he means this: http://en.wikipedia.org/wiki/Matrix_norm
 
Thank you very much micromass, algebrat and Chogg for your response. I and using L1 norm in the optimization problem.

for example: For least squares optimization using L2 norm for regularization the equation I am using is

min ||Ax-b||22 + λ||x||22

Calculating first derivative(using matrix calculus) and equating it to zero results

x= At(AtA+λI)-1b

similarly for L1 norm

min ||Ax-b||22 + λ||x||1

But, People always say it is non differentiable. In fact, I understand the concept (intuitively, the unit circle in l1 has the sharp corner where the function doesn't change so there is no derivative for it) but I want to learn step by step using matrix derivatives.

Again I thank you very much for your help.

Venki
 

Similar threads

  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 9 ·
Replies
9
Views
3K
Replies
2
Views
3K
  • · Replies 5 ·
Replies
5
Views
2K
  • · Replies 49 ·
2
Replies
49
Views
8K
  • · Replies 2 ·
Replies
2
Views
7K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 15 ·
Replies
15
Views
2K
  • · Replies 7 ·
Replies
7
Views
6K