venki1130
Can anyone explain why the l1 norm is non-differentiable, in terms of matrix calculus?
The l1 norm of a vector, defined as the sum of the absolute values ##|x_i|## of its components, is non-differentiable at any point where some component ##x_i## equals zero, because the absolute value function has a kink there. This non-differentiability is particularly relevant in matrix calculus, where the l1 norm is often used in optimization problems. The l1 norm (little ell one, defined on vectors) is distinct from the L1 norm (the integral version, defined on functions). The geometric interpretation of the l1 norm as the taxicab metric helps explain where the kinks come from.
PREREQUISITES
Mathematicians, data scientists, and machine learning practitioners who are working with optimization problems involving l1 norms and require a deeper understanding of their properties and applications in matrix calculus.
venki1130 said: Can anyone explain why the l1 norm is non-differentiable, in terms of matrix calculus?
algebrat said: I believe venki1130 may have answered your question, but I am personally not sure. When you say l1 norm, do you mean that the norm of ##(x_1,\dots,x_n)## is ##|x_1|+\cdots+|x_n|##? That is the first definition I found on Wikipedia. I believe this is also called the taxicab metric.
If I try to recall my education, ##\ell^1## and ##L^1## are different. The first one is called little ell one. The second, I believe, is the integral version, ##\|f\|_1=\int|f(x)|\,dx##. Compare to ##L^2##, where ##\|f\|_2=\left(\int|f(x)|^2\,dx\right)^{1/2}##. Little ell two is ##\|(x_1,\dots,x_n)\|_2=\sqrt{x_1^2+\cdots+x_n^2}##. This is sort of a distance as the crow flies, as opposed to how a taxi drives.
I believe the ##\ell^2##-norm has a familiar representation as a matrix, so that is what is confusing me. You asked for a matrix definition of the ##\ell^1##-norm, when I only know of one for the ##\ell^2##-norm.
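To make the taxicab versus crow-flies comparison concrete, here is a minimal sketch in plain Python. The function names `l1_norm` and `l2_norm` are made up for illustration, and reading the "matrix representation" of the little ell two norm as ##\sqrt{x^T x}## is my assumption, not something stated in the thread.

```python
import math

def l1_norm(x):
    # Taxicab length: sum of the absolute values of the components.
    return sum(abs(xi) for xi in x)

def l2_norm(x):
    # Crow-flies length, written as sqrt(x^T x); this reading of the
    # "matrix representation" is an assumption made for this sketch.
    return math.sqrt(sum(xi * xi for xi in x))

x = [3.0, -4.0]
print(l1_norm(x))  # 7.0: three blocks east plus four blocks south
print(l2_norm(x))  # 5.0: the straight-line distance
```

The little ell two norm is smooth away from the origin, while the little ell one norm inherits a kink from each ##|x_i|##.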
Further, I could not tell you quickly how to use the matrix representation to show you the norm is not differentiable. I would guess that venki1130 pointed you in the right direction. In general, you could show it is not differentiable along any ##x_i=0## face. It would be easiest to check for ##x_2=\cdots=x_n=0##, and ##x_1## near 0. In other words, show ##|x_1|## is not differentiable at zero. Simply compare the slopes from the left and right of 0: the slope is ##-1## from the left and ##+1## from the right, so the two one-sided derivatives disagree and no derivative exists there.
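The left and right slope comparison can also be checked numerically. This is only a sketch: `one_sided_slope` is a hypothetical helper, the step size `1e-6` is an arbitrary choice, and a finite difference illustrates the kink rather than proving anything.

```python
def l1_norm(x):
    # Sum of the absolute values of the components.
    return sum(abs(xi) for xi in x)

def one_sided_slope(f, x, i, h):
    # Difference quotient of f at x along coordinate i with signed step h.
    # h > 0 approaches from the right, h < 0 from the left.
    shifted = list(x)
    shifted[i] += h
    return (f(shifted) - f(x)) / h

# On the x_1 = 0 face (here with x_2 = x_3 = 0 as well), the slopes disagree.
x0 = [0.0, 0.0, 0.0]
right_at_kink = one_sided_slope(l1_norm, x0, 0, 1e-6)
left_at_kink = one_sided_slope(l1_norm, x0, 0, -1e-6)
print(right_at_kink, left_at_kink)  # 1.0 -1.0

# Away from every x_i = 0 face, both slopes agree (roughly the sign of x_1).
x1 = [2.0, 1.0, -3.0]
right_smooth = one_sided_slope(l1_norm, x1, 0, 1e-6)
left_smooth = one_sided_slope(l1_norm, x1, 0, -1e-6)
print(right_smooth, left_smooth)
```

At `x0` the two quotients come out as `1.0` and `-1.0`, which is the mismatch described above. At `x1` both are close to `1`, so the norm is differentiable there.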