# Help with this notation -- some sort of norm?

1. Mar 4, 2017

I need help understanding this notation. What does it mean?

Is it the square of the 2-norm?


Thanks

2. Mar 4, 2017

### StoneTemplePython

Yes. Say $\mathbf b = \mathbf{y - Xw}$. I assume that $\mathbf b$ is real valued for this example. Say you want to minimize the Euclidean length of $\mathbf b$. You write that as $\min \big(\mathbf b^T \mathbf b\big)^\frac{1}{2}$. But square roots are unpleasant to work with, so you then recognize that if you minimize the squared Euclidean length of $\mathbf b$, that must also minimize the Euclidean length of $\mathbf b$. (Why must this be the case?) Hence you recover the problem above, which reads as $\min \big(\mathbf b^T \mathbf b\big)$.
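A quick numerical check of that claim, as a rough sketch only: the data `x`, `y` and the grid of candidate weights are made up for illustration, and $\mathbf w$ is reduced to a single scalar `w` to keep things small. Because the square root is increasing on non-negative numbers, both objectives should pick the same `w`.

```python
import math

# Hypothetical toy data (not from the thread): fit y ≈ x * w with one scalar weight w.
x = [1.0, 2.0, 3.0]
y = [2.1, 3.9, 6.2]

def residual(w):
    """b = y - Xw for a scalar w."""
    return [yi - xi * w for xi, yi in zip(x, y)]

def squared_norm(b):
    """b^T b, the squared Euclidean length."""
    return sum(bi * bi for bi in b)

def norm(b):
    """(b^T b)^(1/2), the Euclidean length."""
    return math.sqrt(squared_norm(b))

# Candidate weights on a coarse grid from 0.00 to 4.00 in steps of 0.01.
candidates = [i / 100 for i in range(401)]

# Minimize each objective by brute force over the grid.
best_by_norm = min(candidates, key=lambda w: norm(residual(w)))
best_by_squared_norm = min(candidates, key=lambda w: squared_norm(residual(w)))

print(best_by_norm, best_by_squared_norm)
```

Both searches land on the same grid point, which is the point of dropping the square root: the minimizer is unchanged, only the minimum value differs.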

What you have there is the setup for the normal equations and ordinary least squares estimation. There are two approaches to deriving the solution for an over-determined system of equations: one uses calculus and the other uses orthogonality. Both approaches are worth understanding and thinking through.
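For reference, here is a short sketch of both routes. It assumes real-valued data and that $\mathbf X^T \mathbf X$ is invertible, neither of which is stated in the original problem.

Calculus: write the objective as $f(\mathbf w) = (\mathbf y - \mathbf{Xw})^T(\mathbf y - \mathbf{Xw})$ and set its gradient to zero:

$$\nabla_{\mathbf w} f = -2\,\mathbf X^T(\mathbf y - \mathbf{Xw}) = \mathbf 0 \quad\Longrightarrow\quad \mathbf X^T \mathbf X \mathbf w = \mathbf X^T \mathbf y \quad\Longrightarrow\quad \mathbf w = \big(\mathbf X^T \mathbf X\big)^{-1}\mathbf X^T \mathbf y$$

Orthogonality: $\mathbf{Xw}$ is closest to $\mathbf y$ when the residual $\mathbf b = \mathbf y - \mathbf{Xw}$ is orthogonal to every column of $\mathbf X$:

$$\mathbf X^T(\mathbf y - \mathbf{Xw}) = \mathbf 0$$

This is the same system, $\mathbf X^T \mathbf X \mathbf w = \mathbf X^T \mathbf y$, reached without taking a derivative.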

3. Mar 5, 2017

I am learning machine learning by myself. I have a BSEE, but I keep encountering symbols and notation that I don't understand.

For example, what does the "1 with a vertical line through its back" mean?

I know that E stands for expected value, but that is as far as I get.

Thanks

4. Mar 5, 2017

### StoneTemplePython

My recommendation is to find a good text like "Learning From Data" and learn from it (plus its e-chapters). (The book is quite cheap at $30 in the US, though due to peculiarities with licensing, $100 in Canada?) There is an associated free course with the same title at work.caltech.edu, which is also available on the iTunes Store.

More to the point: a good text will have an appendix that lists and defines all the notation that it uses. Unfortunately notation is not standardized or uniform between authors.

A $\mathbf 1$ tends to mean a ones vector, an indicator function, or sometimes even the identity matrix. Here it is an indicator function. I personally prefer $\mathbf 1$ to mean the ones vector, $\mathbf I$ to mean the identity matrix, and $\mathbb I(Y = 1)$ to denote an indicator function, but the fundamental issue is the non-homogeneity of notation in the field. Again, my solution is to homogenize things when starting off by picking one good source to learn from that has its own consistent notation. (Once you've mastered that one source, you can much more easily infer or guess other people's notation as you branch out.)
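To make the indicator function concrete, here is a minimal sketch showing how it combines with an expected value. The labels and the probability 0.3 are invented for illustration and do not come from the expression in the question. The useful fact is that $E[\mathbb I(Y = 1)] = P(Y = 1)$, so averaging the indicator over samples estimates a probability.

```python
import random

def indicator(condition):
    """Return 1 if the condition holds, else 0."""
    return 1 if condition else 0

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical labels Y in {0, 1}, drawn so that P(Y = 1) = 0.3 by construction.
samples = [1 if random.random() < 0.3 else 0 for _ in range(100_000)]

# The sample average of the indicator estimates E[I(Y = 1)] = P(Y = 1).
estimate = sum(indicator(y == 1) for y in samples) / len(samples)
print(estimate)
```

The printed estimate should come out close to 0.3, the probability built into the simulated labels.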

Good luck.