# Specific Sigma Notation

1. Aug 10, 2010

### dodo21

I've been racking my brain over this for a while; I hope somebody can help me.

I was implementing a method from a computer science paper when I came across this:

(image link, now broken: http://www.mattkent.eu/challenge.png)

Bearing in mind that x is a vector and epsilon is a vector of integers (with x raised to it component-wise), I can't tell what exactly the sigma is summing.

Firstly, what does a sigma with index i mean in the matrix A? What exactly would be summed there? Is it the 'sum of the vector components for a given i', or the 'sum over all vectors, where each has been raised to epsilon'?

And secondly, in the definition of the b vector, what is the sigma doing? There is no index or limit associated with it at all, and the t value is essentially a constant. So is this saying 'sum up the vector's components (once multiplied by t)'?

Last edited by a moderator: May 4, 2017
2. Aug 10, 2010

### jambaugh

Assuming that x is a vector of dimension d, the subscript in x_i indicates that i is an index running from 1 to d. This sigma notation is shorthand:
$$\sum_j$$
means 'sum over the values of j', where the range of j should be understood from the context.

Thus for example if you want the dot product of x and y:
$$\sum_i \mathbf{x}_i \mathbf{y}_i$$

You might also sometimes see:
$$\sum_{a \in S}$$
where S is a set.
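In code, that shorthand is nothing more than a loop over the shared index. A minimal Python sketch (the values here are made up):

```python
# The shorthand sum_i x_i y_i is just a loop over the shared index i
# (0..d-1 in Python's 0-based indexing); the values are made up.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

dot = sum(x[i] * y[i] for i in range(len(x)))
print(dot)  # 1*4 + 2*5 + 3*6 = 32.0
```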

3. Aug 10, 2010

### dodo21

Thanks for the quick response.

Earlier in that section the report in fact defines x_i (where i = 1, 2, ..., m) to be a vector from a set of m vectors, rather than a component of the vector x.

It then defines a second subscript (i.e. x_i_j) to indicate the jth component of the ith vector (where j = 1, 2, ..., d; d is the dimension of the vector).
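In code terms (my own sketch of that convention, with made-up values), the set of vectors is a two-dimensional array indexed first by i and then by j:

```python
# My own sketch of the paper's double-subscript convention: x_i is the
# i-th sample vector, x_{i,j} its j-th component; the values are made up.
m, d = 3, 2  # m sample vectors, each of dimension d
x = [[1.0, 2.0],   # x_1
     [3.0, 4.0],   # x_2
     [5.0, 6.0]]   # x_3

# The paper's 1-based x_{2,1} is x[1][0] in 0-based Python:
print(x[1][0])  # 3.0
```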

4. Aug 11, 2010

### jambaugh

OK, that is a useful detail; I should have read more carefully.
I would guess then that i indexes the set of vectors, and that is what is being summed over. The definition of b looks to have simply dropped the i; the author appears to have gotten quite terse with the notation.

Parse this in stages.

You know what a matrix is, so look at one entry: it is a sum.

You know what summing is (and you'll figure out what i ranges over from context), so look at the terms of the sum.

Each term involves one of a set of vectors indexed by i (so i indexes the elements of that set), raised to a power. You say the power notation is explained, and it should then yield a vector. You can add vectors, so that's not a problem; if you can evaluate the powers, then that also shouldn't be a problem.

Note: Since sums of matrices equal sums of corresponding entries you can probably rewrite A as:

$$\mathbf{A} = \sum_i \left[ \begin{array}{ccc} \mathbf{x}_i^{\epsilon_1+\epsilon_1}& \cdots & \mathbf{x}_i^{\epsilon_1+\epsilon_o}\\ \vdots & \ddots & \vdots \\ \mathbf{x}_i^{\epsilon_o+\epsilon_1}& \cdots & \mathbf{x}_i^{\epsilon_o+\epsilon_o} \end{array}\right]$$
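One way to read that sum in code, as a sketch under my own assumptions (all names here are mine, and I'm taking the vector power to mean a component-wise power whose components are then multiplied together):

```python
import numpy as np

# Sketch of one reading of the matrix A above (names are my own): entry
# (j, k) sums, over the sample vectors x_i, the vector power
# x_i^(eps_j + eps_k), with the vector power taken as a component-wise
# power followed by a product of the components.
def vec_pow(x, eps):
    """x^eps: raise each component of x to the matching exponent, then multiply."""
    return np.prod(x ** eps)

def build_A(xs, eps):
    """xs: (m, d) array of sample vectors; eps: (o, d) array of exponent vectors."""
    o = len(eps)
    A = np.zeros((o, o))
    for j in range(o):
        for k in range(o):
            A[j, k] = sum(vec_pow(x, eps[j] + eps[k]) for x in xs)
    return A

# Tiny made-up example: one sample vector, two exponent vectors.
xs = np.array([[2.0, 3.0]])
eps = np.array([[0, 0], [1, 0]])
print(build_A(xs, eps))  # symmetric 2x2 matrix
```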

What I'm confused about is the $\cdots$ in the matrices. The dots should indicate an obvious pattern, but at face value I only see what would be 2x2 entries.

I might be able to understand better if you give a little context. What's the paper and what are A and b and gamma supposed to be?

5. Aug 11, 2010

### dodo21

The matrix is of variable dimension: the author chose the letter 'o' to signify 'order', so the matrix is o-by-o, depending on what o is.

I've taken this from the appendix of "Generalizing Surrogate-assisted Evolutionary Computation" by Lim et al. It's within the authors' description of polynomial regression with a multi-dimensional input.

In the context of polynomial regression (with a multi-dimensional input vector), x_i is an input vector of dimension d, for i = 1, 2, ..., m.

Epsilon_j is an 'exponent vector' for j = 1, 2, ..., o, where o is the order of the polynomial being regressed to, plus one. Each epsilon vector contains d integers (the exponent values).
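As a sanity check on that definition (this construction is my own, not taken from the paper): one common choice of exponent vectors for a polynomial of total degree p in d variables is every d-tuple of non-negative integers whose entries sum to at most p:

```python
from itertools import product

# My own construction (not from the paper): for a polynomial of total
# degree p in d variables, enumerate every d-tuple of non-negative
# integers whose entries sum to at most p.
def exponent_vectors(d, p):
    return [e for e in product(range(p + 1), repeat=d) if sum(e) <= p]

print(exponent_vectors(2, 2))
# (0,0) is the constant term, (1,0) is x, (0,1) is y, (2,0) is x^2, ...
```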

W.r.t. the vector b: the t_i value is the output of the function being modelled, associated with the ith x vector.

Finally, the gamma vector contains vectors of constants: C_i for all i = 1, 2, ..., o.

So the coefficient vector of vectors (does that necessarily result in a matrix if all the nested vectors are the same length?) can be found as:

gamma[transposed] = A[inverse] * b[transposed]
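In code, that final step is a plain linear solve. A sketch with made-up numbers (using a solver rather than forming the explicit inverse is the usual numerically safer choice):

```python
import numpy as np

# With A (o x o) and b (length o) assembled as discussed, gamma solves
# A @ gamma = b. The numbers here are made up; np.linalg.solve avoids
# forming the explicit inverse of A.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 4.0])

gamma = np.linalg.solve(A, b)
print(gamma)  # satisfies A @ gamma = b; here gamma == [1.0, 1.0]
```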

I was speaking with a colleague this afternoon, and we both came to the conclusion that in the matrix the sigma is indicating "sum for a given i"; similarly for the b vector. But I don't understand why the author would specifically note "for a given i" in the matrix and then not do so in the vector.

6. Aug 11, 2010

### jambaugh

Ok, that makes sense.
Ahhh! That (the need to sum vector powers) makes sense in the context of regression calculations.

Then the vector-to-vector power gives you a term in the polynomial. I presume then that you don't get a vector but rather the product of the x components raised to the epsilon powers, like:
$$x^2 y^3 z = (x,y,z)^{[2,3,1]}$$
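That interpretation is easy to check numerically (a one-liner sketch; the numbers are made up):

```python
import numpy as np

# (x, y, z)^[2, 3, 1] read as x^2 * y^3 * z: raise each component to the
# matching exponent, then multiply the results; the values are made up.
x = np.array([2.0, 3.0, 5.0])
eps = np.array([2, 3, 1])

term = np.prod(x ** eps)
print(term)  # 2^2 * 3^3 * 5 = 540.0
```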

It was probably a matter of neglect. With such involved formulas one tends to ignore "obvious" details such as over what one is summing.

EDIT: PS, it may be quite helpful to review some other references on polynomial regression in one and many variables to compare formulas. Likely part of the author's neglect was that he assumed the reader is somewhat familiar with polynomial regression.

Last edited: Aug 11, 2010
7. Aug 11, 2010

### dodo21

The author referenced an article where I assume he got this definition: a 1956 article by F.H. Lesh. The notation there was no different (aside from the swapping of several letters).

That said, I shall search around the world wide web for further definitions to see if any enlightenment is offered.

Thanks for your input jambaugh. It's genuinely appreciated.