# Householder algorithm

svishal03
Hi,

I need to write a code to convert a symmetic matrix into tridiagonal form and am planning to use the Householder algorithm.

I understand the mathematical steps.

Can anyone explain that when we are converting a matrix "A" into a tridiagonal form what does it physically indiacte?

I mean what are doing? Is it that we are taking each column of the matrix i.e.a vector and making the necessary elements 0 so as to get a tridiagonal matrix which has non zero elements just along the principal diagonal and the first diagonal below this, and the first diagonal above the main diagonal.

I shall be grateful if someone puts it's point of view.

Vishal

You should begin by understanding that it is imprecise to say that the algorithm "converts" the matrix to tridiagonal form. If we merely wanted to "convert" the matrix to tridiagonal form, we could simply set all its non-tridiagonal entries to zero. Why are only certain procedures allowed for eliminating the non-tridiagonal entries? Doesn't the output of the algorithm give a way to represent the matrix as a product of factors, one of which is a matrix in tri-diagonal form?

As to why one wants to represent something as a product of factors, you could ask the same question in ordinary algebra. Why factor $a^2 - b^2$ as $(a+b)(a-b)$? There are a variety of situations where one representation is more convenient in another. In the case of the "QR" factorization, I think it is used in algorithms that find the eigenvalues of a matrix. The technicalities of that are a topic to be discussed after the Google blackout is over!

svishal03
Thanks a lot.

Actually, I have been looking in this explanation;

http://math.fullerton.edu/mathews/n2003/householdermod.html [Broken]

As it explains, the Householder method is used to construct a similar symmetric triadiagonal matrix.

Now, similar matrices are those which posses several similar properties and one of the similar properties being that they possesses similar Eigen values (which is what I finally need).

In linear algebra, two n-by-n matrices A and B are called similar if

B= P^-1 A P

for some invertible n-by-n matrix P

Fianlly we get a final similar matrix at the end of of ( n-2)Householder orthogonal transformations.

I do not follow when you say;

Doesn't the output of the algorithm give a way to represent the matrix as a product of factors, one of which is a matrix in tri-diagonal form?

Sorry for a fundamental question...can you please explain the statement?

Last edited by a moderator:
Which aspect of the statement needs explaining? As you observed, the definition of matrices A and B being "similar" is that we are able to factor one matrix B in a special way such that the matrix A appears as a middle factor. The link you gave points out that Householder's algorithm makes this middle matrix is a tri-diagonal matrix. Is the question "What is the utility of a representing B this way?" or is the question "How does the algorithm accomplish this representation?" - in particular "How do we recover P from the data that is produced from the steps of the algorithm?". (I implemented this algorithm in C many years ago, but I'd have to study your link to answer that question!)

I'm not sure of what your background in linear algebra is. The factorization $B = P^{-1} A P$ is a very natural transformation in applied math, since it implements a linear transformation of coordinates. Suppose you have a matrix A whose columns (or rows) represent vectors in one coordinate system and you want to do a linear (and 1 to 1) change of coordinates ( the kind that that books usually write like x' = ax + by, y' = cx + dy, where the primes indicate the new coordinates). It turns out that you can accomplish that change of coordinates by the operation $P^{-1} A P$ for some appropriately chosen matrix $P$.

So if $B = P^{-1} A P$, you can think of $B$ as being $A$ expressed in a different coordinate system.

svishal03
Thanks a lot for the reply.Actually, I'm doing a self study course in Linear Algebra.

I'm also applying what I'm (self)learning to my problems relate to my field of Structural Engineering.

As I see here (sorry for being immature in Linear Algebra) we have a matrix B which we have factorised as;

B = P^-1 A P

You said that this could be considered as a case of coordiante transformation wherein we have changed coordiantes from a system x,y,z to x',y',z'

When we say a change in coordiante system does it mean that x'y'z' are such that each vector (each vector being represented by each column of the matrix B) in x'y'z' coordiante systems has the same length and direction as vector in x,y,z coordiante system (each vector being represented by each column of matrix A)?

B = P^-1 A P

You said that this could be considered as a case of coordiante transformation wherein we have changed coordiantes from a system x,y,z to x',y',z'

Yes, but admittedly I didn't make a very precise statement of how to do this.

When we say a change in coordiante system does it mean that x'y'z' are such that each vector (each vector being represented by each column of the matrix B) in x'y'z' coordiante systems has the same length and direction as vector in x,y,z coordiante system (each vector being represented by each column of matrix A)?

For now, I'll say no. Let me try to make my statement more precise.

One way to regard a matrix is as a "linear transformation" ( or to a computer programmer, as a specification of an algorithm). For an invertible matrix P and column vector v, the equation w = Pv can be regarded as computing a representation for the vector v in a different coordinate system, the result of that computation is w.

Another way to regard w = Pv is to think of P as specifying a movement of the vector v to a new location w, all this taking place in the same coordinate system.

For example
$$\begin{pmatrix} \cos(\pi/6) \\ \sin(\pi/6) \end{pmatrix} = \begin{pmatrix} \cos(\pi/6) & -\sin(\pi/6) \\ \sin(\pi/6) & \cos(pi/6) \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

can be regarded as computing the rotation about the origin of the vector (1,0) by an angle of $\pi/6$ or it could be regarded as representing (1,0) in a new coordinate system that is rotated about the origin by an angle of $-\pi/6$ relative to the original coordinate system.

Thinking of the computation as a change of coordinates, the rows of the matrix represent the unit vectors of the rotated coordinate system give as vectors in the original coordinate system.

There is a famous trick question "What happens to a vector when we change coordinates". The answer for physicists is "nothing" since it is representation that changes, not the phenomena. If I think a matrix A as specifying a motion, it describes a process with some physical reality. So its reasonable that the same motion might be represented by different matrices if we change coordinates. A simple example of that is rotation in 3D about an axis. If we pick a coordinate system where the axis of rotation coincides with one of the coordinate axes, it is represented by a matrix looks like the above matrix embedded in a 3x3 matrix with a 1 and 0's filling the matrix out. If we pick a coordinate system where the axis of rotation doesn't coincide with a coordinate axis the matrix is more complicated.

My statement that B = P^{-1} A P represents a change of coordinates can be interpreted to mean that if we think of A as an algorithm that computes a certain movement, then B represents exactly the same movement, but in a different coordinate system.

There is probably a good interpretation of that equation when we regard A as specification for changing coordinates. I'll have to think about that.

An excellent (i.e. thin, cheap, easy) book on matrices as transformations is the Dover book "Matrices And Transformations" by Anthony Pettofrezzo.

Last edited:
Here's an amusing way to view B = P^{-1} A P.

The boss comes to you and says "Our competitor MoverAmCo has a matrix A that peforms this wonderful displacement. Here are all the numerical entries in the matrix A. I want you to reverse engineer it and create an equivalent product for us. Of course, I want our version of it to use our coordinate system instead of MoverAmCo's."

A co-worker tells you that the way to covert the vector x expressed in your company's coordinate system to MoverAmCo's coordinate system is do the computation y = Px where P is an invertible matrix, which he writes down for you.

You figure the way to kludge up a copy of A in your company's coordinate system is the following algorithm.

1. Given a vector x in your company's coordinate system, compute y = Px to convert the vector to the MoverAmCo coordinate system.

2. Apply MoverAmCo's matrix A to y (which makes sense since y is in the MoverAmCo coordinates now) This computes z = Ay

3. Convert the vector z back into your company's coordinate system. To to this compute w = P^{-1} z.

P^{-1}(A(Px))

Since matrix multiplication is associative, you realize that you can do the computation as (P^{-1} A P) x. So your company's version of the matrix for the movement is B = P^{-1} A P.

svishal03
Again, many thanks for your help.

A fundamental question- in case of Householder transformation which is an approach used to obtain a simialr matrix to a given matrix.

When we say similar, the similar matrix has many common properties to the given matrix (eigen values,rank,determinant,trace,..) but obviously not all properties are same.

In this case (meaning the two matrices does not have ALL properties same) can we say that we are doing coordinate transformation here??

Because, if it was coordiante transformation the matrix should remain EXACTLY similar as the coordiante system is only changed.As you say;

What happens to a vector when we change coordinates". The answer for physicists is "nothing"

In that case there is something happening to the original matrix- as not al properties of 2 matrices A and B are same.

Am I right when I say: Tansforming a matrix A to B using P^{-1} A P is NOT coordinate transformation?

Vishal

When we say similar, the similar matrix has many common properties to the given matrix (eigen values,rank,determinant,trace,..) but obviously not all properties are same.

In this case (meaning the two matrices does not have ALL properties same) can we say that we are doing coordinate transformation here??

Because, if it was coordiante transformation the matrix should remain EXACTLY similar as the coordiante system is only changed.As you say;

Coordinate transformations don't preserve all properties of things. They don't preserve the numerical entries of matrices or vectors. So it isn't true that coordinate transformations preserve all properties.

It is a rather circular way of saying things to say the following: "Coordinate transformations preserve all the properties of a thing which don't depend on the particular way the thing is represented." - i.e. coordinate transformations preserve the things that they preserve! However, one may take the view that the properties of a thing that coordinate transformations preserve are the "essence" of a thing.

Which non-preserved property concerns you?

In that case there is something happening to the original matrix- as not al properties of 2 matrices A and B are same.

Am I right when I say: Tansforming a matrix A to B using P^{-1} A P is NOT coordinate transformation?

We must define what is meant by "a coordinate transformation of a matrix" Some possible meanings of the question.

1. Suppose we pretend the nxn matrix is simply an n^2 dimensional vector. So we view a 2x2 matrix B as the vector (b[1][1], b[1][2], b[2][1], b[2][2] ). Then does the equation B = P^-1 A B establish a linear transformation between two n^2 dimensional vectors?

2. Suppose we regard the matrix B as a list of n column vectors, so a 2x2 matrix is regarded as the list of vectors: { b[*}[1], b[*][2] }. Then does the equation B = P^{-1} A P establish a linear transformation between two n dimensional column vectors that is applied to each of the column vectors?

3. Suppose we regard the matrix A as a specification for an algorithm which takes as input an n dimensional column vector x and produces as output an n dimensional column vector y according to the the forumula y = Ax. Then does the equation B = P^{-1} A P define the specification for the same algorithm, when the coordinates of the input and output vectors are transformed according to the formula v' = Pv where v is the vector represented in the coordinate system for B and v' is the vector represented in the coordinate system for A.

The answer for 3. is Yes. I justified that in my previous post. That is how I think of "the coordinate transformation of a matrix".

I haven't even thought about 1. and 2. I suppose we can figure them out quickly by doing some simple examples. Are 1. and 2. the questions that concern you ?

svishal03
Thanks a lot Stephen- I'm getting clearer now.I was out of my city for the last few days and had not been online, hence, the delay in replying.Thanks...

I had another question- what is the differnce between a transformation and decomposition?

Do all transformations of matrices result in similar matrices?

I'm currently reading about QR transformation wherein first we decompose a matrix 'A' into a orthogonal (Q) matrix and an upper triangular 'R' matrix.

Having done that we just vreverse the order from QR to RQ and since R = Q^TA

we get RQ = Q^T A Q [ which is similar to P^T A ^P discussed in the this theread]

We see that QR transformation again involves obtainign a similar matrix to given matrix 'A'.

I was concerned what is that physically doing when we decompose A = QR?

I had another question- what is the differnce between a transformation and decomposition?

I don't think the term "transformation" has a precise mathematical definition in the context of matrices. I'm not even sure that "decomposition" has a precise definition in that context. A phrase with more information like "a transformation using row operations" or "an LU decomposition" would say something specific.

Just commenting these words as ordinary English terms, "transformation" is often used in a phrase describing a method for changing a matrix into a similar matrix. "Decompostion" is often used in a phrase describing how to express a matrix as product of factors. So a "decomposition" doesn't really change the matrix to a different matrix.

I was concerned what is that physically doing when we decompose A = QR?

As I said earlier, the simplest way to visualize a physical interpretation of a matrix is to think of it as a specification for an algorithm that implements a linear transformation. If you think of the matrix A as an abbreviation for the function y = Ax where x is a vector, then the "meaning" of Ax can be taken either as 1) moving the vector x to the vector y or 2) computing the representation of the vector x in a new coordinate system

With that interpretation, the product of two matrices is just the composition of functions.
i.e. considering A to specify f(x) = Ax and B to specify g(x) = Bx then AB specifies h(x) = f( g(x)).

So when you write a matrix as a product of factors then you are expressing a function as a composition of functions. As to how to visualize the geometric meaning of QR as a composition of functions - I confess that I've never tried to do that. After a certain period of exposure to matrix algebra, you begin to accept it as a subject on its own.

For example, if you had a high school algebra student who said "x is a length, x^2 is an area and x^3 is a volume, but how can I interpret $e^{-x} + 3x + x^{2/3} ?$, you'd have to encourage him to begin thinking about mathematical expressions as "objects" in their own right, not as things necessarily having a physical interpretation.

svishal03
Code:
With that interpretation, the product of two matrices is just the composition of functions.
i.e. considering A to specify f(x) = Ax and B to specify g(x) = Bx then AB specifies h(x) = f( g(x)).

I did not get this...How can matrix be a function..I mean we (or I) have looked at functions as merely polynomials till now..Can you give a simple example like a 2x2 matrix being expressed as a product of functions?

I did not get this...How can matrix be a function..

I'm not saying that "a matrix IS a function, by definition", I said that a matrix "is" or "can be regarded as" a specification for a function. Think of the matrix A as specifying the vector valued function y = Ax where x and y are column vectors whose column dimension matches the row dimension of A.

I mean we (or I) have looked at functions as merely polynomials till now..

I hope not!

When you read mathematics, do you make up your own interpretations for definitions? That's not bad as a supplement to the real definitions (in a way, it's what I'm doing by saying that a matrix can be regarded as a specification for a function), but you can't substitute your own thoughts for the actual definitions. If you have read any modern mathematical texts at all, you should know the modern definition for a function. Look it up!

Can you give a simple example like a 2x2 matrix being expressed as a product of functions?
Let a 2-D column vector $x$ be represented in its components by $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$.

$$A = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$$
$$f(x) = Ax = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3 x_1 \\ 3 x_2 \end{pmatrix}$$
$$B = \begin{pmatrix} \cos(\pi/6) & \sin(\pi/6) \\ - \sin(\pi/6) & cos(\pi/6) \end{pmatrix}$$
$$g(x) = Bx = \begin{pmatrix} \cos(\pi/6) & \sin(\pi/6) \\ - \sin(\pi/6) & cos(\pi/6) \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos(\pi/6) x_1 + \sin(\pi/6) x_2 \\ -\sin(\pi/6) x_1 + \cos(\pi/6) x_2 \end{pmatrix}$$

$$h(x) = f(g(x))$$

You should be able to work out $f(g(x))$ and see that it is the same function as $AB x$

Hassan2
Hi svishal03,

I am having similar problems with linear algebra.

I am trying to do a vibration analysis ( modal analysis) of a structure. I need to find eigenvalues of a large matrix. I have transformed my matrix to its similar tridiagonal one using Lanczos algorithm, but I don't know how to find the eigenvalues of the tridiagonal matrix efficiently. Do you have any material regarding this?