Diagonalization of large non-sparse matrices

jsschreck · Jul 27, 2009

Dear physics friends:

I am using a Potts model to study protein folding. In short, the partition function of the problem is written as the sum of the eigenvalues of the transfer matrix each to the Nth power (the transfer matrix factors the expression exp(H/Temperature) where H is the Hamiltonian of the problem). The dimension of the transfer matrix is 3^(odd number) x 3^(odd number). I wrote a small code in Python, and I can diagonalize up to 3^5 x 3^5 (plotting 500 points now takes about 3 hours on a 64 bit machine). I am looking for ways to speed up the calculation, as I need to get at least up to the 3^7 x 3^7 case. The matrix in question has all elements greater than zero, and of the form exp(stuff / temperature). What I am doing is diagonalizing the matrix for various temperatures (eg, compute e-values, shift the temp. slightly, compute, shift, compute, etc). Most of the problems come from pushing Temp -> 0. I can get down to about 0.05 for the 27 x 27 case before eigenvalue overflow problems. I am using the routine eig() in python. Getting close to zero isn't that essential, its the compilation time that is stalling me. Any suggestions on making computation of this problem faster/overall strategy of attack??

John Schreck
Drexel University

lstellyl · Jul 27, 2009

Is there any particular reason for you using python?

I don't have any experience with the linalg module that implements this, but i was under the impression that MATLAB or c++ is pretty fast with eigenvalue/vector solutions and diagonalization... you might want to give that a shot, but I'm not sure what other dependencies you have here with the actual folding model

maverick_starstrider · Jul 28, 2009

3^7 x 3^7 is a VERY small matrix. Pretty much any standard diagonalization technique implemented in something like c++ or fortran should be really fast.

Born2bwire · Jul 28, 2009

maverick_starstrider said:

3^7 x 3^7 is a VERY small matrix. Pretty much any standard diagonalization technique implemented in something like c++ or fortran should be really fast.

I would not think of using Python to handle that large of a matrix, Python is woefully slow in my experience with it for atomic scale simulations.

Once you get above 5000x5000 unknowns you may want to look into more intelligent eigenvalue solvers. A direct way is usually order N³, but a better solver will be N log N. I was just reading a paper that had to solve eigenvalue problems for a large matrices and they referenced:

Z. Zhu, B. Song, and J. White in Proceedings of the Deign Automation Conference (Anaheim, CA, 2003), pp. 712-717.
V. Rokhlin, Journal of Computational Physics 60, 187 (1985).
W. Hackbusch and Z. P. Nowak, Numerische Mathematik 54, 463 (1989)

Personally, I have been using the Intel MKL libraries and have solved 7000x7000 eigenvalue problems in less than a minute or so I think. These are just optimized versions of the Lapack routines. So I do not think you need to move up to better solvers just yet, but I do think you should try using a faster language and look at the Lapack library.

Though the question remains, why are you diagonalizing the matrix? Do you want the eigenvalues or some other end product? Because if you are diagonalizing as an intermediate step, it is generally better to look for algorithms that are designed for your desired end product.

SW VandeCarr · Jul 28, 2009

jsschreck said:

Dear physics friends:

Most of the problems come from pushing Temp -> 0. I can get down to about 0.05 for the 27 x 27 case before eigenvalue overflow problems. I am using the routine eig() in python. Getting close to zero isn't that essential, its the compilation time that is stalling me. Any suggestions on making computation of this problem faster/overall strategy of attack??

John Schreck
Drexel University

What do you mean by pushing temperature: Temp -> 0? Of course you are going to have problems since I doubt you mean absolute zero. Try using degrees Kelvin (0 C = 273 K).

maverick_starstrider · Jul 28, 2009

SW VandeCarr said:

What do you mean by pushing temperature: Temp -> 0? Of course you are going to have problems since I doubt you mean absolute zero. Try using degrees Kelvin (0 C = 273 K).

No they mean absolute zero. It's a Boltzmann distribution like that used in Monte-Carlo. Anywho, now that I've actually READ your post (I only glanced at it before) it doesn't actually sound like it has anything to do with processing time. Your eigenvalues sound like they're blowing up because your Boltzmann factors exp(stuff/temp) are blowing up. I assume you're using double precision. How low are you hoping to go in terms of temperature?

SW VandeCarr · Jul 28, 2009

maverick_starstrider said:

No they mean absolute zero. It's a Boltzmann distribution like that used in Monte-Carlo.

Well, if he's studying protein folding, I don't think he's going to see much action below 268K.

http://www.jbc.org/cgi/reprint/268/34/25914.pdf

http://employees.csbsju.edu/HJAKUBOWSKI/classes/ch331/protstructure/olunderstandconfo.html

SW VandeCarr · Jul 28, 2009

SW VandeCarr said:

Well, if he's studying protein folding, I don't think he's going to see much action below 268K.

jsstreck

The lowest temperatures I've found in studies of protein folding is -10C (263K). I could see temperatures as low as -20C; but this is still a long way from absolute zero. Perhaps you are trying something new. I've been away from this for a while. If there's something to be gained from ultra-cold experiments, I'd like to hear about it. I haven't turned up anything on a web search, but I may not be using the right search terms.

http://www.whatislife.com/reader/motion/motion.html see Experimental Procedure under 'Protein Folding' 2nd, paragraph

EDIT: In any case, thermodynamic entropy goes to infinity at absolute zero.

jschreck · Sep 25, 2009

Sorry it took me forever to respond, I forgot to ask for emails when people post replies!

Anywho, some folks on here have suggested not seeing much action below ~270K, this is correct (to my knowledge). I am concerned more about numerical problems that result from having all matrix elements of the form \exp(E / kT ) .. where T -> 0 (when you set k_Boltzmann = 1), some other number in the 200s if I plug in k or use the universal gas constant, R (obviously you have to rescale the energy, which experimentally we found is a monotonically increasing function of T). Ill spare the details, but numerically I am dealing with a matrix full of products elements of the form exp( (a+bT) / kT) where I have 3 sets of constants (a,b). The problem I still have to some degree is that if I push the T past some limit in Python (while decreasing T), the eigenvalue routine will not converge. For certain cases it fails near temperatures of interest.

I have been able to do this for the case 3^7 x 3^7 on our cluster, took about 4 hours to run the eigenvalue routine 100 times, iterating over T (my matrix size is 3^odd x 3^odd, where odd means an odd number). I would like to diagonalize up to a 3^15 x 3^15. I live in PA, and so have access to a supercomputer center in Pittsburgh where on some machines the time is free (provided you are accepted). My advisor was just using the machine recently, so it wouldn't take long to get on, but this could be unnecessary?

That said, I think I should take those of you have said to try c or c++. A lot of my friends (grad students) have suggested this as well. I started using Python mainly because other people in my group use it, and the syntax is very easy (also please note that I am already using the LAPACK tools that you can import into Python as well as using double precision). Converting to c++ I doubt is hard...does anyone have any specific suggestions on going about this?

Phyisab**** · Sep 25, 2009

python seems to be the standard in modeling protein folding as far as I have seen.

Diagonalization of large non-sparse matrices

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Who May Find This Useful

Similar threads

High School Interesting paper on QM in Scientific American

Undergrad ##r-##independent angular momentum in quantum mechanics

Graduate Consistency of Relativistic QM

Graduate Some derivation in QFT in Curved SpaceTime by Birrell and Davies

High School Seemingly odd quantum tunneling

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect