Large weighted least squares system

Discussion Overview

The discussion centers around the challenges of efficiently computing a weighted least squares solution for a large dataset with millions of observations. Participants explore methods for constructing the normal equations matrix iteratively while considering the limitations of memory and computational resources.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant describes the need to compute the matrix A^T W' A from A^T W A while iteratively re-weighting observations, expressing concern about the time required to build the normal equations matrix.
  • Another participant suggests abandoning the normal equations approach in favor of modern least squares algorithms like QR or SVD, emphasizing the need for out-of-core solvers when matrices are too large to fit in memory.
  • A participant with a specific matrix size (8 million rows and 2000 columns) questions how to apply QR or SVD decompositions given their memory constraints and requests references for these methods.
  • One response recommends using a mapreduce approach on a cloud cluster to perform QR decomposition, providing a link to a paper on out-of-core methods for large matrices.
  • Concerns are raised about the potential for round-off errors when solving large normal equations, suggesting that numerical techniques suitable for smaller datasets may not be effective for larger ones.

Areas of Agreement / Disagreement

Participants express differing views on the best approach to take, with some advocating for modern algorithms and others emphasizing the challenges of applying these methods given memory limitations. The discussion remains unresolved regarding the optimal strategy for handling the large weighted least squares system.

Contextual Notes

Participants note limitations related to memory capacity, the feasibility of various numerical methods, and the potential impact of round-off errors in large systems. There is an acknowledgment that the techniques suitable for smaller datasets may not be directly applicable to larger ones.

vibe3
I have a large weighted least squares system with millions of observations. I can't store the least squares matrix in memory, so I only store the normal equations matrix:

A^T W A

where A is dense and n-by-p and W = diag(w_1,...,w_n). It takes a very long time to build this normal equations matrix due to the large number of observations. However, I also want to iteratively re-weight the observations to reduce the effect of outliers, and so at the next iteration, the weight matrix will be the product of the original weights W with some new weights V = diag(v_1,...,v_n). Therefore I need to construct the matrix:

A^T W' A

where W' = diag(w_1 v_1, w_2 v_2, ..., w_n v_n).

Does anyone know of a slick (fast) way to determine the matrix A^T W' A given the matrix A^T W A and the new weights V? I am hoping I don't need to start over from scratch to build A^T W' A for each iteration, since it takes so long.

I did search the literature and didn't find much on this topic.
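For reference, a minimal numpy sketch of how one pass over the data looks when A is streamed from disk in row blocks (my own illustration, not from the thread; the function name and block iterator are hypothetical). It uses the identity A^T W' A = sum over blocks of A_b^T diag(w_b v_b) A_b, which is why, without A or some cached per-block products, each re-weighting costs another full pass:

```python
import numpy as np

def normal_matrix(block_iter, w, v):
    """Accumulate A^T W' A = sum_i w_i v_i a_i a_i^T over row blocks.

    block_iter yields (row_start, A_block); the full A never lives in memory.
    """
    M = None
    for start, A_b in block_iter:
        n_b = A_b.shape[0]
        if M is None:
            p = A_b.shape[1]
            M = np.zeros((p, p))
        wv = w[start:start + n_b] * v[start:start + n_b]
        # scale each row a_i by w_i * v_i, then contract: A_b^T diag(wv) A_b
        M += A_b.T @ (wv[:, None] * A_b)
    return M

# tiny demo: an in-memory A split into two "disk" blocks
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))
w = rng.random(10)
v = rng.random(10)
M = normal_matrix(iter([(0, A[:5]), (5, A[5:])]), w, v)
assert np.allclose(M, A.T @ np.diag(w * v) @ A)
```

The p-by-p accumulator stays small (2000-by-2000 here), so the cost is dominated by reading and multiplying the row blocks.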
 
Yeah, forget forming the normal equations altogether. You should use a more modern LS algorithm which incorporates either the QR or the SVD algorithm.

If the matrices are too large to store in memory at one time, then use an out-of-core solver.
 
SteamKing said:
Yeah, forget forming the normal equations altogether. You should use a more modern LS algorithm which incorporates either the QR or the SVD algorithm.

If the matrices are too large to store in memory at one time, then use an out-of-core solver.

My matrix A has 8 million rows (observations) and 2000 columns (parameters). The normal equations matrix A^T W A is therefore 2000-by-2000, which can easily be handled by LAPACK to solve for the unknown parameters.

I can't even store the full matrix A in memory to attempt to do a QR or SVD on it. How do you propose I use these decompositions? Do you have a reference?
 
If you can't store it in memory, then don't. If you're familiar with MapReduce, rent an Amazon cluster and use a Hadoop system to run a QR decomposition in MapReduce land. It'll generally compute faster, and memory won't be an issue.

Also relating to SteamKing's comment. Here's a paper on how to use out of core.

http://www.tau.ac.il/~stoledo/Bib/Pubs/pp01-qr.pdf

It's specifically aimed at cases where a matrix cannot be stored in memory, only on disk.

(Although personally, for me, a MapReduce on a cluster is easier to handle.)
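The core idea behind these out-of-core and MapReduce QR schemes is the tall-skinny QR (TSQR) trick: QR-factor each row block independently, stack the small R factors, and QR the stack; the final R satisfies R^T R = A^T A without the full A ever being in memory. A toy numpy sketch (my own illustration with hypothetical names; the paper's actual algorithm is more elaborate):

```python
import numpy as np

def tsqr_R(blocks):
    """Tall-skinny QR: factor each row block, then QR the stacked R factors.

    Returns an upper-triangular R (unique up to row signs) with R^T R = A^T A.
    In an out-of-core setting each block would be read from disk in turn.
    """
    Rs = [np.linalg.qr(B, mode="r") for B in blocks]
    return np.linalg.qr(np.vstack(Rs), mode="r")

# demo: compare against the R factor of the full matrix
rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 4))
R = tsqr_R([A[i:i + 250] for i in range(0, 1000, 250)])
assert np.allclose(R.T @ R, A.T @ A)
```

For the weighted problem, one would scale each row block by the square roots of its weights (i.e., factor W^{1/2} A) before the per-block QR.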
 
vibe3 said:
My matrix A has 8 million rows (observations) and 2000 columns (parameters). The normal equations matrix A^T W A is therefore 2000-by-2000, which can easily be handled by LAPACK to solve for the unknown parameters.

It's not clear that solving that many normal equations will produce a useful result. Round off error in solving that large a system of normal equations could produce a garbage regression.

I can't even store the full matrix A in memory to attempt to do a QR or SVD on it. How do you propose I use these decompositions? Do you have a reference?
Not at the moment. With such a large problem to solve, QR or SVD may not be practical. You may have to resort to a solution by iteration.

The point is, the numerical techniques which may work well for finding a regression line thru a handful of data points may not be ideally suited when you have millions of data points and thousands of parameters.
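The round-off concern has a precise form: forming the normal equations squares the condition number, cond(A^T A) = cond(A)^2, so roughly half the significant digits can be lost before the solve even starts. A tiny numpy experiment (my own illustration) makes this concrete:

```python
import numpy as np

# Build A with singular values chosen so that cond(A) = 1e6 exactly,
# then observe that cond(A^T A) is about 1e12, i.e. cond(A) squared.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((100, 3)))  # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # orthogonal
A = U @ np.diag([1.0, 1e-3, 1e-6]) @ V.T            # cond(A) = 1e6 by construction
kA = np.linalg.cond(A)
kAtA = np.linalg.cond(A.T @ A)
assert 0.1 < kAtA / kA**2 < 10                      # condition number squared
```

This is the standard argument for QR- or SVD-based least squares, which work with A directly and keep the conditioning at cond(A) rather than cond(A)^2.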
 
