
Kalman Filter EM decrease log likelihood

  1. Feb 17, 2014 #1
    Hi all

    I am using the Kalman filter with an EM algorithm (Shumway and Stoffer).

    From my understanding, the log likelihood should increase monotonically.

    In some instances, however, I obtain a decrease in the log likelihood.
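
    For concreteness, this is roughly the check I run on the per-iteration log likelihoods (a minimal sketch; the function name and the numbers are just illustrative):

    Code (Python):

        import numpy as np

        # Flag EM iterations where the log likelihood fell by more than a
        # small numerical tolerance. The values below are illustrative.
        def drops(log_liks, tol=1e-8):
            ll = np.asarray(log_liks)
            return np.where(np.diff(ll) < -tol)[0] + 1

        print(drops([-120.0, -95.2, -90.1, -90.3, -89.9]))  # -> [3]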

    What can I infer from this?

    Does anyone have experience with these issues, or know if it's OK to simply stop the algorithm whenever there is a decrease?

    Thanks

    Emma
     
  2. Feb 17, 2014 #2

    D H

    Staff Emeritus
    Science Advisor

    You haven't said how you are combining a Kalman filter with EM (sounds interesting!), so it's a bit tough to say what's going on.

    You do not want the error covariance in a Kalman filter to monotonically decrease. That's not a good Kalman filter. That's a "don't confuse me with the facts" filter. If your inputs are always of the same type, same quality, and arrive at a fixed rate, your filter should settle down to a steady state where the decrease in covariance from the Kalman update and the increase in covariance from the plant noise balance each other out. You need that plant noise, or your filter will essentially reject incoming data as irrelevant.
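
    Here's a one-dimensional sketch of that balance, with made-up numbers: iterating predict and update drives the variance to a steady state set by Q and R, not to zero.

    Code (Python):

        # 1-D sketch (illustrative values): plant noise Q inflates the
        # variance in the predict step; the measurement (noise R) shrinks
        # it in the update step. Iterating converges to a steady state.
        Q, R = 0.1, 1.0
        P = 10.0                           # initial state variance
        for _ in range(50):
            P = P + Q                      # predict: plant noise inflates P
            K = P / (P + R)                # Kalman gain
            P = (1 - K) * P                # update: measurement shrinks P
        print(P)                           # ~0.27 here, not zero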

    If you get different types of inputs at different rates, some perhaps even asynchronously, you should see the error covariance increase during intervals of low quality inputs, and then drop when you get a high quality input. The ideal behavior in terms of the covariance is a saw tooth pattern. You don't want the covariance to blow up (your estimated state is meaningless), but you don't want it to collapse either (your estimated state is a fiction).
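
    And a sketch of the saw tooth, assuming for illustration that a usable measurement arrives only every fifth step:

    Code (Python):

        import numpy as np

        # Predict every step, but only apply a measurement update every
        # 5th step (an assumed cadence). Values are illustrative.
        Q, R = 0.1, 0.5
        P, history = 1.0, []
        for t in range(30):
            P = P + Q                      # covariance grows between updates
            if t % 5 == 0:                 # an occasional high-quality input
                K = P / (P + R)            # scalar Kalman gain
                P = (1 - K) * P            # covariance drops at the update
            history.append(P)
        print(np.round(history, 3))        # rises, then drops: a saw tooth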

    You are somehow adding EM to the mix. You didn't say how, and you didn't say how (or even if) you are relating the error covariance to the log likelihood. One possibility is that in periods when the covariance increases, the log likelihood decreases. Another possibility: you are pushing the EM algorithm too hard. That's an ever-present issue with machine learning. The EM algorithm side of your system is perceiving something as signal rather than noise, while the Kalman filter side perceives it as noise rather than signal.
     
  3. Feb 18, 2014 #3
    Hi DH, thanks for the reply.

    I am using the EM algorithm to estimate the parameters of the Kalman model. There is a closed-form expression for the joint likelihood of the observations and the hidden states. The E step uses the output of the Kalman smoother; maximizing the joint pdf then gives new parameter estimates of the system. Bear in mind I'm not using a physical model here; my model is learnt from training data.
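
    For reference, here is a bare-bones one-dimensional version of that loop (a sketch only: the observation map is fixed at 1, the initial state is not re-estimated, and the names are illustrative):

    Code (Python):

        import numpy as np

        def em_kalman_1d(y, a, q, r, m0=0.0, v0=1.0, n_iter=50):
            # Scalar model: x_t = a*x_{t-1} + w, w ~ N(0, q)
            #               y_t =   x_t     + v, v ~ N(0, r)
            y = np.asarray(y, dtype=float)
            T = len(y)
            lls = []
            for _ in range(n_iter):
                # E step, forward pass: Kalman filter and log likelihood
                mp, vp = np.zeros(T), np.zeros(T)  # predicted mean/variance
                mf, vf = np.zeros(T), np.zeros(T)  # filtered mean/variance
                ll = 0.0
                for t in range(T):
                    mp[t] = a * (mf[t - 1] if t else m0)
                    vp[t] = a * a * (vf[t - 1] if t else v0) + q
                    s = vp[t] + r                  # innovation variance
                    k = vp[t] / s                  # Kalman gain
                    mf[t] = mp[t] + k * (y[t] - mp[t])
                    vf[t] = (1 - k) * vp[t]
                    ll -= 0.5 * (np.log(2 * np.pi * s) + (y[t] - mp[t]) ** 2 / s)
                lls.append(ll)
                # E step, backward pass: RTS smoother, lag-one covariances
                ms, vs = mf.copy(), vf.copy()
                cv = np.zeros(T)                   # cv[t] = Cov(x_t, x_{t-1})
                for t in range(T - 2, -1, -1):
                    j = vf[t] * a / vp[t + 1]      # smoother gain
                    ms[t] += j * (ms[t + 1] - mp[t + 1])
                    vs[t] += j * j * (vs[t + 1] - vp[t + 1])
                    cv[t + 1] = j * vs[t + 1]
                # M step: closed-form updates from the smoothed moments
                exx = vs + ms ** 2                 # E[x_t^2]
                ex1 = cv[1:] + ms[1:] * ms[:-1]    # E[x_t x_{t-1}]
                a = ex1.sum() / exx[:-1].sum()
                q = max((exx[1:] - a * ex1).sum() / (T - 1), 1e-10)
                r = max(((y - ms) ** 2 + vs).sum() / T, 1e-10)
            return a, q, r, lls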

    I do get an increase in the error covariance of the smoother output, but I'm not sure what I should do to circumvent this. Any further suggestions?

    Thanks

    Emma
     
  4. Feb 18, 2014 #4
    To add to the comment, DH:

    The trace of the covariance decreases, so I'm assuming this satisfies the condition you mentioned previously.

    I do have ground truth data for the positions I am tracking, however, and a reduced error covariance doesn't necessarily coincide with the minimum error against the ground truth state. Are there any reasons you can think of for this?
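
    This is roughly how I'm comparing the filter's self-reported confidence with the true error (a sketch; the argument names are placeholders for my own data):

    Code (Python):

        import numpy as np

        def confidence_vs_error(P_history, x_est, x_true):
            # P_history: length-T list of posterior covariance matrices
            # x_est, x_true: (T, d) arrays of estimated and true states
            traces = np.array([np.trace(P) for P in P_history])
            errors = np.linalg.norm(x_est - x_true, axis=1)
            # The two minima need not coincide: the trace reflects the
            # model's assumed noise levels, the error reflects reality.
            return int(np.argmin(traces)), int(np.argmin(errors))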

    Thanks

    Emma
     
  5. Feb 18, 2014 #5

    D H

    Staff Emeritus
    Science Advisor

    You must be modeling something; how else could you use a Kalman filter? Plant noise is an essential part of any Kalman filter. Without it, the covariance will eventually collapse, making the filter ignore all inputs. If you have reasonable plant noise, then inputs of varying nature can naturally make the error covariance grow. For example, a Kalman filter whose ultimate outputs are position and velocity will exhibit a growing covariance matrix when the only inputs are acceleration. The filter needs occasional measurements of something other than acceleration to keep a firm grasp on the estimated state. A good velocity measurement will directly reduce the velocity uncertainties and indirectly reduce the position uncertainties, while a good position measurement will directly reduce the uncertainties in position and indirectly reduce the uncertainties in velocity.
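
    A two-state sketch of that behavior, with made-up numbers:

    Code (Python):

        import numpy as np

        # Propagating without measurements lets both variances grow; one
        # position measurement then shrinks the position variance directly
        # and the velocity variance through their correlation.
        dt = 1.0
        F = np.array([[1.0, dt], [0.0, 1.0]])      # constant-velocity model
        Q = 0.01 * np.array([[dt**3 / 3, dt**2 / 2],
                             [dt**2 / 2, dt]])     # plant noise
        H = np.array([[1.0, 0.0]])                 # position-only measurement
        R = np.array([[0.5]])

        P = np.eye(2)
        for _ in range(10):                        # dead reckoning
            P = F @ P @ F.T + Q                    # both variances grow
        print(np.diag(P))

        S = H @ P @ H.T + R                        # one position update
        K = P @ H.T @ np.linalg.inv(S)
        P = (np.eye(2) - K @ H) @ P
        print(np.diag(P))                          # both variances shrink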

    So in one sense, seeing a temporarily decreasing log likelihood (increasing covariance) is okay. What's not okay is making your system see something that isn't there, and you have to somehow assure yourself that this is not what is happening. There are at least two ways to make this mistake. One is overfitting: you need to have some model of what is perceptible so as to avoid the problem of trying to fit noise.

    Another problem is seeing something else. The perhaps apocryphal story of the neural net trained to detect enemy tanks comes to mind. This system supposedly identified terrain containing tanks, but what it actually identified was terrain containing trees. The training data comprised images in which tanks were present and other images in which tanks were not present. All of the images that contained tanks also contained trees. The tanks were hidden in plain sight amongst the trees. The images that didn't contain tanks were more or less treeless. The neural net learned to distinguish treeless from tree-ful images. It did not recognize tanks.
     
  6. Feb 18, 2014 #6
    Hi DH, when I say I'm not using a physical model, I mean the model isn't based on physical principles such as velocity and acceleration. The parameters of my model adapt, the KF is run again, and a new likelihood is obtained. At the moment, as soon as I get a decrease in the log likelihood I stop the algorithm. I'm using a feature extraction algorithm to circumvent fitting to noise.
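
    For reference, a sketch of a slightly gentler stopping rule than the one I use now (the tolerance and patience values are arbitrary):

    Code (Python):

        def stop_index(log_liks, tol=1e-6, patience=3):
            # Stop only after `patience` iterations without a meaningful
            # improvement, rather than at the first small dip.
            best, stale = float("-inf"), 0
            for i, ll in enumerate(log_liks):
                if ll > best + tol:
                    best, stale = ll, 0
                else:
                    stale += 1
                    if stale >= patience:
                        return i
            return len(log_liks) - 1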
     
  7. Feb 18, 2014 #7

    D H

    Staff Emeritus
    Science Advisor

    Fitting to noise is an ever-present problem. You need to have some knowledge regarding the limits of your data. It doesn't matter if you are guiding a spacecraft toward Mars or trying to predict the stock market. Your data are imperfect.
     