Adding X as an Instrumental Variable in GMM Estimation

In summary, the thread discusses GMM equation specification with intraday dummy variables used as instruments. The original poster asks whether it is legitimate to add another regressor X with its own moment condition, runs into a "near singular matrix" error from the orthogonality conditions, and tries several alternative models while seeking advice. Ultimately, OLS and GMM produce the same coefficient estimates, but the residuals do not satisfy the OLS assumptions.
  • #1
vienna_quant
Hello,

I have a question concerning the GMM equation specification.
Say we partition each day into 7 intraday intervals. We want to estimate the 7 intraday interval moments for a variable Y observed in those 7 intervals over a period of T days, meaning we have t = T*7 total observations of Y.
I first estimate the following model:

Y(t) = c + c(1)*dummy1 + c(2)*dummy2 + c(6)*dummy6 + c(7)*dummy7 + error(t)

(where c represents a constant and dummy1-dummy7 represent dummy variables taking the value 1 or 0, indicating whether the observation occurs in the corresponding period). I use dummy1, dummy2, dummy6, dummy7 as instrumental variables (in the moment conditions)

in order to see whether the observations in periods 1, 2, 6, 7 (in the morning and/or afternoon) are statistically different from the observations in intervals 3-5. The results will be very similar to the OLS regression results, except that GMM does not assume homoscedastic or serially uncorrelated errors.
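For concreteness, here is a minimal sketch of this setup in Python (numpy only; the day count, the simulated Y, and all names are stand-ins, not the actual data):

[code]
import numpy as np

rng = np.random.default_rng(0)
T = 100                                # number of days (placeholder)
n = 7 * T                              # total observations, one per intraday interval
period = np.tile(np.arange(1, 8), T)   # interval label 1..7 for each observation
Y = rng.normal(size=n)                 # stand-in for the observed series

# Dummy for interval k: 1 if the observation falls in interval k, else 0
D = np.column_stack([(period == k).astype(float) for k in (1, 2, 6, 7)])
X = np.column_stack([np.ones(n), D])   # constant + dummies 1, 2, 6, 7

# OLS benchmark: beta = (X'X)^{-1} X'y
beta_ols = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_ols)  # intercept = mean of intervals 3-5; c(k) = deviation of interval k
[/code]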

QUESTION: Is it legitimate to add another variable -- c(8)*X(t) -- (with the same number of observations t as Y) with its own moment condition, such that the equation takes the form:

Y(t) = c + c(1)*dummy1 + c(2)*dummy2 + c(6)*dummy6 + c(7)*dummy7 + c(8)*X(t) + error(t)

- With OLS this is clearly no problem, but with GMM? I am not sure whether one can use a moment -- c(8)*X(t) -- that enters the estimation for every Y(t) observation. When I enter X as an additional instrumental variable in the orthogonality conditions, I get an error message: near singular matrix ...
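For what it's worth, if X is used both as a regressor and as its own instrument, the model stays just-identified and the sample moments g(beta) = Z'(y - Z*beta)/n = 0 are exactly the OLS normal equations. A sketch with simulated stand-in data:

[code]
import numpy as np

rng = np.random.default_rng(0)
T, n = 100, 700
period = np.tile(np.arange(1, 8), T)
Y = rng.normal(size=n)
Xvar = rng.normal(size=n)              # stand-in for the extra regressor X(t)

D = np.column_stack([(period == k).astype(float) for k in (1, 2, 6, 7)])
Z = np.column_stack([np.ones(n), D, Xvar])   # instruments = regressors

# Setting the sample moments g(beta) = Z'(Y - Z beta)/n to zero reproduces
# the OLS normal equations, so the GMM point estimate coincides with OLS.
beta = np.linalg.solve(Z.T @ Z, Z.T @ Y)
print(np.max(np.abs(Z.T @ (Y - Z @ beta) / n)))   # ~ 0 up to rounding
[/code]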


thanks for advice
f.
 
  • #2
Do you get a similar message if you try an OLS package? Have you tried? It may be that your X is highly correlated with the dummies, or that you are running out of degrees of freedom (too few data points). One quick check is to regress X on the dummies and look at the R², as sketched below.
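A minimal sketch of that collinearity check (all data here is simulated; for real data, substitute the actual X and dummies):

[code]
import numpy as np

rng = np.random.default_rng(0)
n = 700
period = np.tile(np.arange(7), 100)
D = np.column_stack([(period == k).astype(float) for k in range(7)])
x = rng.normal(size=n)                 # stand-in for the candidate regressor X

# Regress X on a constant and the dummies; an R^2 near 1 means X is almost
# a linear combination of them, which makes Z'Z close to singular.
W = np.column_stack([np.ones(n), D[:, :6]])   # 6 dummies to avoid the trap
beta, *_ = np.linalg.lstsq(W, x, rcond=None)
r2 = 1 - ((x - W @ beta)**2).sum() / ((x - x.mean())**2).sum()
print(r2)
[/code]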
 
  • #3
Thanks for the quick answer!

I do not get the message with OLS -> the data set has more than 70,000 observations, so I don't think that's the problem.

I tried the following model:
Y(t) = c + c(1)*dummy1 + c(2)*dummy2 + c(3)*dummy3 + c(4)*dummy4 + c(5)*dummy5 + c(6)*dummy6 + c(7)*dummy7 + error(t)


The problem comes not from the estimation equation (as there is no problem with OLS) but from the orthogonality conditions I specified:
period1*error = 0
period2*error = 0
...
period7*error = 0

-> Therefore I have one orthogonality condition per estimated coefficient, a total of 7 parameters to estimate and 7 orthogonality conditions -> I get the "near singular matrix" error message (in EViews).
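One likely culprit: if a constant is also in the instrument list (some packages, EViews among them as far as I know, add one by default), the instruments contain a constant plus all 7 dummies. Since the dummies sum to 1 in every observation, the constant column is an exact linear combination of them, so Z has rank 7 rather than 8 and Z'Z cannot be inverted. A numpy illustration with stand-in data:

[code]
import numpy as np

T = 100
period = np.tile(np.arange(1, 8), T)
D = np.column_stack([(period == k).astype(float) for k in range(1, 8)])

Z_bad  = np.column_stack([np.ones(7 * T), D])        # constant + all 7 dummies
Z_good = np.column_stack([np.ones(7 * T), D[:, :6]]) # constant + 6 dummies

print(np.linalg.matrix_rank(Z_bad))    # 7, not 8: the dummies sum to the constant
print(np.linalg.matrix_rank(Z_good))   # 7: full column rank
print(np.linalg.cond(Z_bad.T @ Z_bad)) # huge/inf: "near singular matrix"
[/code]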

Interestingly, the following model is no problem:
Y(t) = c + c(1)*X(t) + error(t)

where X(t) spans the complete sample, meaning there is an X value for each observed Y.
Orthogonality condition:
X*error = 0

-> No problem estimating this model, even though here too there is one orthogonality condition per estimated coefficient.

thanks for advice
f.
 
  • #4
If you have 7 periods, can you specify 7 dummies and an intercept? Shouldn't there be 6 dummies, or no intercept? Although I cannot think why OLS wouldn't complain in that case, either.

If that's not it, then the GMM package's matrix inversion algorithm must be complaining about too many zeroes in the matrix, making it near-singular. I guess this can be a problem especially if the GMM routine's inversion method involves submatrices, some of which can be zero matrices because there are so many zeroes overall.

A colleague of mine wrote his Ph.D. thesis on this subject (inverting sparse matrices during estimation of regression equations), so I guess that this can be a problem for many people.

Another solution that I can think of is to play with precision limits. If the OLS package has a lower precision limit than the GMM package, it may not see a problem where the GMM routine, with its higher precision, does.
 
  • #5
Thanks for your answers so far; I tried some other forums, but no one could give any advice.


If you have 7 periods, can you specify 7 dummies and an intercept? Shouldn't there be 6 dummies, or no intercept?
--> 7 dummies and no intercept (with all dummies in the orthogonality conditions) returns "near singular matrix" as well --> 6 dummies and a constant is no problem!

--> After some trial-and-error testing I found out:

Interestingly, it is possible to specify the equation with 7 dummies and underspecified orthogonality conditions (6 conditions, say period1*error = 0 ... period6*error = 0). -> Is the underspecification a big deal? Results don't change a lot when adding/changing orthogonality conditions. I am really no expert in this field, but it seems as if the orthogonality conditions are not the right fit, as the residuals are not normally distributed (skewness 0.55, kurtosis -3.7, and a Jarque-Bera test with p-value 0.000). Actually it estimates exactly the same model as OLS: OLS estimation results in equal standard errors for all coefficients, while GMM results in slightly different standard errors but the same estimated coefficients as OLS. I am really puzzled.
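For reference, the Jarque-Bera statistic reported above is computed from the sample skewness and kurtosis; a minimal sketch (scipy.stats.jarque_bera offers a tested implementation):

[code]
import numpy as np

def jarque_bera_stat(x):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4), where S is the
    sample skewness and K the (raw) kurtosis. Chi-squared(2) under normality."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    s2 = (z**2).mean()
    S = (z**3).mean() / s2**1.5
    K = (z**4).mean() / s2**2
    return len(x) / 6.0 * (S**2 + (K - 3.0)**2 / 4.0)
[/code]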
I am not sure whether I should use my model (as it is for a PhD ...).

Thanks for the suggestions;
any advice is highly appreciated
f.
 
  • #6
Orthogonality conditions cannot guarantee normally distributed errors. OLS imposes orthogonality (by construction), but there is no guarantee that the actual residuals generated by OLS will be normal. I am not too familiar with GMM or EViews, but it sounds like you may have to explicitly specify model options that correct for non-normal errors. E.g., suppose the GMM errors are heteroscedastic: don't you need to call some kind of correction module (or subroutine) that corrects for heteroscedasticity? Or does EViews do this automatically?
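For reference, one standard correction along these lines (not EViews-specific) keeps the OLS coefficients but replaces the usual covariance matrix with White's heteroscedasticity-consistent "sandwich" estimator. A minimal sketch, assuming a design matrix X and OLS residuals e:

[code]
import numpy as np

def white_se(X, e):
    """Heteroscedasticity-consistent (White) standard errors for OLS.

    Sandwich: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}; the OLS coefficients
    are kept, only the covariance estimate changes.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * (e**2)[:, None])   # X' diag(e^2) X without forming diag
    cov = XtX_inv @ meat @ XtX_inv
    return np.sqrt(np.diag(cov))
[/code]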

Incidentally, do you need an ortho. cond. for your intercept as well?

That OLS produces the same coefficients as GMM does not surprise me. After all, the expected value of each dummy coefficient is [itex]\overline{y_i}[/itex] for the subsample indicated by that dummy (all i such that d_i = 1). But if the residuals violate the OLS assumptions (e.g. homoscedasticity), then OLS is inefficient and its conventional standard errors are no longer reliable.
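That claim is easy to check numerically: regressing on a full set of dummies (no intercept) returns exactly the subsample means. A sketch with simulated stand-in data:

[code]
import numpy as np

rng = np.random.default_rng(0)
period = np.tile(np.arange(1, 8), 100)
Y = rng.normal(loc=period, size=700)   # stand-in data with interval-level means

D = np.column_stack([(period == k).astype(float) for k in range(1, 8)])
beta = np.linalg.solve(D.T @ D, D.T @ Y)
means = np.array([Y[period == k].mean() for k in range(1, 8)])
print(np.allclose(beta, means))        # True: each coefficient is a subsample mean
[/code]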
 
  • #7
Totally correct: the parameter estimates are the mean values for each subperiod - at least this is what I expected for OLS. Interestingly, the OLS standard errors are smaller than the GMM errors.

As written in my first post, my model does not incorporate an intercept at all, and ideally it would be
Y(t) = c(1)*dummy1 + c(2)*dummy2 + c(3)*dummy3 + c(4)*dummy4 + c(5)*dummy5 + c(6)*dummy6
+ c(7)*dummy7 + c(8)*V(t) + error(t)

where Y(t) is a vector spanning all observations. The only problem remaining is that I can't use all periods in the orthogonality conditions. Nevertheless, it does not seem to make a big difference in the estimates when changing which parameters are used in the orthogonality conditions. I will just use 6 of the seven periods and additionally V(t) then.

One last question:
which tests can be used to compare the distributions of two different samples of equal size (not tests of means and standard deviations!)? I have used chi-squared so far but want to test more; I don't know of any other appropriate tests.
Thanks
f.
 
  • #8
vienna_quant said:
The only problem remaining is that I can't use all periods in the orthogonality conditions. Nevertheless, it does not seem to make a big difference in the estimates when changing which parameters are used in the orthogonality conditions. I will just use 6 of the seven periods and additionally V(t) then.
My guess is that there must be some kind of adding-up condition such that when you impose 6 orthogonality conditions, the 7th is automatically satisfied. You may want to research this.
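The adding-up condition is straightforward here: because dummy1 + ... + dummy7 = 1 for every observation, the seven dummy orthogonality conditions sum to the constant's orthogonality condition, so imposing six of them plus the constant makes the seventh redundant. A quick numerical check (stand-in residuals):

[code]
import numpy as np

rng = np.random.default_rng(0)
period = np.tile(np.arange(1, 8), 100)
D = np.column_stack([(period == k).astype(float) for k in range(1, 8)])
e = rng.normal(size=700)               # stand-in for the residual vector

# The sum over k of (dummy_k * error) equals (1 * error), the constant's
# condition, so the 7 dummy conditions plus the constant are dependent.
print(np.allclose((D.T @ e).sum(), e.sum()))   # True
[/code]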
One last question:
which tests can be used to compare the distributions of two different samples of equal size (not tests of means and standard deviations!)? I have used chi-squared so far but want to test more; I don't know of any other appropriate tests.
There are several non-parametric tests for assessing whether two samples come from the same distribution, for example the "runs" test. Suppose the two samples are [itex]u_1<...<u_n[/itex] and [itex]v_1<...<v_n[/itex], and suppose you "mix" (pool and sort) the samples. If the resulting mix looks something like [itex]u_1< v_1 < u_2 < u_3 < u_4 < v_2 < v_3 <[/itex] ... [itex] < u_{n-1} < v_{n-1} < v_n < u_n[/itex], then the chance that they come from the same distribution is greater than if they looked like [itex]u_1<...<u_n<v_1<...<v_n[/itex]. The latter example has a smaller number of runs (only two: first all u's, then all v's) than the former (at least seven runs: one u, one v, u's, v's, ..., u's, v's, one u). This and similar tests are usually described in standard probability textbooks such as Mood, Graybill and Boes; a sketch follows.
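A minimal sketch of that runs test (pool the samples, sort by value, count runs of sample labels, then apply the standard normal approximation to the run count; the function name and data are mine):

[code]
import numpy as np
from math import erf, sqrt

def runs_test(u, v):
    """Wald-Wolfowitz runs test: are two samples from the same distribution?

    Pools the samples, sorts by value, counts runs of same-sample labels,
    and compares the run count to its null mean via a normal approximation.
    """
    labels = np.concatenate([np.zeros(len(u)), np.ones(len(v))])
    order = np.argsort(np.concatenate([u, v]), kind="stable")
    labels = labels[order]
    runs = 1 + int(np.sum(labels[1:] != labels[:-1]))   # number of runs
    n1, n2 = len(u), len(v)
    N = n1 + n2
    mu = 2.0 * n1 * n2 / N + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - N) / (N**2 * (N - 1.0))
    z = (runs - mu) / sqrt(var)
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))  # two-sided p-value
    return runs, z, p

rng = np.random.default_rng(0)
print(runs_test(rng.normal(size=200), rng.normal(size=200)))  # expect a large p
[/code]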
 
  • #9
Thank you, EnumaElish, for taking the time!
this forum seems to be a really good place with friendly people!
keep on going like that!
best
f.
 

1. What is the Generalized Method of Moments (GMM) in statistics?

The Generalized Method of Moments (GMM) is a statistical method used to estimate the parameters of a statistical model. It generalizes the classical method of moments, which matches sample moments to the moments implied by the model. GMM applies even when there are more moment conditions available than parameters to estimate, which makes it a flexible method and, with a suitable weighting matrix, an efficient one.

2. How does GMM differ from other statistical methods?

GMM differs from other statistical methods in that it does not require the distribution of the data to be known. Instead, it relies on the moments of the data, which can be estimated from the data itself. This makes GMM a more flexible method that can be applied to a wider range of models and data sets.

3. What are the steps involved in applying GMM?

The first step in applying GMM is to specify a model and the moment conditions to be used for estimation. Next, the parameters are estimated by finding the values that minimize a quadratic form in the difference between the sample moments and the model-implied moments, typically using numerical optimization, as sketched below. Finally, the estimated parameters are used to make inferences about the model.
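A bare-bones illustration of these steps for a just-identified linear model, using an identity weighting matrix and simulated data (a sketch, not a full two-step GMM implementation):

[code]
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # regressors
Z = X                                                  # instruments (just-identified)
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)      # true parameters: 1 and 2

def gmm_objective(beta):
    g = Z.T @ (y - X @ beta) / n   # sample moment conditions g(beta)
    return g @ g                   # quadratic form, identity weighting matrix

res = minimize(gmm_objective, x0=np.zeros(2), method="BFGS")
print(res.x)                       # close to [1, 2]
[/code]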

4. What are the advantages of using GMM?

One of the main advantages of GMM is its flexibility. It can be applied to a wide range of models and data sets, and it does not require the full distribution of the data to be specified. GMM also uses all of the specified moment conditions for estimation, and with an optimal weighting matrix it is asymptotically efficient within the class of estimators based on those conditions.

5. What are some common applications of GMM?

GMM has various applications in economics, finance, and other fields. It is commonly used in econometrics to estimate parameters of models such as panel data models, time series models, and structural models. GMM is also commonly used in financial risk management, asset pricing, and forecasting. Additionally, GMM has been applied to problems in biology, physics, and engineering.
