An Illustration of the Koyck model


Although very useful in economics, a distributed-lag model with an undefined lag length (an infinite-lag model), in which we do not know how far back to go, poses serious estimation problems. Ad hoc estimation follows a sequential procedure: first regress Y_t on X_t, then regress Y_t on X_t and X_{t-1}, and so on, adding one further lag each time.
The significance of β_i (the coefficient of the newly added lagged variable) is tested each time, and the procedure stops when the coefficients of the lagged variables start becoming statistically insignificant. This approach, however, suffers from several drawbacks: there is no a priori guide to the maximum length of the lag; as one estimates successive lags, fewer degrees of freedom are left; successive lags tend to be highly correlated, which raises the problem of multicollinearity; and the sequential search for the lag length opens the researcher to the charge of data mining.
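
The sequential procedure described above can be sketched in Python. This is only an illustration on simulated data, not a recommended estimation strategy: the data-generating process (three true lag coefficients 0.8, 0.5, 0.3), the critical value of 2.0, and the helper names `ols_t_stats` and `sequential_lag_search` are all assumptions made for the example.

```python
import numpy as np

def ols_t_stats(y, X):
    """OLS coefficients, t-statistics and degrees of freedom (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return beta, beta / np.sqrt(np.diag(cov)), dof

def sequential_lag_search(y, x, max_lag, t_crit=2.0):
    """Add lags of x one at a time; stop when the newest lag is insignificant."""
    chosen, history = 0, []
    for p in range(max_lag + 1):
        rows = np.arange(p, len(y))            # each extra lag costs one observation
        X = np.column_stack([np.ones(len(rows))] + [x[rows - k] for k in range(p + 1)])
        _, t, dof = ols_t_stats(y[rows], X)
        history.append((p, dof, t[-1]))        # lag length, degrees of freedom, t of newest lag
        if abs(t[-1]) < t_crit:
            break                              # newest lag insignificant: stop searching
        chosen = p
    return chosen, history

# Simulated data (assumed for illustration): Y depends on X_t, X_{t-1}, X_{t-2} only.
rng = np.random.default_rng(42)
n = 200
x_full = rng.normal(size=n + 2)                # two presample values of X
x = x_full[2:]
y = (1.0 + 0.8 * x_full[2:] + 0.5 * x_full[1:-1] + 0.3 * x_full[:-2]
     + rng.normal(scale=0.5, size=n))

chosen, history = sequential_lag_search(y, x, max_lag=6)
for p, dof, t_last in history:
    print(f"lags={p}  dof={dof}  t(newest lag)={t_last:.2f}")
print("selected lag length:", chosen)
```

The printed degrees of freedom fall by two at every step (one observation lost, one parameter added), which is the second drawback listed above. Because the stopping rule is a repeated significance test, the selected lag length can overshoot the true one by chance, which is the data-mining concern.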

Koyck proposed a method for estimating distributed-lag models. The starting point is the infinite distributed-lag model:

Y_t = α + β_0 X_t + β_1 X_{t-1} + β_2 X_{t-2} + … + μ_t

Assuming that the β’s all have the same sign, Koyck supposes that they decline geometrically:

β_k = β_0 λ^k,    k = 0, 1, …,    0 < λ < 1

where λ is the rate of decline (decay) of the distributed lag and (1 - λ) is the speed of adjustment.

This equation postulates that each successive β coefficient is numerically smaller than the preceding one (since λ < 1), implying that as one goes back into the distant past, the effect of that lag on Y_t becomes progressively smaller. The closer λ is to 1, the slower the decline in β_k; the closer it is to 0, the more rapid the decline.
As a result, the infinite-lag model can be written as
Y_t = α + β_0 X_t + β_0 λ X_{t-1} + β_0 λ^2 X_{t-2} + … + μ_t        (1)
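
The geometric decline of the lag weights can be checked numerically. The short sketch below is illustrative only: the function name `koyck_weights` and the values of β_0 and λ are assumptions chosen for the example. It also prints β_0 / (1 - λ), the sum of the infinite geometric series of weights, which is the total effect of a sustained unit change in X.

```python
def koyck_weights(beta0, lam, n_lags):
    """Return beta_k = beta0 * lam**k for k = 0, ..., n_lags - 1."""
    if not 0 < lam < 1:
        raise ValueError("lam must lie strictly between 0 and 1")
    return [beta0 * lam ** k for k in range(n_lags)]

beta0 = 1.0                       # assumed impact coefficient
for lam in (0.3, 0.9):            # fast versus slow decline
    weights = koyck_weights(beta0, lam, 6)
    long_run = beta0 / (1 - lam)  # sum of the infinite geometric series
    print(f"lam={lam}: first weights {[round(w, 3) for w in weights]}, "
          f"sum of all weights {long_run:.3f}")
```

With λ = 0.3 the weights are negligible after a few lags, whereas with λ = 0.9 distant lags still carry substantial weight, in line with the statement that a λ close to 1 means a slow decline.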

The model cannot easily be estimated in this form: it still contains an infinite number of lagged regressors, and λ enters nonlinearly. The Koyck transformation therefore proceeds in the following steps.

First, lag (1) by one period to obtain

 Y_{t-1} = α + β_0 X_{t-1} + β_0 λ X_{t-2} + β_0 λ^2 X_{t-3} + … + μ_{t-1}        (2)

Then multiply (2) by λ to obtain

 λ Y_{t-1} = λα + β_0 λ X_{t-1} + β_0 λ^2 X_{t-2} + β_0 λ^3 X_{t-3} + … + λ μ_{t-1}        (3)

Subtracting (3) from (1), all the lagged X terms cancel and we obtain

 Y_t - λ Y_{t-1} = α(1 - λ) + β_0 X_t + (μ_t - λ μ_{t-1})

Rearranging,

Y_t = α(1 - λ) + β_0 X_t + λ Y_{t-1} + v_t,        where v_t = μ_t - λ μ_{t-1}

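We have therefore converted a distributed-lag model into an autoregressive model. After this transformation we now have only three unknowns (α, β_0, λ) to estimate instead of an infinite number of β’s, and, since the lagged X’s have been replaced by the single variable Y_{t-1}, there is no reason to expect multicollinearity. We may, however, have a problem of serial correlation, because v_t is a moving average of μ_t and μ_{t-1}.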
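
The transformation can be verified on simulated data. The sketch below is a minimal illustration under stated assumptions: the "true" values α = 2.0, β_0 = 1.5, λ = 0.6, the truncation of the infinite lag at K = 200 terms, and the small disturbance variance are all chosen for the example. It first checks that the transformed equation holds observation by observation, then estimates the autoregressive form by OLS. Because Y_{t-1} is correlated with v_t, OLS is not in general a consistent estimator here; the estimates come out close to the assumed values only because the disturbance is deliberately small.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta0, lam = 2.0, 1.5, 0.6       # assumed "true" parameters
n, K = 500, 200                          # sample size; truncation point of the infinite lag

x = rng.normal(size=n + K)               # K presample values so every Y_t has K lags of X
mu = rng.normal(scale=0.1, size=n + K)   # small disturbance (assumed)

weights = beta0 * lam ** np.arange(K + 1)            # beta_k = beta0 * lam**k
# Distributed-lag form: Y_t = alpha + sum_k beta_k X_{t-k} + mu_t
y = np.array([alpha + weights @ x[t - K:t + 1][::-1] + mu[t]
              for t in range(K, n + K)])
x, mu = x[K:], mu[K:]                    # align X and mu with Y

# Step 1: check Y_t - lam*Y_{t-1} = alpha*(1 - lam) + beta0*X_t + v_t
v = mu[1:] - lam * mu[:-1]
lhs = y[1:] - lam * y[:-1]
rhs = alpha * (1 - lam) + beta0 * x[1:] + v
max_gap = np.max(np.abs(lhs - rhs))      # only the truncated tail term remains
print("largest discrepancy:", max_gap)

# Step 2: OLS on the autoregressive form Y_t = c + beta0*X_t + lam*Y_{t-1} + v_t
Z = np.column_stack([np.ones(n - 1), x[1:], y[:-1]])
coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
beta0_hat, lam_hat = coef[1], coef[2]
alpha_hat = coef[0] / (1 - lam_hat)      # intercept estimates alpha*(1 - lam)
print(f"alpha={alpha_hat:.3f}  beta0={beta0_hat:.3f}  lambda={lam_hat:.3f}")
```

Note that α is not estimated directly: the regression intercept estimates α(1 - λ), so α has to be recovered by dividing by (1 - λ).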


Rational Expectations model


Until the advent of the rational expectations (RE) hypothesis, initially put forward by J. Muth and later propagated by Robert Lucas and Thomas Sargent, the adaptive expectations (AE) hypothesis was quite popular in empirical economics. The proponents of the RE hypothesis contend that the AE hypothesis is inadequate because it relies solely on the past values of a variable in formulating expectations, whereas the RE hypothesis assumes that “individual economic agents use current available and relevant information in forming their expectations and do not rely purely upon past experience.” In short, the RE hypothesis contends that expectations are ‘rational’ in the sense that they efficiently incorporate all information available at the time the expectation is formulated, and not just the past information (Gujarati, 2009, p. 631).
Under rational expectations it is assumed that individuals possess complete knowledge of how the economic system functions and, together with all the available information, put this knowledge to the best possible use when forming their expectations.
X*_t = E(X_{t+1} | I_t)
X*_t is the expectation, formed at time t, of X at time t+1. It is conditional on the information available at time t (I_t).
The actual value of X may turn out to differ from what was expected; that is, agents may make a prediction error:
  X_{t+1} = X*_t + u_t    or    X_{t+1} - X*_t = u_t
where u_t is the expectation error.
E(u_t) = 0: the expected value of the prediction error is zero. This means that under the rational expectations hypothesis individuals' forecasts are assumed to be correct on average.
Cov(X*_t, u_t) = 0: the prediction error must also be uncorrelated with any information available at the time the prediction is made; otherwise the forecast would not have made use of all available information. Since all available information is summed up in the variable X*_t, u_t must be uncorrelated with X*_t.
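
These two properties can be illustrated with a small simulation. The sketch below rests on assumptions made purely for the example: X follows an AR(1) process X_{t+1} = ρ X_t + ε_{t+1} with ρ = 0.8, agents know ρ, and the information set I_t is the current value X_t, so that the rational forecast is X*_t = ρ X_t. For contrast it also computes a "naive" forecast X*_t = X_t, which ignores the known mean reversion and therefore does not use all available information.

```python
import random

random.seed(1)
rho, n = 0.8, 20000          # assumed AR(1) coefficient and sample size

# Simulate X_{t+1} = rho * X_t + eps_{t+1}
x = [0.0]
for _ in range(n):
    x.append(rho * x[-1] + random.gauss(0.0, 1.0))

def mean(values):
    return sum(values) / len(values)

def corr(a, b):
    """Sample correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

# Rational forecast X*_t = E(X_{t+1} | I_t) and its error u_t = X_{t+1} - X*_t
rational = [rho * xt for xt in x[:-1]]
u_rational = [x[t + 1] - rational[t] for t in range(n)]

# Naive forecast X*_t = X_t (does not exploit the known value of rho)
naive = x[:-1]
u_naive = [x[t + 1] - naive[t] for t in range(n)]

print("rational: mean error", round(mean(u_rational), 4),
      " corr(forecast, error)", round(corr(rational, u_rational), 4))
print("naive:    mean error", round(mean(u_naive), 4),
      " corr(forecast, error)", round(corr(naive, u_naive), 4))
```

In this setup the rational forecast error has a sample mean and a correlation with the forecast that are both close to zero, while the naive forecast error is clearly negatively correlated with the forecast. The naive error is therefore partly predictable from information available at time t, which is what the condition Cov(X*_t, u_t) = 0 rules out.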