Autocorrelation (Econometrics)

For time series data, autocorrelation is correlation between observations of a variable at different points in time; for cross-sectional data, it is correlation between observations at different points in space. In the regression context, the classical linear regression model (CLRM) assumes that such correlation does not exist in the disturbances u_i. This can be written as
E(u_i u_j) = 0,  i ≠ j

The classical model thus assumes that the disturbance of one observation is not influenced by the disturbance of any other observation. But if instead

E(u_i u_j) ≠ 0,  i ≠ j

then the disturbance in one observation can be affected by the disturbance in other observations.
When the autocorrelation is calculated, the resulting number falls between +1 and -1. An autocorrelation of +1 means perfect positive correlation: an increase in one time series results in a proportionate increase in the other time series. An autocorrelation of -1 means perfect negative correlation: an increase in one time series results in a proportionate decrease in the other.

This value can be put to practical use in security analysis. For example:
“if you know a stock historically has a high positive autocorrelation value and you witnessed the stock making solid gains over the past several days, you might reasonably expect the movements over the upcoming several days (the leading time series) to match those of the lagging time series and to move upwards.”
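To make the computation concrete, here is a minimal Python sketch of one common way to compute a lag-1 autocorrelation coefficient. The function name lag1_autocorrelation and the simulated price series are illustrative assumptions, not part of the original text.

```python
import numpy as np

def lag1_autocorrelation(x):
    # Correlation of the series with itself shifted by one period;
    # the result always lies between -1 and +1.
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-1], x[1:])[0, 1]

# A trending series (cumulative sum of drifting noise) shows strong
# positive autocorrelation, close to +1.
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0.1, 1.0, 200))
print(lag1_autocorrelation(prices))
```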
Using OLS in the presence of autocorrelation has consequences: the OLS estimators remain linear and unbiased, as well as "consistent" and "asymptotically" normally distributed, but they are no longer efficient. The same thing happens under heteroscedasticity.
There are four methods to detect autocorrelation:
Graphical Method
The Runs Test
Durbin-Watson d test
A general test for autocorrelation: the Breusch-Godfrey (BG) test
Let's discuss each one separately.
The first is the graphical method: "the assumption of non-autocorrelation of the classical model relates to the population disturbances u_t", which are not directly observable; instead we have their proxies, the residuals û_t, which we obtain by the usual OLS procedure.

There are many ways to examine the residuals. The easiest is to plot them against time, which gives the "time sequence plot". We can also plot the "standardized residuals" against time. The standardized residuals are not the raw residuals: they are the residuals divided by the standard error of the estimate, û_t / σ̂. Their values are pure numbers, so they can be compared with the standardized residuals of another regression. A minimal plotting sketch follows.
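As an illustration of the graphical method, here is a hedged Python sketch using numpy, statsmodels, and matplotlib. The simulated data and the AR(1) coefficient 0.8 are assumptions for demonstration; the point is only to show an OLS fit followed by a time sequence plot of the standardized residuals.

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Simulate a regression whose disturbances follow an AR(1) scheme,
# so the residual plot shows long runs of same-signed values.
rng = np.random.default_rng(1)
n = 100
x = np.linspace(0.0, 10.0, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.normal()
y = 2.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()

# Standardized residuals: residuals divided by the standard error of
# the estimate, so they are pure numbers comparable across regressions.
std_resid = res.resid / np.sqrt(res.mse_resid)

plt.plot(std_resid, marker="o", linestyle="-")
plt.axhline(0.0, color="grey")
plt.xlabel("time")
plt.ylabel("standardized residual")
plt.title("Time sequence plot")
plt.show()
```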
The second is the "runs test", sometimes called the Geary test, which is a nonparametric test: "a test in which we make no assumption about the distribution from which the observations are drawn." The residuals are divided into positive and negative observations, so the data contain several residuals that are positive and several that are negative. A run is defined as "an uninterrupted sequence of one symbol or attribute." For example, if we observe four positive residuals in a row, they form one run; a following six negative residuals form a second run; and a further seven positive residuals form a third run, which is not merged with the earlier positive run. The runs are written like this:
(++++) (------) (+++++++)
The length of a run is the number of elements in it. In this example we have three runs: the first contains four pluses, the second six minuses, and the third seven pluses.
"One can derive a test of randomness of runs by examining how runs behave in a strictly random sequence of observations." We must look at the number of runs we have; in the previous example there are three runs in seventeen observations. We ask whether this is too many or too few compared with the number of runs expected in a strictly random sequence of seventeen observations. Too many runs means the residuals change sign frequently, which indicates negative serial correlation; too few runs may suggest positive autocorrelation.
Some notation makes things easier:
N: total number of observations = N1 + N2
N1: number of plus residuals
N2: number of minus residuals
R: number of runs
Under the null hypothesis that successive outcomes are independent, and assuming that N1 > 10 and N2 > 10, the number of runs is asymptotically normally distributed with:
Mean: E(R) = \frac{2 N_1 N_2}{N} + 1

Variance: \sigma_R^2 = \frac{2 N_1 N_2 (2 N_1 N_2 - N)}{N^2 (N - 1)}
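Since R is approximately normal under these assumptions, a z-statistic (R - E(R)) / σ_R can be compared with standard normal critical values. Here is a minimal Python sketch of that computation; the helper name runs_test is hypothetical, and numpy and scipy are assumed to be available.

```python
import numpy as np
from scipy.stats import norm

def runs_test(residuals):
    # Keep only the signs of the residuals (exact zeros are dropped).
    signs = np.sign(residuals)
    signs = signs[signs != 0]
    n1 = int(np.sum(signs > 0))   # N1: number of plus residuals
    n2 = int(np.sum(signs < 0))   # N2: number of minus residuals
    n = n1 + n2                   # N = N1 + N2
    r = 1 + int(np.sum(signs[1:] != signs[:-1]))  # R: number of runs
    mean_r = 2.0 * n1 * n2 / n + 1.0
    var_r = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n**2 * (n - 1.0))
    z = (r - mean_r) / np.sqrt(var_r)
    p_value = 2.0 * (1.0 - norm.cdf(abs(z)))  # two-sided normal p-value
    return r, mean_r, z, p_value
```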
The third is the Durbin-Watson d test. It is the best-known test for detecting serial correlation, and it is defined by the following equation:
d = \frac{\sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{n} \hat{u}_t^2}

This equation is "the ratio of the sum of squared differences in successive residuals to the RSS". Note that the number of observations in the numerator of the "d statistic" is n - 1, because one observation is lost in taking successive differences.
A great advantage of the d statistic is that it is "based on the estimated residuals, which are routinely computed in regression analysis."
There are six assumptions underlying the d statistic:
"The regression model includes the intercept term." If it is not present, as in the case of regression through the origin, it is essential to rerun the regression including the intercept term to obtain the RSS.
"The explanatory variables are nonstochastic, or fixed in repeated sampling."
"The disturbances are generated by the first-order autoregressive scheme" u_t = ρu_{t-1} + ε_t; therefore the test cannot be used to detect higher-order autoregressive schemes.
"The error term u_t is assumed to be normally distributed."
"The regression model does not include the lagged values of the dependent variable as one of the explanatory variables."
"There are no missing observations in the data."
The d statistic lies on a line running from 0 to 4, with the following decision zones:
From 0 to dL: reject H0; evidence of positive autocorrelation.
From dL to dU: zone of indecision.
From dU to 4 - dU: do not reject H0 or H0* (or both).
From 4 - dU to 4 - dL: zone of indecision.
From 4 - dL to 4: reject H0*; evidence of negative autocorrelation.
Here H0 is the hypothesis of no positive autocorrelation and H0* is the hypothesis of no negative autocorrelation.
Some useful quantities can now be defined. First, the estimated first-order autocorrelation coefficient of the residuals:

\hat{\rho} = \frac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2}
Using the previous equation, we can say that

d \approx 2(1 - \hat{\rho})
Since -1 ≤ ρ ≤ 1, this implies 0 ≤ d ≤ 4: any estimated d value must lie within these limits. If ρ̂ = 0, then d = 2; this is the case of no serial correlation, so if d is about 2 in an application we may assume there is no first-order autocorrelation, either positive or negative. If ρ̂ = +1, indicating perfect positive correlation in the residuals, then d ≈ 0; the closer d is to zero, the greater the evidence of positive serial correlation. This makes sense: under positive autocorrelation the residuals are bunched together and their successive differences tend to be small, so the numerator sum of squares is small compared with the denominator sum of squares. If ρ̂ = -1, there is perfect negative correlation among successive residuals and d ≈ 4; the closer d is to 4, the greater the evidence of negative serial correlation.

Once the assumptions listed above hold, the procedure of the Durbin-Watson test is as follows:
"Run the OLS regression and obtain the residuals."
"Calculate d" from the formula given above (a step most regression packages perform automatically).
"For the given sample size and the given number of explanatory variables, find out the critical dL and dU values."
A sketch of these steps in Python follows.
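Here is a hedged Python sketch of the procedure using the durbin_watson helper from statsmodels on simulated data (the data-generating process with AR(1) coefficient 0.6 is an assumption for demonstration). It also checks the approximation d ≈ 2(1 - ρ̂) from above.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Step 1: run the OLS regression and obtain the residuals.
rng = np.random.default_rng(2)
n = 100
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()  # AR(1) disturbances
y = 1.0 + 2.0 * x + u
res = sm.OLS(y, sm.add_constant(x)).fit()

# Step 2: calculate d; statsmodels performs the summations.
d = durbin_watson(res.resid)

# Check the approximation d ~ 2(1 - rho_hat).
rho_hat = np.sum(res.resid[1:] * res.resid[:-1]) / np.sum(res.resid**2)
print(d, 2.0 * (1.0 - rho_hat))

# Step 3: compare d with the tabulated dL and dU for n = 100 and one
# explanatory variable (critical values come from the DW tables).
```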
The fourth and last is the Breusch-Godfrey (BG) test. It is designed to avoid the pitfalls of the Durbin-Watson test of autocorrelation: it allows for "nonstochastic regressors, such as the lagged values of the regressand, and higher-order autoregressive schemes." Consider the model

Y_t = β1 + β2 X_t + u_t

and assume that the error term u_t follows the p-th order autoregressive scheme

u_t = ρ1 u_{t-1} + ρ2 u_{t-2} + … + ρp u_{t-p} + ε_t

The H0 to be tested is ρ1 = ρ2 = … = ρp = 0.
The steps of the Breusch-Godfrey test are as follows (see the sketch after this list):
"Estimate Y_t = β1 + β2 X_t + u_t by OLS and obtain the residuals" û_t.
Regress û_t on the original X_t and on the lagged residuals û_{t-1}, û_{t-2}, …, û_{t-p}, and obtain the R² of this auxiliary regression.
"If the sample size is large, Breusch and Godfrey have shown that (n - p)R² ~ χ²_p," so if the statistic exceeds the critical chi-square value we reject the null hypothesis of no autocorrelation.
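A minimal sketch of the BG test, using the acorr_breusch_godfrey helper from statsmodels on simulated data; the AR(1) coefficient 0.5 and the choice nlags = 2 are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.normal()  # AR(1) disturbances
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()

# nlags = p, the order of the autoregressive scheme under the null.
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=2)
print(lm_stat, lm_pvalue)  # a small p-value rejects H0 of no autocorrelation
```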
There are many tests for autocorrelation because "no particular test has yet been judged to be unequivocally best, and thus the analysts are still in the unenviable position of considering a varied collection of test procedures for detecting the presence or structure, or both, of autocorrelation."
When we find autocorrelation, there are several steps to take. First, try to find out whether the autocorrelation is "pure autocorrelation" and not the result of mis-specification of the model. Second, if it is pure autocorrelation, an appropriate transformation of the original model can be used, because the transformed model does not have the problem of pure autocorrelation. Third, "in a large sample we can use the Newey-West method to obtain standard errors of OLS estimators that are corrected for autocorrelation." Fourth and last, "in some situations we can continue to use the OLS method." A sketch of the Newey-West correction follows.
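Here is a minimal sketch of the Newey-West correction, assuming statsmodels: the cov_type="HAC" option implements Newey-West (heteroscedasticity- and autocorrelation-consistent) standard errors, and the lag choice maxlags = 4 and the simulated data are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.normal()  # AR(1) disturbances
y = 1.0 + 2.0 * x + u

# The OLS point estimates are unchanged; only the standard errors are
# replaced by HAC (Newey-West) standard errors robust to autocorrelation.
res_nw = sm.OLS(y, sm.add_constant(x)).fit(cov_type="HAC",
                                           cov_kwds={"maxlags": 4})
print(res_nw.bse)  # autocorrelation-corrected standard errors
```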