Autocorrelation in time series has been studied since the late seventies (Bono & Arnau, 1996, 2000; Busk & Marascuilo, 1988; Escudero & Vallejo, 2000; Glass, Willson, & Gottman, 1975; Gorsuch, 1983; Hartman et al., 1980; Huitema, 1985, 1988; Jones, Vaught, & Weinrott, 1977; Rosel, Jara, & Oliver, 1999; Sharpley & Alavosius, 1988; Suen, 1987; Suen & Ary, 1987; Vallejo, 1994, among others). In the nineties, work turned to correcting the bias that the conventional autocorrelation estimator r_{1} generates in short time series. Matyas and Greenwood (1991) verified that the difference between the mean of the conventional estimator r_{1} and the value of the autocorrelation parameter ρ_{1} grows as the time series shortens, and they concluded that the relationship between ρ_{1} and the mean r_{1} is nonlinear. The closest agreement between the calculated mean r_{1} and ρ_{1} occurred when ρ_{1} was around 0.2 or 0.3. Beyond these values, as ρ_{1} increased positively, the calculated mean r_{1} underestimated ρ_{1} with a negative bias, whereas as ρ_{1} increased negatively, the calculated mean r_{1} overestimated ρ_{1} (underestimating it in absolute value), with a positive bias. From this simulation study it follows that, with small sample sizes, the r_{1} values are biased.
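This underestimation is straightforward to reproduce. The following is a minimal Python sketch (an illustration added here, not part of the original MATLAB program; function names are illustrative) that simulates short lag-one autoregressive series and compares the mean of the conventional estimator with ρ_{1}:

```python
import numpy as np

def ar1_series(rho, n, burn=30, rng=None):
    """Simulate a lag-one autoregressive series; drop `burn` warm-up values."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.empty(burn + n)
    # Start from the stationary distribution: mean 0, variance 1/(1 - rho^2).
    y[0] = rng.normal(0.0, np.sqrt(1.0 / (1.0 - rho**2)))
    for t in range(1, burn + n):
        y[t] = rho * y[t - 1] + rng.normal()
    return y[burn:]

def r1(y):
    """Conventional lag-one autocorrelation estimator."""
    d = y - y.mean()
    return np.sum(d[:-1] * d[1:]) / np.sum(d**2)

rng = np.random.default_rng(0)
rho, n = 0.6, 10
estimates = [r1(ar1_series(rho, n, rng=rng)) for _ in range(5000)]
# For such a short series the mean estimate falls well below rho.
print(np.mean(estimates))
```

With n = 10 and ρ_{1} = 0.6, the mean of the estimates comes out substantially below 0.6, illustrating the negative bias described above.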
Huitema and McKean (1991) proposed the modified estimator r_{1}+, which corrects the empirical bias for positive values of the autocorrelation parameter by adding 1/n to r_{1}. Arnau (1999) and Arnau and Bono (2001) proposed the estimator r_{1}′, which corrects r_{1} by means of a fitting model obtained from the polynomial function of the bias for each sample size. The estimator r_{1}′ reduces the bias for small sample sizes, for both positive and negative values of ρ_{1}.
This article describes a computer program written in MATLAB, Version 5.2 (1998). The program, called Bias, calculates the empirical bias in r_{1}, r_{1}+, and r_{1}′ by Monte Carlo simulation. It requires the specification of the parameter ρ_{1} and the sample size (n). The structure of the program is as follows (see the Appendix): simulation of lag-one autoregressive processes, calculation of the estimators r_{1} and r_{1}+, polynomial models for different sample sizes, calculation of the estimator r_{1}′, and estimation of the empirical bias in r_{1}, r_{1}+, and r_{1}′. Bias runs under the Windows 3.1 or later operating system on IBM PC compatibles with a 486 or later processor and 8 MB of extended memory.
First, the program simulates first-order autoregressive processes. The randn function (Forsythe, Malcolm, & Moler, 1977) is used to generate observations from a lag-one autoregressive process:
Y_{t} = ρ_{1}Y_{t−1} + e_{t} (1)
where Y_{t} is the score on the response measure at time t, ρ_{1} is the autocorrelation parameter, and e_{t} is a random normal variate with a mean of zero and a variance of one. Each time series starts with a normal variate (Y_{0}) having mean zero and variance 1/(1 − ρ^{2}_{1}). For each combination of n and ρ_{1}, 10,000 samples are generated. The first 30 observations of each series are dropped to eliminate any dependence on the starting value (DeCarlo & Tryon, 1993; Huitema & McKean, 1991, 1994a, 1994b, 1994c). After this, the program calculates the estimators r_{1} (equation 2) and r_{1}+ (equation 3) for each simulated time series. The conventional estimator r_{1} is defined by the following equation:
r_{1} = Σ^{n−1}_{t=1} (Y_{t} − Ȳ)(Y_{t+1} − Ȳ) / Σ^{n}_{t=1} (Y_{t} − Ȳ)² (2)
where Y_{t} and Y_{t+1} are the data obtained in the intervals t and t+1, and Ȳ = (1/n) Σ^{n}_{t=1} Y_{t} is the mean of the series.
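Equation 2 translates directly into vectorized code. A hedged Python rendering (added for illustration; the original program computes this in MATLAB, as shown in the Appendix):

```python
import numpy as np

def conventional_r1(y):
    """r1 per equation 2: n - 1 cross-product terms over n squared deviations."""
    y = np.asarray(y, dtype=float)
    n = y.size
    y_mean = y.mean()
    numerator = np.sum((y[:n - 1] - y_mean) * (y[1:] - y_mean))
    denominator = np.sum((y - y_mean) ** 2)
    return numerator / denominator

# A short illustrative series of n = 6 observations.
y = np.array([0.2, -0.1, 0.4, 0.3, -0.2, 0.1])
print(conventional_r1(y))
```

Note that the mean Ȳ is taken over all n observations, while the numerator pairs only the first n − 1 values with their successors.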
The estimator r_{1} is biased with small sample sizes because the numerator includes only n − 1 terms instead of n. Huitema and McKean (1991) proposed the modified estimator r_{1}+, which corrects, to some extent, the negative bias generated by positive autocorrelations:
r_{1}+ = r_{1} + (1/n) (3)
The empirical bias function of the conventional estimator r_{1} deviates slightly from linearity (Arnau, 1999; Arnau & Bono, 2001; DeCarlo & Tryon, 1993; Huitema & McKean, 1991, 1994a, 1994b). Models with significant coefficients of the empirical bias function for n = 6, 10, 20, and 30 were derived by polynomial fitting. Within the MATLAB environment, another program was designed that calculates the empirical bias of the estimator r_{1} as a function of ρ_{1} and n by Monte Carlo simulation (Arnau, 1999; Arnau & Bono, 2001). The polytool function of the MATLAB Statistics Toolbox was used for the polynomial fitting. With the polynomial coefficients significant at the 5% level (Table 1), the following fitting models based on n were derived and introduced into the Bias program.
Model_{n=6} = 0.1648 + 0.5643r_{1} + 0.0916r^{2}_{1} (4)
Model_{n=10} = 0.0972 + 0.3760r_{1} + 0.0676r^{2}_{1} (5)
Model_{n=20} = 0.0482 + 0.2028r_{1} + 0.0333r^{2}_{1} (6)
Model_{n=30} = 0.0373 + 0.1360r_{1} (7)
For sample sizes other than those studied in this paper, the empirical biases of the conventional estimator must first be calculated by simulation. Once the corresponding empirical biases have been found, the polynomial fitting is carried out with the polytool function, specifying the ρ_{1} values, the empirical biases previously obtained, and the degree of the polynomial fit. In this way, the significant coefficients are obtained. These must then be included in the "polynomial models for different sample sizes" section of the program, indicating the value of n.
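The same fitting step can be sketched outside MATLAB. A hedged Python analogue (numpy.polyfit standing in for polytool; the sample size n = 8 and all numeric results here are illustrative, not published values):

```python
import numpy as np

def empirical_bias(rho, n, reps=1000, burn=30, rng=None):
    """Mean of r1 over `reps` simulated AR(1) series of length n, minus rho."""
    rng = np.random.default_rng() if rng is None else rng
    est = np.empty(reps)
    for i in range(reps):
        y = np.empty(burn + n)
        y[0] = rng.normal(0.0, np.sqrt(1.0 / (1.0 - rho**2)))
        for t in range(1, burn + n):
            y[t] = rho * y[t - 1] + rng.normal()
        y = y[burn:]
        d = y - y.mean()
        est[i] = np.sum(d[:-1] * d[1:]) / np.sum(d**2)
    return est.mean() - rho

rng = np.random.default_rng(1)
rhos = np.linspace(-0.8, 0.8, 9)
biases = [empirical_bias(r, n=8, rng=rng) for r in rhos]
# Quadratic fit of the bias; numpy.polyfit returns highest power first.
coefs = np.polyfit(rhos, biases, deg=2)
print(coefs)
```

The constant term of the fitted bias should come out near −1/n, consistent with the well-known expectation of r_{1} when ρ_{1} = 0, and the linear term should be negative.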
Using equations 4–7, the Bias program calculates the estimator r_{1}′ by adding the corresponding polynomial model to r_{1}:
r_{1}′ = r_{1} + Fitting Model (8)
Finally, over the simulated samples, the program calculates the empirical bias in r_{1}, r_{1}+, and r_{1}′, that is, the difference between the mean value of each estimator and the value of the parameter ρ_{1}.
The Bias program only calculates r_{1}′ for four sample sizes (n = 6, 10, 20, and 30). For other values of n, the polynomial fitting of the bias must be performed first in order to determine the corresponding correction model and introduce it into the program.
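As a compact illustration of the correction step, here is a hedged Python sketch applying the fitting models for the four supported sample sizes; the coefficients and the plain addition mirror the program listing in the Appendix, and the function name is illustrative:

```python
def r1_prime(r1, n):
    """Corrected estimator: r1 plus the polynomial fitting model for size n."""
    models = {
        6:  lambda r: 0.1648 + 0.5643 * r + 0.0916 * r**2,
        10: lambda r: 0.0972 + 0.3760 * r + 0.0676 * r**2,
        20: lambda r: 0.0482 + 0.2028 * r + 0.0333 * r**2,
        30: lambda r: 0.0373 + 0.1360 * r,
    }
    if n not in models:
        raise ValueError("no fitted model for this n; fit one by simulation first")
    return r1 + models[n](r1)

# A conventional estimate of 0.30 from a series of length 10 is pushed upward,
# compensating the negative bias of r1 for positive autocorrelation.
print(r1_prime(0.30, 10))
```

For sample sizes outside the four fitted values, the function refuses to guess, matching the program's requirement that a new model be fitted first.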
Running the program
The Bias program is a function M-file; its input variables, the autoregressive parameter (rho) and the sample size (n), must be supplied in the MATLAB workspace. As can be seen in the Appendix, the first line of the function M-file specifies the file name (bias), the input variable names (rho and n), and the output variable names (r1_bias, r1_plus_bias, and r1_prime_bias). The following expression must be entered in the MATLAB workspace in order to run the program: [r1_bias, r1_plus_bias, r1_prime_bias] = bias(rho,n). MATLAB executes the commands in the Bias program, and the results are shown in the MATLAB workspace.
Availability
Interested users who prefer not to type the program can request a file of the program by e-mailing the second author (rbono@psi.ub.es). The program was designed with Version 5.2 of MATLAB, but it also works with earlier versions.
Acknowledgements
The research was supported by Grant PS95-0228 from the General Directorate of Scientific and Technical Research, by resolution of the Secretary of State for Universities and Research of the Ministry of Education and Culture (Spain).
APPENDIX
Program Listing
function [r1_bias, r1_plus_bias, r1_prime_bias]=bias(rho,n)
% ****************************************************
% LAG-ONE AUTOREGRESSIVE PROCESSES SIMULATION
% ****************************************************
% Number of simulations
for s=1:10000
% Number of observations
obs=30+n;
% Variance
variance=1/(1-rho^2);
% Standard deviation
sd=sqrt(variance);
% Series with mean 0 and fixed standard deviation
y0=normrnd(0,sd,obs,1);
% Random series with mean 0 and variance 1
e=randn(obs,1);
% First value of the series
y(1,1)=y0(1)+e(1);
% Next values of the series
for t=2:obs
y(t,1)=rho*y(t-1,1)+e(t);
end
% Elimination of the first 30 observations
y=y(31:obs);
% **************************************************
% CALCULATION OF r1
% **************************************************
y_mean=mean(y);
denominator=(y-y_mean).^2;
denominator_sum=sum(denominator);
y_lag=y(2:n);
y=y(1:n-1);
y_lag_diff=y_lag-y_mean;
y_diff=yy_mean;
numerator=y_lag_diff.*y_diff;
numerator_sum=sum(numerator);
r1(s,1)=numerator_sum/denominator_sum;
end
% **************************************************
% CALCULATION OF r1+
% **************************************************
r1_plus=r1+(1/n);
% ****************************************************
% POLYNOMIAL MODELS FOR DIFFERENT SAMPLE SIZES
% ****************************************************
if n==6
model = [(0.1648)+r1.*0.5643+(r1.^2).*0.0916];
elseif n==10
model = [(0.0972)+r1.*0.3760+(r1.^2).*0.0676];
elseif n==20
model = [(0.0482)+r1.*0.2028+(r1.^2).*0.0333];
elseif n==30
model = [(0.0373)+r1.*0.1360];
end
% **************************************************
% CALCULATION OF r1'
% **************************************************
r1_prime=r1+model;
% **************************************************
% EMPIRICAL BIAS IN r1, r1+ AND r1'
% **************************************************
r1_bias=mean(r1)-rho;
r1_plus_bias=mean(r1_plus)-rho;
r1_prime_bias=mean(r1_prime)-rho;
