Psicothema

Psicothema, 2002. Vol. Vol. 14 (nº 3). 669-672

A PROGRAM TO CALCULATE THE EMPIRICAL BIAS IN AUTOCORRELATION ESTIMATORS

Jaume Arnau y Roser Bono

University of Barcelona

The statistical analysis of short time series designs is influenced by the presence of serial dependence. Therefore, it is important to correctly estimate the first-order autocorrelation in behavioral data. The empirical bias is an indicator generally used to evaluate the adequacy degree of an estimator. This paper presents the Bias program, a Monte Carlo simulation program that generates first-order autoregressive processes and calculates the bias in three autocorrelation estimators (r₁, r₁+ and r₁’) for different values of the lag-one autocorrelation parameter and sample sizes. The program has been designed with MATLAB programming language and it runs on IBM-PC compatible computers with a 486 or later processor.

Programa para el cálculo del sesgo empírico de estimadores de autocorrelación. El análisis estadístico de los diseños de series temporales cortas está influido por la presencia de dependencia serial. De ahí la importancia de estimar correctamente la autocorrelación de primer orden en datos conductuales. El sesgo empírico es un indicador generalmente usado para evaluar el grado de precisión de un estimador. Este artículo presenta el programa Bias, un programa de simulación Monte Carlo que genera procesos autorregresivos de primer orden y calcula el sesgo de tres estimadores de autocorrelación (r₁, r₁+ y r₁’) para diferentes valores del parámetro de autocorrelación de retardo uno y tamaños muestrales. El programa ha sido diseñado mediante el lenguaje de programación MATLAB y funciona en ordenadores IBM-PC con un procesador 486 o superior.

The study of autocorrelation in time series was considered from the end of the seventies (Bono & Arnau, 1996, 2000; Busk & Marascuilo, 1988; Escudero & Vallejo, 2000; Glass, Willson, & Gottman, 1975; Gorsuch, 1983; Hartman et al., 1980; Huitema, 1985, 1988; Jones, Vaught, & Weinrott, 1977; Rosel, Jara, & Oliver, 1999; Sharpley & Alavosius, 1988; Suen, 1987; Suen & Ary, 1987; Vallejo, 1994, among others). In the nineties, the works were oriented towards the correction of the bias generated in short time-series by the conventional autocorrelation estimator r₁. Matyas and Greenwood (1991) verified that the difference between the mean of the conventional estimator r₁ and the value of the autocorrelation parameter ρ₁ increases with short time series. These authors concluded that the relationship between ρ₁ and the mean r₁ was nonlinear. The greatest coincidence between calculated mean r₁ and ρ₁ occurred when ρ₁ was around -0.2 or -0.3. According to these values, when ρ₁ increased positively, the calculated mean r₁ underestimated r₁ with a negative bias, while when the ρ₁ increased negatively, the calculated mean r₁ underestimated ρ₁, but with a positive bias. Based on this simulation study, it was deduced that when small sample sizes were used, the r₁ values were biased. Huitema and McKean (1991) proposed the modified estimator r₁+. This estimator corrects the empirical bias for positive values of the autocorrelation parameter by adding 1/n to r₁. Arnau (1999) and Arnau and Bono (2001) proposed the estimator r₁’ which consists of the correction of r₁ by the absolute value of a fitting model. This model is obtained from the polynomial function of the bias for each sample size. The estimator r₁’ reduces the bias for small sample sizes, both for positive as well as negative values of ρ₁.

This article describes a computer program using MATLAB, Version 5.2 (1998). The program, called Bias, makes it possible to calculate the empirical bias in r₁, r₁+ and r₁’ by Monte Carlo simulation. This program requires the specification of the parameter ρ₁ and the sample size (n). The structure of the program is as follows (see the Appendix): simulation of lag-one autoregressive processes, calculation of the estimators r₁ and r₁+, polynomial models for different sample sizes, calculation of the estimator r₁’, and estimation of the empirical bias in r₁, r₁+ and r₁’. Bias runs under the Windows 3.1 or later operating system on IBM-PC compatibles with a 486 or later processor, and 8 MB of extended memory.

In the first place, the program simulates first-order autoregressive processes. The randn function (Forsythe, Malcolm, & Moler, 1977) is used to generate observations from a lag-one autoregressive process:

Y_t= ρ₁Y_t-1 + e_t (1)

where Y_t is the score on the response measure at time t, ρ₁ is the autocorrelation parameter and e_t is a random normal variate with a mean of zero and a variance of one. Each time series starts with a normal variate (Y₀) having zero mean and variance 1/(1-ρ²₁). For each combination of n and ρ₁, 10,000 samples are generated. The first 30 observations of each series are dropped to eliminate any dependency between them (DeCarlo & Tryon, 1993; Huitema & Mckean, 1991, 1994a, 1994b, 1994c). After this, the program calculates the estimators r₁ (equation 2) and r₁+ (equation 3) for each simulated time series. The conventional estimator r₁is defined by the following equation:

where Y_t and Y_t+1 are the data obtained in the intervals t and t+1

and . The estimator r₁ is biased with small sample sizes because the numerator only includes n-1 terms instead of n. Huitema and McKean (1991) proposed the modified estimator r₁+ that corrects, to some extent, for the negative bias generated by positive autocorrelations

r₁+ = r₁+ (1/n) (3)

The empirical bias function of the conventional estimator r₁ mildly drifts from the linearity (Arnau, 1999; Arnau & Bono, 2001; DeCarlo & Tryon, 1993; Huitema & McKean, 1991, 1994a, 1994b). Models with significant coefficients of the empirical bias function for n= 6, 10, 20 and 30 were derived with polynomial fitting. Within the MATLAB environment, another program that calculated the empirical bias of the estimator r₁ in function of ρ₁ and n by Monte Carlo simulation was designed (Arnau, 1999; Arnau & Bono, 2001). The polytool function of the Statistics Toolbox of the MATLAB was used for polynomial fitting. With the significant polynomial coefficients at 5% (Table 1) the following fitting models based on n were designed and were introduced into the Bias program.

Model_n=6= -0.1648 – 0.5643r₁ – 0.0916r²₁ (4)

Model_n=10= -0.0972 – 0.3760r₁ – 0.0676r²₁ (5)

Model_n=20= -0.0482 – 0.2028r₁ – 0.0333r²₁ (6)

Model_n=30= -0.0373 – 0.1360r₁ (7)

For sample sizes other than those studied in this paper, it is previously necessary to calculate, by simulation, the empirical biases of the conventional estimator. Once the corresponding empirical biases have been found, the polynomial fitting is carried out with polytool function specifying the ρ₁ values, empirical biases previously obtained and the degree of the polynomial fit. In this way, the significant coefficients are obtained. These must be included in the polynomial models for different sample sizes statement indicating the value of n.

Using the equations 4-7, the Bias program calculates the estimator r₁’ by adding the corresponding polynomial model in absolute values to r₁.

r₁’= r₁ + Fitting Model (8)

Finally, for each simulated sample, the program calculates the empirical bias in r₁, r₁+ and r₁’, which is the difference between the mean value of the estimator and the value of the parameter ρ₁.

The Bias program only calculates r₁’ for four sample sizes (n= 6, 10, 20 and 30). For other n values, the polynomial fitting of the bias must be performed first in order to determine the corresponding correction model and to introduce it into the program.

Running the program

The Bias program is a function M-file and the input variables in the MATLAB workspace, autoregressive parameter (rho) and sample size (n), must be introduced into it. As can be seen in the Appendix, the first line of the function M-file specifies the file name (bias), the input variables names (rho and n) and output variables names (r1_bias, r1plus_bias and r1prime_bias). The following expression must be specified in the MATLAB workspace in order to run the program: [r1_bias, r1plus_bias, r1prime_bias]= bias(rho,n). The MATLAB executes the commands in Bias program and the results are shown in the MATLAB workspace.

Availability

Interested users who prefer not to type the program can request a file of the program by e-mailing the second author (rbono@psi.ub.es). It has been designed with the Version 5.2 of MATLAB but it can also work with previous versions.

Acknowledgements

The research was supported by Grant PS95-0228 from the General Council of Scientific and Technical Investigation, resolution of the Secretary of State of Universities and Investigation of the Ministry of Education and Culture (Spain).

APPENDIX

Program Listing

function [r1_bias, r1_plus_bias, r1_prime_bias]=bias(rho,n)

% ****************************************************

% LAG-ONE AUTOREGRESSIVE PROCESSES SIMULATION

% ****************************************************

% Number of simulations

for s=1:10000

% Number of observations

obs=30+n;

% Variance

variance=1/(1-rho^2);

% Standard deviation

sd=sqrt(variance);

% Series with mean 0 and fixed standard deviation

y0=normrnd(0,sd,obs,1);

% Random series with mean 0 and variance 1

e=randn(obs,1);

% First value of the series

y(1,1)=y0(1)+e(1);

% Next values of the series

for t=2:obs

y(t,1)=rho*y(t-1,1)+e(t);

t=t+1;

end

% Eliminations of the first 30 observations

y=y(31:obs);

% **************************************************

% CALCULATION OF r1

% **************************************************

y_mean=mean(y);

denominator=(y-y_mean).^2;

denominator_sum=sum(denominator);

y_lag=y(2:n);

y=y(1:n-1);

y_lag_diff=y_lag-y_mean;

y_diff=y-y_mean;

numerator=y_lag_diff.*y_diff;

numerator_sum=sum(numerator);

r1(s,1)=numerator_sum/denominator_sum;

end

% **************************************************

% CALCULATION OF r1+

% **************************************************

r1_plus=r1+(1/n);

% ****************************************************

% POLYNOMIAL MODELS FOR DIFFERENT SAMPLE SIZES

% ****************************************************

if n==6

model = -[(-0.1648)+r1.*-0.5643+(r1.^2).*-0.0916];

elseif n==10

model = -[(-0.0972)+r1.*-0.3760+(r1.^2).*-0.0676];

elseif n==20

model = -[(-0.0482)+r1.*-0.2028+(r1.^2).*-0.0333];

elseif n==30

model = -[(-0.0373)+r1.*-0.1360];

end

% **************************************************

% CALCULATION OF r1’

% **************************************************

r1_prime=r1+model;

% **************************************************

% EMPIRICAL BIAS IN r1, r1+ AND r1’

% **************************************************

r1_bias=mean(r1)-rho;

r1_plus_bias=mean(r1_plus)-rho;

r1_prime_bias=mean(r1_prime)-rho;

Arnau, J. (1999). Bias reduction in the estimation of autocorrelation in short time series. Metodología de las Ciencias del Comportamiento, 1, 25-37.

Arnau, J. & Bono, R. (2001). Autocorrelation and bias in short time series: an alternative estimator. Quality and Quantity, 35, 365-387.

Bono, R. & Arnau, J. (1996). C statistic power analysis through simulation. Psicothema, 8, 699-708.

Bono, R. & Arnau, J. (2000). Designs of small samples: Analysis by generalized least squares. Psicothema, 12 Suppl. 2, 87-90.

Busk, P.L. & Marascuilo, L.A. (1988). Autocorrelation in single-subject research: A counterargument to the myth of no autocorrelation. Behavioral Assessment 10, 229-242.

DeCarlo, L.T. & Tryon, W.W. (1993). Estimating and testing autocorrelation with small samples: a comparison of the C-statistic to a modified estimator. Behavior Research Theory, 31, 781-788.

Escudero, J.R. & Vallejo, G. (2000). Comparison of three alternative methods for the analysis of interrupted time-series. Psicothema, 12, 480-486.

Forsythe, G.E., Malcolm, M.A. & Moler, C.B. (1977). Computer methods for mathematical computations. Englewood Cliffs, NJ: Prentice Hall.

Glass, G.V., Willson, V.L. & Gottman, J.M. (1975). Design and analysis of time series experiments. Boulder: Colorado Associated University Press.

Gorsuch, R.L. (1983). Three Methods for analyzing limited time-series (N of 1) data. Behavioral Assessment, 5, 141-154.

Hartmann, D.P., Gottman, J.M., Jones, R.R., Gardner, W., Kazdin, A.E. & Vaught, R.S. (1980). Interrupted time-series analysis and its application to behavioral data. Journal of Applied Behavior Analysis 13, 543-559.

Huitema, B.E. (1985). Autocorrelation in applied behavior analysis: A myth. Behavioral Assessment 7, 107-118.

Huitema, B.E. (1988). Autocorrelation: 10 years of confusion. Behavioral Assessment 10, 253-294.

Huitema, B.E. & McKean, J.W. (1991). Autocorrelation estimation and inference with small samples. Psychological Bulletin, 110, 291-304.

Huitema, B.E. & McKean, J.W. (1994a). Two reduced-bias autocorrelation estimators: r_F1and r_F2. Perceptual and Motor Skills, 78, 323-330.

Huitema, B.E. & McKean, J.W. (1994b). Reduced bias autocorrelation estimation: Three jackknife methods. Educational and Psychological Measurement, 54, 654-665.

Huitema, B.E. & McKean, J.W. (1994c). Test of H₀: ρ₁= 0 for autocorrelation estimators: r_F1 and r_F2. Perceptual and Motor Skills, 78, 331-336.

Jones, R.R., Vaught, R.S. & Weinrott, M. (1977). Time-series analysis in operant research. Journal of Applied Behavior Analysis 10, 151-166.

MATLAB. (1998). MATLAB (Version 5.2). Natick, MA: The MathWorks, Inc.

Matyas, T.A. & Greenwood, K.M. (1991). Problems in the estimation of autocorrelation in brief time series and some implications for behavioral data. Behavioral Assessment, 13, 137-157.

Rosel, J., Jara, P. & Oliver (1999). Cointegration in multivariate time series. Psicothema, 11, 409-419.

Sharpley, C.F. & Alavosius, M.P. (1988). Autocorrelation in behavioral data: An alternative perspective. Behavioral Assessment 10, 243-251.

Suen, H.K. (1987). On the epistemology of autocorrelation in applied behavior analysis. Behavioral Assessment 9, 113-124.

Suen, H.K. & Ary, D. (1987). Autocorrelation in applied behavior analysis: Mith or reality?. Behavioral Assessment 9, 125-130.

Vallejo, G. (1994). Evaluation of the effects of intervention in designs of time series in the presence of trends. Psicothema, 6, 503-524.

INFORMATION

PSICOTHEMA

CONTACT US

A PROGRAM TO CALCULATE THE EMPIRICAL BIAS IN AUTOCORRELATION ESTIMATORS