Dear GSL developers,
I think there is an error in equation defining the lag-1
autocorrelation in the manual (Statistics->Autocorrelation)
https://www.gnu.org/software/gsl/manual/html_node/Autocorrelation.html
Currently it is written as:
a_1 = {\sum_{i = 1}^{n} (x_{i} - \Hat\mu) (x_{i-1} - \Hat\mu)
\over
\sum_{i = 1}^{n} (x_{i} - \Hat\mu) (x_{i} - \Hat\mu)}
and since the summation starts from 1, for i=1 the numerator refers to
x_{1-1} which is undefined (if x_0 is defined it will be ignored by
summation in the denominator, which is strange).
I think instead the equation should be written as:
a_1 = {\sum\limits_{i = 1}^{N-1} (x_{i} - \Hat\mu) (x_{i+1} -
\Hat\mu)
\over
\sum\limits_{i = 1}^{N} (x_{i} - \Hat\mu)^2}
The error is in the manual, not in the code.
The pice of code below shows that the suggested equation is consistent
with what is actually implemented in the GSL code. The difference
between values returned by my naive implementation and the library
implementation seems to be <1e-16 which is expected for double
precision computations.
With best wishes,
Kirill Sokolovsky
//################ compare_lag1.c ################//
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_statistics_double.h>
double my_lag1_autocorrelation(double * data, int N){
int i;
double u,d,mean;
mean=gsl_stats_mean( data, 1, N);
u=d=0.0;
for(i=0;i<N-1;i++)u+=(data[i]-mean)*(data[i+1]-mean);
for(i=0;i<N;i++)d+=(data[i]-mean)*(data[i]-mean);
return u/d;
}
int main(){
double r_my,r_gsl;
double data[10000];
int i,N;
N=0;
while( -1<fscanf(stdin,"%lf",&data[N]) )N++;
r_gsl=gsl_stats_lag1_autocorrelation( data, 1, N);
r_my=my_lag1_autocorrelation( data, N);
fprintf(stdout,"%lf - %lf = %lg\n",r_gsl,r_my,r_gsl-r_my);
return 0;
}