Re: regarding accuracy of fit

From: CdeMills
Subject: Re: regarding accuracy of fit
Date: Tue, 19 Jul 2011 01:42:22 -0700 (PDT)

rina wrote:
> Hello, please help me to calculate the accuracy of a fit. Suppose I have
> the values of y from the experimental data and the values of f from the
> fit; now I want to know how good the fit is.
> What I do is compute R = y - f and plot R against the x values. This gives
> the correct result, but I have too much data to handle, so I am getting
> fluctuations around a straight line. I don't know what to do.
> Thanks in advance for the help.
The "accuracy" of the fit is given by the covariance matrix of the estimated
parameters. It can be computed as follows:
- HYPOTHESIS: the noise on your data is normal, zero-mean, with unknown
variance
- let's say the model is y \simeq a0 + a1 * x
- construct A, the regression matrix: first column is ones(size(x)), second
column is x, and so on
- solve for theta = [a0; a1] as theta = A\y
- compute the estimates of y as ye = A*theta
- compute the noise estimate as noise = y - ye
- verify the basic hypothesis!!! Search for outliers and other problems
- compute the noise variance estimate as var_noise =
sumsq(noise)/(size(A, 1) - size(A, 2))
  explanation: the denominator contains the degrees of freedom, i.e. the
number of "free" noise sources e1, e2, ..., which are mutually independent,
minus the number of parameters estimated from them.
- the parameter covariance matrix is computed as
  iA = inv(A.'*A);
  Ctheta = var_noise * iA
  (for white noise of variance var_noise, Cov(theta) = var_noise*inv(A'*A)).
  This matrix "explains" how the noise on the data is mapped onto noise on
the estimated parameters.
- the significance of the regression is obtained by testing the NULL
hypothesis -- there is no regression, the components of theta are just pure
noise -- against their observed values. To this end, compute their
t-statistics (studentised coefficients):
  res_theta = abs(theta)./sqrt(diag(Ctheta))
  Those numbers are turned into p-values as
  theta_accur = 2*(1 - tcdf(res_theta, size(A, 1) - size(A, 2)))
  This is the two-sided probability of having observed still greater values
of theta, given that the NULL hypothesis is true. Values of 1e-3 or less
tell you that you can be very confident in the existence of a regression law
between y and x. Values of 1% are so-so (if you reject the null hypothesis,
i.e. accept that there IS a regression between y and x, the probability of
being wrong is 1%). Values greater than 10% clearly indicate that the
corresponding coefficient is not significantly different from zero.
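
Putting the steps above together, a minimal Octave sketch could look like
this (the x and y data in the first two lines are made up purely for
illustration -- substitute your own columns):

```octave
% Synthetic example data: true model y = 1 + 2*x plus normal noise
x = (0:0.1:10)';
y = 1 + 2*x + 0.3*randn(size(x));

A = [ones(size(x)), x];        % regression matrix
theta = A \ y;                 % least-squares estimate [a0; a1]
ye = A * theta;                % fitted values
noise = y - ye;                % residuals -- inspect these for outliers!

dof = size(A, 1) - size(A, 2); % degrees of freedom
var_noise = sumsq(noise)/dof;  % noise variance estimate
Ctheta = var_noise * inv(A.'*A);           % parameter covariance matrix

res_theta = abs(theta)./sqrt(diag(Ctheta)) % t-statistics
theta_accur = 2*(1 - tcdf(res_theta, dof)) % two-sided p-values
```

With data as clean as in this example, both entries of theta_accur should
come out far below 1e-3, i.e. both coefficients are highly significant.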

With this arsenal, you end up with a significance level for your model. Note
that the accuracies are given on a per-coefficient basis. This way, you can
refine your search: introduce an x^2 term, see if the associated
significance level is still OK, and so on.
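
For instance, the refinement step could be sketched as follows (assuming x
and y are the same data columns as before; the quadratic term here is
hypothetical):

```octave
% Add an x.^2 column and re-test each coefficient's significance
A2 = [ones(size(x)), x, x.^2];
theta2 = A2 \ y;
dof2 = size(A2, 1) - size(A2, 2);
var_noise2 = sumsq(y - A2*theta2)/dof2;
Ctheta2 = var_noise2 * inv(A2.'*A2);
p2 = 2*(1 - tcdf(abs(theta2)./sqrt(diag(Ctheta2)), dof2))
% If p2(3) is large (say, above 10%), the quadratic term is not
% supported by the data and should be dropped.
```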


