Re: regarding accuracy of fit

help-octave

From:	CdeMills
Subject:	Re: regarding accuracy of fit
Date:	Tue, 19 Jul 2011 01:42:22 -0700 (PDT)

rina wrote:
> 
> Hello, please help me to calculate the accuracy of the fit. Suppose I have
> the values of y from the experimental data and the values of f from the
> fit; how do I know which fit is best?
> 
> I compute the residual R = y - f and then plot R against the x values.
> This gives the correct result, but I have too much data to handle, so I
> get fluctuations around a straight line. I do not know what to do.
> Thanks in advance for the help.
> 
> 
The "accuracy" of the fit is given by the covariance matrix of the estimated
parameters. It can be computed as follows:
- HYPOTHESIS: the noise on your data is normal, zero mean, unknown variance
- let us say that the model is y \simeq a0 + a1 * x
- construct A, the regression matrix: the first column is ones(size(x)), the
second column is x, and so on
- solve for theta =[a0; a1] as theta = A\y
- compute the estimates of y as ye = A*theta
- compute the noise estimate as noise = y - ye
- verify the basic hypothesis !!! Search for outliers and other problems
- compute the noise variance estimate Cn as
  Cn = sumsq(noise)/(size(A, 1)-size(A, 2))
  explanation: the denominator contains the degrees of freedom, i.e. the
number of "free" sources (the noise terms e1, e2, ..., one per observation,
which are mutually independent) minus the number of estimated parameters.
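- the parameter covariance matrix is computed as
  iA = inv(A.'*A);
  Ctheta = Cn * iA
  (theta = iA*A.'*y and the noise covariance is Cn times the identity, so
  iA*A.'*(Cn*I)*A*iA reduces to Cn*iA)
  This matrix "explains" how the noise on the data is mapped into noise on
the estimated parameters.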
- the accuracy of the regression is obtained by testing the NULL hypothesis:
there is no regression, the components of theta are just pure noise, against
their observed value. To this end, compute their t-statistics (each estimate
divided by its standard error):
  res_theta = abs(theta)./sqrt(diag(Ctheta))
  Those numbers have to be validated as
  theta_accur = 2*(1-tcdf(res_theta, size(A, 1)-size(A, 2)))
  This is the two-sided probability of observing values of theta at least
this large in magnitude, given that the NULL hypothesis is true. Values of
1e-3 or less tell you that you can be very confident in the existence of a
regression law between y and x. Values of 1% are so-so (if you reject the
null hypothesis, i.e. accept that there IS a regression between y and x, the
probability of being wrong is 1%). Values greater than 10 % clearly indicate
there is a problem.
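
The steps above can be put together in a short script. This is only a
sketch under stated assumptions: the data are synthetic (a line with a0 = 1,
a1 = 2 plus normal noise, values chosen purely for illustration), and the
p-values are computed with the core function betainc instead of tcdf, using
the identity P(|T| > t) = betainc(nu/(nu + t^2), nu/2, 1/2) for a Student t
variable with nu degrees of freedom. That substitution is mine, not part of
the recipe; in recent Octave versions tcdf may require the statistics
package, and 2*(1-tcdf(res_theta, dof)) should give the same numbers.

```octave
% Synthetic data, hypothetical values for illustration only
randn("state", 42);                    % make the run repeatable
x = linspace(0, 10, 50).';             % column vector
y = 1 + 2*x + 0.5*randn(size(x));      % true a0 = 1, a1 = 2, normal noise

A = [ones(size(x)), x];                % regression matrix
theta = A \ y;                         % least-squares estimate [a0; a1]
ye = A * theta;                        % estimates of y
noise = y - ye;                        % noise estimate (residuals)

dof = size(A, 1) - size(A, 2);         % degrees of freedom
Cn = sumsq(noise) / dof;               % noise variance estimate
iA = inv(A.' * A);
Ctheta = Cn * iA;                      % parameter covariance matrix

% t-statistic of each coefficient: estimate over its standard error
res_theta = abs(theta) ./ sqrt(diag(Ctheta));

% two-sided p-values; same quantity as 2*(1 - tcdf(res_theta, dof))
theta_accur = betainc(dof ./ (dof + res_theta.^2), dof/2, 1/2);

disp([theta, sqrt(diag(Ctheta)), theta_accur]);
```

Each row of the displayed matrix holds a coefficient, its standard error and
its significance level. Plotting noise against x (or a histogram of noise)
before trusting these numbers is a simple way to check the hypothesis.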

With this arsenal, you end up with a significance level for your model. Note
that the accuracies are given on a per-coefficient basis. This way, you can
refine your search: introduce an x^2 term, see if the associated significance
level is still OK, and so on.
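
As an illustration of that refinement, here is a sketch that adds an x^2
column to the regression matrix and prints the per-coefficient significance
levels. The data are again synthetic and truly linear (hypothetical values),
so the x^2 coefficient should come out close to zero; as an assumption of
this sketch, betainc from core Octave stands in for 2*(1-tcdf(...)).

```octave
% Synthetic, truly linear data (hypothetical values)
randn("state", 1);
x = linspace(0, 10, 60).';
y = 1 + 2*x + 0.3*randn(size(x));

A = [ones(size(x)), x, x.^2];          % candidate model with an x^2 term
dof = size(A, 1) - size(A, 2);         % degrees of freedom
theta = A \ y;                         % [a0; a1; a2]
Cn = sumsq(y - A*theta) / dof;         % noise variance estimate
Ctheta = Cn * inv(A.' * A);            % parameter covariance matrix
t = abs(theta) ./ sqrt(diag(Ctheta));  % t-statistics

% two-sided p-values, equivalent to 2*(1 - tcdf(t, dof))
p = betainc(dof ./ (dof + t.^2), dof/2, 1/2);

% one line per coefficient: index, estimate, significance level
printf("a%d = %8.4f   p = %.3g\n", [0:2; theta.'; p.']);
```

If the level printed for a2 is not small, the data give no support for the
extra term and the simpler model a0 + a1*x can be kept.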

Regards

Pascal

