help-octave

Re: Constrained non linear regression using ML


From: Fredrik Lingvall
Subject: Re: Constrained non linear regression using ML
Date: Wed, 17 Mar 2010 13:10:43 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100309 Thunderbird/3.0.3

On 03/17/10 11:48, Corrado wrote:
> Dear Friedrik, Jaroslav, Octave,
>
> Yes, of course it depends on the error. But you can still build a 
> frequency distribution with the y_j. It was additional information on 
> the problem, but probably not very useful, apologies.
>
> First of all, the {p1,....,pn} and hence the {k1,....,kn} can have a few 
> tenths of elements (that is n could be maybe 60 in the worst case). The 
> problem is that for such a case we use millions of observations ;). In 
> the case of the 40,000 observation it would be safe to suppose we would 
> use a max of 20.
>
> At the moment I have a few of assumptions on the error:
>
> 1) Assumption 1:
>
> error is normally distributed with mean 0. In this case I can use NLS, 
> what do you think?
>   

The use of a Gaussian distribution for the errors comes from the fact
that the Gaussian is the maximum entropy distribution given that one
has some knowledge of the "size" of the errors (the variance). That is,
it is the distribution that assumes as little as possible beyond the
known variance. Therefore, it is a very conservative and safe
assumption.

Regarding the method to use, you must also specify what you know about
the parameters. If you have very little knowledge you can, for
example, assign a wide Gaussian for them too (with some large but
finite variance).

What you want to compute is,

p(k|y,I) \propto p(k|I) p(y|k,I)

where p(y|k,I) is your likelihood function which is Gaussian,

p(y|k,I) = 1/((2 pi)^(n/2) det(Ce)^(1/2))
           * exp(-0.5*(y - (1 - exp(-k'*p)))' * inv(Ce) * (y - (1 - exp(-k'*p))))
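As a quick sanity check of that likelihood, here is a minimal Python/NumPy sketch (the function name, shapes, and toy values are my own illustration, not from the thread). It evaluates the Gaussian log-likelihood for the model y = 1 - exp(-k'*p) + e:

```python
import numpy as np

def log_likelihood(k, p, y, Ce):
    """Gaussian log-likelihood for y = 1 - exp(-k'*p) + e, e ~ N(0, Ce).
    k: (m,) parameters, p: (m, n) predictors,
    y: (n,) observations, Ce: (n, n) error covariance."""
    r = y - (1.0 - np.exp(-k @ p))        # residual vector
    n = y.size
    sign, logdet = np.linalg.slogdet(Ce)  # stable log(det(Ce))
    return (-0.5 * n * np.log(2.0 * np.pi)
            - 0.5 * logdet
            - 0.5 * r @ np.linalg.solve(Ce, r))

# Toy check: with Ce = I and zero residuals the value is -n/2 * log(2*pi)
k = np.array([1.0, 2.0])
p = np.ones((2, 3))
y = 1.0 - np.exp(-k @ p)                  # noise-free data, residual = 0
print(log_likelihood(k, p, y, np.eye(3)))
```

Working in log space avoids the underflow you would get from the exponential for large n.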

p(k|I) is what you know about the parameters before seeing the data,
and it is here that you can, for example, put bounds on your parameters
if you have them.

You can then, for example, take the parameters at the maximum of
p(k|y,I) as your best estimate, which works well if p(k|y,I) has a
single strong peak. Note that, since your problem is non-linear,
p(k|y,I) can have many peaks.
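A minimal sketch of that MAP estimate, using SciPy on synthetic data (the variable names, noise level, prior width, and bounds are my own assumptions for illustration): the negative log-posterior combines the Gaussian likelihood with a wide zero-mean Gaussian prior, and the bounds encode hard prior knowledge such as k >= 0.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy data from the model y = 1 - exp(-k'*p) + e (k_true is only for the demo)
k_true = np.array([0.5, 1.5])
p = rng.uniform(0.1, 2.0, size=(2, 200))
sigma = 0.01
y = 1.0 - np.exp(-k_true @ p) + sigma * rng.standard_normal(200)

def neg_log_posterior(k):
    r = y - (1.0 - np.exp(-k @ p))
    nll = 0.5 * (r @ r) / sigma**2      # Gaussian likelihood, Ce = sigma^2 * I
    nlp = 0.5 * (k @ k) / 10.0**2       # wide Gaussian prior p(k|I), std 10
    return nll + nlp

# Bounds express prior knowledge, e.g. non-negative rate parameters
res = minimize(neg_log_posterior, x0=np.ones(2), bounds=[(0, None)] * 2)
print(res.x)   # MAP estimate; close to k_true with this much data
```

Since the posterior can be multi-modal here, it is worth restarting the optimizer from several initial points and keeping the best result.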


> 2) Assumption 2:
>
> error is normally distributed with mean 0 AFTER inverse transformation, 
> so you fit on x~ k0+k1*p1+ .... + kn*pn using NLS again. What do you think?
>
> 3) Assumption 3:
>
> error is beta distributed. I have no idea.
>
> 4) Assumption 4:
>
> error is beta distributed AFTER inverse transformation. I have no idea.
>
> 5) Assumption 5:
>
> error is distributed with a distribution in the exponential family .... 
> I have no clue.
>
> 6) Assumption 6:
>
> error is distributed with a distribution in the exponential family after 
> transformation .... I have no clue.
>
> Assumption 3 to 6 are the most important, and I would like to be able to 
> build a very generic package that covers most of the important cases. I 
> would also like to be able to change the link function but that is for 
> much later.
>
> What do you suggest?
>   
I'm not so sure that taking the log will help you much, since

log(y) = log(1 - exp(-k'*p) + e),

where the error e does not separate from the model term, and the
expression is not defined for y <= 0.


/F


