help-octave

Re: Constrained non linear regression using ML


From: Corrado
Subject: Re: Constrained non linear regression using ML
Date: Wed, 17 Mar 2010 10:48:19 +0000
User-agent: Thunderbird 2.0.0.23 (X11/20090817)

Dear Fredrik, Jaroslav, Octave,

Yes, of course it depends on the error. But you can still build a frequency distribution with the y_j. It was additional information on the problem, but probably not very useful, apologies.

First of all, the {p1,....,pn} and hence the {k1,....,kn} can have a few tens of elements (that is, n could be maybe 60 in the worst case). The problem is that for such a case we use millions of observations ;). In the case of the 40,000 observations it would be safe to suppose we would use a max of 20.

At the moment I have a few assumptions on the error:

1) Assumption 1:

error is normally distributed with mean 0. In this case I can use NLS, what do you think?

2) Assumption 2:

error is normally distributed with mean 0 AFTER inverse transformation, so you fit x ~ k0+k1*p1+ .... + kn*pn, with x = -log(1-y), using NLS again. What do you think?

3) Assumption 3:

error is beta distributed. I have no idea.

4) Assumption 4:

error is beta distributed AFTER inverse transformation. I have no idea.

5) Assumption 5:

error is distributed with a distribution in the exponential family .... I have no clue.

6) Assumption 6:

error is distributed with a distribution in the exponential family after transformation .... I have no clue.
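
For Assumption 2, the inverse transformation can be sketched as follows. This is only an illustration in Python, not part of anyone's existing code: it assumes 0 <= y < 1, uses a single predictor, and the names inverse_transform and fit_line as well as the toy values k0 = 0.2 and k1 = 0.5 are hypothetical. It does not enforce ki >= 0.

```python
import math

def inverse_transform(y):
    """Map y in [0, 1) to x = -log(1 - y), the scale of k0 + k1*p1 + ..."""
    return [-math.log1p(-yi) for yi in y]

def fit_line(p, x):
    """Ordinary least squares for x ~ k0 + k1*p (single predictor only)."""
    n = len(p)
    p_mean = sum(p) / n
    x_mean = sum(x) / n
    sxx = sum((pi - p_mean) ** 2 for pi in p)
    sxy = sum((pi - p_mean) * (xi - x_mean) for pi, xi in zip(p, x))
    k1 = sxy / sxx
    k0 = x_mean - k1 * p_mean
    return k0, k1

# Noise-free toy data generated with hypothetical k0 = 0.2, k1 = 0.5.
p = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [1.0 - math.exp(-(0.2 + 0.5 * pi)) for pi in p]

k0_hat, k1_hat = fit_line(p, inverse_transform(y))
```

After the transformation the model is linear in the k, so the fit reduces to linear least squares. With noisy data the transformation also reshapes the error, which is exactly what Assumption 2 is a statement about.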

Assumptions 3 to 6 are the most important, and I would like to be able to build a very generic package that covers most of the important cases. I would also like to be able to change the link function, but that is for much later.
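
One possible reading of the beta case (Assumption 3) is to treat y itself as beta distributed with mean 1 - exp(-(k0+k1*p1+ .... +kn*pn)) and an extra precision parameter. The Python sketch below writes down the negative log-likelihood under that reading. The mean/precision parameterisation, the name phi and the function names are assumptions made for illustration, not something established in this thread. It needs 0 < y < 1 and a strictly positive linear predictor.

```python
import math

def model_mean(k, p_row):
    """Mean 1 - exp(-(k0 + k1*p1 + ... + kn*pn)) for one observation."""
    eta = k[0] + sum(ki * pi for ki, pi in zip(k[1:], p_row))
    return 1.0 - math.exp(-eta)

def beta_neg_log_lik(k, phi, P, y):
    """Negative log-likelihood with y_j ~ Beta(mu_j*phi, (1 - mu_j)*phi).

    phi > 0 is a hypothetical precision parameter (larger phi means less
    scatter around the mean); it is an assumption of this sketch.
    """
    total = 0.0
    for p_row, yi in zip(P, y):
        mu = model_mean(k, p_row)
        a = mu * phi
        b = (1.0 - mu) * phi
        log_pdf = (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
                   + (a - 1.0) * math.log(yi)
                   + (b - 1.0) * math.log1p(-yi))
        total -= log_pdf
    return total
```

A maximum likelihood fit would minimise this function over k (subject to ki >= 0) and phi with whatever bounded optimiser is available. Other members of the exponential family would change only the log_pdf line.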

What do you suggest?

Fredrik Lingvall wrote:
On 03/17/10 10:01, Corrado wrote:
Dear Fredrik, dear Octave friends,

First of all thanks for coming back to me.

I have 40,000 vectors of observations: {y,p1,p2,p3,p4,p5 ..... pn}_j where j spans over the 40,000 vectors (that is from 1 to 40,000).

The k={k0,k1,k2,k3,k4,.....,kn} is the vector of parameters to be
determined by fitting.

I believe you are right, in machine learning language, the
{1,p1,p2,....,pn} vector would be called the input.

y is the response variable.

PS: If you build a frequency histogram from the y_j, the distribution
looks approximately beta, but fails tests because of the number of
points ....

Best,

OK, then you have lots of data (which is good :-) How large is n (the
length of your data vector)?

Note that your data, y,  is not distributed at all - this is what you
actually know. Your knowledge about the model parameters will be
distributed since your model is not perfect in the sense that you always
have measurement uncertainties and model uncertainties. This is also why
you have the error (or model misfit) term e,

y = 1 - exp(-k'*p) + e.

Essentially, this is a parameter estimation problem and how you obtain
your estimates (of k) depends on what you know about the parameters (do
you know a bound on them, mean value, variance etc.) and what you know
about your error, e (a conservative assumption is to use a zero-mean
Gaussian distribution for e).
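
With a zero-mean Gaussian e of constant variance, the maximum likelihood estimate of k is the least-squares estimate, so the constrained fit can be sketched as below. This is a minimal illustration in Python using projected gradient descent, which is a stand-in technique and not something proposed in this thread. The function names, the step size, the iteration count and the toy values k0 = 0.2, k1 = 0.5 are all hypothetical, and the fixed step is tuned to this toy data only.

```python
import math

def predict(k, p_row):
    """Model mean 1 - exp(-(k0 + k1*p1 + ... + kn*pn))."""
    eta = k[0] + sum(ki * pi for ki, pi in zip(k[1:], p_row))
    return 1.0 - math.exp(-eta)

def fit_nls_nonneg(P, y, step=0.05, iters=20000):
    """Minimise 0.5 * sum((y_j - mu_j)^2) subject to k1..kn >= 0."""
    n_par = len(P[0]) + 1
    k = [0.0] * n_par
    for _ in range(iters):
        grad = [0.0] * n_par
        for p_row, yi in zip(P, y):
            mu = predict(k, p_row)
            w = 1.0 - mu          # equals exp(-eta), i.e. d(mu)/d(eta)
            r = yi - mu           # residual
            grad[0] -= r * w
            for i, pi in enumerate(p_row, start=1):
                grad[i] -= r * w * pi
        # Gradient step, then projection onto the constraint set:
        # k1..kn are clipped at zero, k0 is left unconstrained.
        k = [ki - step * gi for ki, gi in zip(k, grad)]
        k = [k[0]] + [max(0.0, ki) for ki in k[1:]]
    return k

# Noise-free toy data generated with hypothetical k0 = 0.2, k1 = 0.5.
P = [[0.0], [0.5], [1.0], [1.5], [2.0]]
y = [1.0 - math.exp(-(0.2 + 0.5 * row[0])) for row in P]

k_hat = fit_nls_nonneg(P, y)
```

For real data a proper bounded solver would be a better choice than a fixed-step loop; in Octave, sqp accepts lower and upper bounds and could play that role.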

/Fredrik

Fredrik Lingvall wrote:
On 03/16/10 20:01, Corrado wrote:
Dear Octave users,

I have to fit the non linear regression:

y~1-exp(-(k0+k1*p1+k2*p2+ .... +kn*pn))

where ki>=0 for each i in [1 .... n] and pi are on R+.

I am using, at the moment, nls, but I would rather use a Maximum
Likelihood based algorithm. The error is not necessarily normally
distributed.

y is approximately beta distributed, and the volume of data is medium to
large (the y,pi may have ~ 40,000 elements).

Any suggestion?

Regards
Corrado,

Can you tell us a little more about your problem?

As I understand it you have a model,

y = 1 - exp(-k'*p) + e

where k = [k_0 k_1 ... k_n]' and p = [1 p_1 p_2 ... p_n]' and where y is
your data vector, p is your "input signal" and k is the parameter vector
of your model. Have I understood you correctly?

/Fredrik



--
Corrado Topi
PhD Researcher
Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: address@hidden


