On Wed, Mar 17, 2010 at 11:48 AM, Corrado <address@hidden> wrote:
Dear Friedrik, Jaroslav, Octave,
Yes, of course it depends on the error. But you can still build a
frequency distribution with the y_j. It was additional information on
the problem, but probably not very useful, apologies.
First of all, the {p1,...,pn} and hence the {k1,...,kn} can have a few
tens of elements (that is, n could be maybe 60 in the worst case). The
problem is that for such a case we use millions of observations ;). In
the case of the 40,000 observations it would be safe to suppose we would
use a max of 20.
At the moment I have a few assumptions on the error:
1) Assumption 1:
error is normally distributed with mean 0. In this case I can use NLS,
what do you think?
Yes.
2) Assumption 2:
error is normally distributed with mean 0 AFTER inverse transformation,
so you fit on x ~ k0 + k1*p1 + ... + kn*pn using NLS again. What do you think?
Note that under this assumption, the problem becomes linear in the k's.
That's why this fit is likely to be useful as a starting guess for the
other cases, no matter what.
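
A minimal sketch of such a linear starting guess, in Python purely for
illustration. The link function is not stated in this message, so the
logistic link used here is an assumption, and the names logit and
linear_start_guess as well as the toy data are hypothetical. With a single
predictor the least-squares fit on the transformed response has a closed
form:

```python
import math

def logit(y):
    # Inverse of the (assumed) logistic link; requires 0 < y < 1.
    return math.log(y / (1.0 - y))

def linear_start_guess(p, y):
    """Ordinary least squares of x = logit(y) on one predictor p."""
    x = [logit(v) for v in y]
    n = len(p)
    p_mean = sum(p) / n
    x_mean = sum(x) / n
    sxx = sum((pi - p_mean) ** 2 for pi in p)
    sxy = sum((pi - p_mean) * (xi - x_mean) for pi, xi in zip(p, x))
    k1 = sxy / sxx
    k0 = x_mean - k1 * p_mean
    return k0, k1

# Noise-free toy data generated with k0 = 0.3, k1 = 0.8.
p = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [1.0 / (1.0 + math.exp(-(0.3 + 0.8 * pi))) for pi in p]
k0, k1 = linear_start_guess(p, y)
```

With several predictors the same idea becomes an ordinary linear
least-squares solve on the transformed response, and the resulting k's
can seed the nonlinear fits.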
3) Assumption 3:
error is beta distributed. I have no idea.
4) Assumption 4:
error is beta distributed AFTER inverse transformation. I have no idea.
5) Assumption 5:
error is distributed with a distribution in the exponential family ....
I have no clue.
6) Assumption 6:
error is distributed with a distribution in the exponential family after
transformation .... I have no clue.
Assumptions 3 to 6 are the most important, and I would like to be able to
build a very generic package that covers most of the important cases. I
would also like to be able to change the link function but that is for
much later.
What do you suggest?
Yes, these are often approached by maximum likelihood estimation
(nonlinear least squares is actually just a special case).
If you don't have a prior estimate of the error distribution
parameters, you need to estimate them as well.
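
To spell out the special case, assume the model y_j = f(p_j; k) + e_j
with independent errors e_j ~ N(0, sigma^2) and N observations. The
symbols f, N and sigma are notation introduced here for illustration,
not taken from the thread. Then:

```latex
\begin{align*}
\log L(k, \sigma^2)
  &= -\frac{N}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{j=1}^{N}\bigl(y_j - f(p_j; k)\bigr)^2, \\
\hat{k} &= \arg\min_{k}\sum_{j=1}^{N}\bigl(y_j - f(p_j; k)\bigr)^2, \\
\hat{\sigma}^2 &= \frac{1}{N}\sum_{j=1}^{N}\bigl(y_j - f(p_j; \hat{k})\bigr)^2 .
\end{align*}
```

For any fixed sigma^2, maximizing over k is exactly the least-squares
problem, and the unknown variance is then estimated from the residuals.
For a non-Gaussian error the first line changes, and in general no such
reduction is available.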
Computing ML estimates generally requires solving a nonlinear optimization problem.
Here, you also have the bounds on the parameters. The most general
approach is using sqp. For faster approaches, you can look at
lsqnonneg and pqpnonneg and use them iteratively.
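
A rough sketch of what a bound-constrained fit looks like, again in
Python for illustration only. This is not sqp: it swaps in plain
projected gradient descent on the Gaussian (least-squares) objective,
which is far less efficient but shows where the bounds enter. The
logistic link, the lower bounds k >= 0, the name fit_bounded, the step
size and the toy data are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_bounded(p, y, lower, k_start, step=0.2, iters=20000):
    """Projected gradient descent on the sum of squared residuals
    for y ~ sigmoid(k0 + k1*p), subject to k0 >= lower[0], k1 >= lower[1]."""
    k0, k1 = k_start
    for _ in range(iters):
        g0 = g1 = 0.0
        for pi, yi in zip(p, y):
            f = sigmoid(k0 + k1 * pi)
            # Derivative of (f - y)^2 with respect to the linear predictor.
            common = 2.0 * (f - yi) * f * (1.0 - f)
            g0 += common
            g1 += common * pi
        # Gradient step, then project back onto the feasible box.
        k0 = max(lower[0], k0 - step * g0)
        k1 = max(lower[1], k1 - step * g1)
    return k0, k1

# Noise-free toy data generated with k0 = 0.3, k1 = 0.8.
p = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [sigmoid(0.3 + 0.8 * pi) for pi in p]
k0, k1 = fit_bounded(p, y, lower=(0.0, 0.0), k_start=(0.0, 0.0))
```

In Octave itself, the same objective would be handed to sqp as the
function to minimize, with the bounds passed through its lb and ub
arguments. For a non-Gaussian error, the sum of squares would be
replaced by the corresponding negative log-likelihood.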