help-gsl
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Help-gsl] Multidimensional linear fit (and principal component anal


From: Philipp Klaus Krause
Subject: Re: [Help-gsl] Multidimensional linear fit (and principal component analysis, covariance matrix)
Date: Sun, 14 Dec 2008 14:16:07 +0100
User-agent: Mozilla-Thunderbird 2.0.0.17 (X11/20081018)

Liam Healy schrieb:
> http://www.gnu.org/software/gsl/manual/html_node/Multi_002dparameter-fitting.html
> 
> On Sat, Dec 13, 2008 at 3:53 PM, Philipp Klaus Krause <address@hidden> wrote:
>> I want to do a least squares fit of a line in 3 or 4-dimensional space
>> to 16 data points.
>> I looked at the manual, it seems gsl provides least squares linear fits
>> only for onedimensional stuff.
> 
> ??
> http://www.gnu.org/software/gsl/manual/html_node/Multi_002dparameter-fitting.html
> 

Somehow I can't see how I could use that for my problem. The y there is
still a vector of doubles, while I have points in three- or
fourdimensional space. There is that matrix X which allows fitting to
any number of functions, while I just want a line.

Thus I've tried implementing the PCA using gsl. It works fine, but
currently takes a bit more than 7 seconds to call pca() a million times
on my system. Since I'm a newbie to gsl I suppose there is a way to
improve performance. What would you suggest? Dimension will typically be
3 or 4.

struct odata
{
        int dimension;
        gsl_matrix *covariance_matrix;
        gsl_eigen_symmv_workspace *workspace;
        gsl_vector *eigenvalues;
        gsl_matrix *eigenvectors;
        gsl_vector *mean;
        gsl_matrix *mean_substracted_points;
};

static void create_pca(struct odata *data, const int dimension)
{
        #warning TODO: Check for lack of memory in this function.
        data->dimension = dimension;
        data->covariance_matrix = gsl_matrix_alloc(dimension, dimension);
        data->eigenvalues = gsl_vector_alloc(dimension);
        data->eigenvectors = gsl_matrix_alloc(dimension, dimension);
        data->mean = gsl_vector_alloc(dimension);
        data->mean_substracted_points = gsl_matrix_alloc(dimension, 16);
        data->workspace = gsl_eigen_symmv_alloc(dimension);
}

static void destroy_pca(struct odata *data)
{
        gsl_eigen_symmv_free(data->workspace);
        gsl_matrix_free(data->mean_substracted_points);
        gsl_vector_free(data->mean);
        gsl_matrix_free(data->eigenvectors);
        gsl_vector_free(data->eigenvalues);
        gsl_matrix_free(data->covariance_matrix);
}

// Do PCA, principal component can be found in first column of
data->eigenvectors.
static void pca(const double *points, struct odata *data)
{
        int i;

        for(i = 0; i < data->dimension; i++)
                gsl_vector_set(data->mean, i, gsl_stats_mean(points + i,
data->dimension, 16));

        // Get mean-substracted data into matrix mean_substracted_points.
        for(i = 0; i < 16; i++)
        {
                gsl_vector_const_view point_view = 
gsl_vector_const_view_array(points
+ i * data->dimension, data->dimension);
                gsl_vector_view mean_substracted_point_view =
gsl_matrix_column(data->mean_substracted_points, i);
                gsl_vector_memcpy(&mean_substracted_point_view.vector,
&point_view.vector);
                gsl_vector_sub(&mean_substracted_point_view.vector, data->mean);
        }

        // Compute Covariance matrix
        gsl_blas_dgemm(CblasNoTrans, CblasTrans, 1.0 / 16.0,
data->mean_substracted_points, data->mean_substracted_points, 0.0,
data->covariance_matrix);

        // Get eigenvectors, sort by eigenvalue.
        gsl_eigen_symmv(data->covariance_matrix, data->eigenvalues,
data->eigenvectors, data->workspace);
        gsl_eigen_symmv_sort(data->eigenvalues, data->eigenvectors,
GSL_EIGEN_SORT_ABS_DESC);
}

Philipp






reply via email to

[Prev in Thread] Current Thread [Next in Thread]