[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: How To Create the Linkage Function
From: |
Bill Denney |
Subject: |
Re: How To Create the Linkage Function |
Date: |
Sun, 22 Oct 2006 23:14:21 -0400 |
User-agent: |
Thunderbird 1.5.0.7 (Windows/20060909) |
OK, I figured this out with more reading. Attached is a partial
implementation.
Bill
Bill Denney wrote:
I'm trying to do some clustering analysis. To do this, I was trying to
write the linkage function. It takes in a vector of the distances
between two points and returns a clustered set of linkages.
The challenge I've come up against is: the distances are all that is
passed in. That is fine for just finding the closest pairs, but when
you want to join one cluster with a point or another cluster, I don't
see an easy way to determine the location of a cluster. What I've found
so far is that I have to determine a new coordinate space for the
elements and place each element into that coord space. I can't think of
a simple algorithm to place objects into a coordinate space using their
distances. Also, since the distances were determined in a generalized
way, I'm not sure that what I've said above is even correct.
Any pointers?
Bill
_______________________________________________
Help-octave mailing list
address@hidden
https://www.cae.wisc.edu/mailman/listinfo/help-octave
## Copyright (C) 2006 Bill Denney <address@hidden>
##
## This file is part of Octave.
##
## Octave is free software; you can redistribute it and/or modify it
## under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2, or (at your option)
## any later version.
##
## Octave is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with Octave; see the file COPYING. If not, write to the Free
## Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA.
## -*- texinfo -*-
## @deftypefn {Function File} address@hidden =} linkage (@var{x})
## @deftypefnx {Function File} address@hidden =} linkage (@var{x}, @var{method})
## Return clusters generated from a distance vector created by the pdist
## function.
##
## Methods can be:
##
## @table @samp
## @item "single" (default)
## Shortest distance between two clusters (aka a minimum spanning tree)
##
## @item "complete"
## Furthest distance between two clusters
##
## @item "average"
## Unweighted average distance (aka group average)
##
## @item "weighted"
## Weighted average distance
##
## @item "centroid"
## Centroid distance (not implemented)
##
## @item "median"
## Weighted center of mass distance
##
## @item "ward"
## Inner squared distance (minimum variance)
##
## @end table
##
## @var{x} is an ((m-1)*(m/2) x 1) distance vector as generated by
## pdist, and the output, @var{y} is an ((m - 1) x 3) vector defined
## with columns where the first and second columns are the cluster
## numbers of the two sub-clusters in the cluster, and the third column
## is the distance between those sub-clusters. The sub-clusters are
## numbered where 1 to m are the input elements, and m+1 to the end are
## subsequently defined clusters.
## @seealso{cluster,pdist}
## @end deftypefn
## Author: Bill Denney <address@hidden>
function y = linkage (x, method)
## check the input
if (nargin < 1) || (nargin > 2)
print_usage ();
elseif (nargin < 2)
method = "single";
endif
method = lower (method);
if (isempty (x))
error ("linkage: x cannot be empty");
elseif (~ isvector (x))
error ("linkage: x must be a vector");
endif
xmat = squareform (x);
sxm = size(xmat, 1);
if strcmp (method, "single")
## this is just a minimal spanning tree
y = linker (xmat, @min);
elseif strcmp (method, "complete")
y = linker (xmat, @max);
elseif strcmp (method, "average")
error ("linkage: %s is not yet implemented", method);
elseif strcmp (method, "weighted")
error ("linkage: %s is not yet implemented", method);
elseif strcmp (method, "centroid")
error ("linkage: %s is not yet implemented", method);
elseif strcmp (method, "median")
y = linker (xmat, @median);
elseif strcmp (method, "ward")
error ("linkage: %s is not yet implemented", method);
else
error ("linkage: unrecognised method");
endif
endfunction
function y = linker (matrix, findfxn)
## findfxn should return a scalar from a vector and a row from a
## matrix
y = zeros (0,3);
yidx = 0;
startsize = size (matrix, 1);
cnameidx = 1:startsize;
while (~ isscalar (matrix))
yidx++;
sxm = size (matrix, 1);
available = (tril (ones(sxm)) - eye (sxm));
## find the next link to make
[r, c] = find (findfxn (matrix(logical (available))) == matrix, 1);
## update the results
y(yidx, :) = [cnameidx(r) cnameidx(c) matrix(r, c)];
## update the indexes
cnameidx(r) = yidx + startsize;
cnameidx(c) = [];
## update the matrix (the diagonal element may be made inconsistent
## by this,but that doesn't really matter)
matrix(r,:) = findfxn (matrix([r c], :));
matrix(:,r) = matrix(r,:)';
matrix(c,:) = [];
matrix(:,c) = [];
endwhile
endfunction