help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: algorithm for reducing rows/columns


From: Mike Miller
Subject: Re: algorithm for reducing rows/columns
Date: Sat, 18 Dec 2010 22:42:00 -0600 (CST)
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)

On Sat, 18 Dec 2010, Doug Stewart wrote:

On Sat, Dec 18, 2010 at 9:36 PM, Mike Miller
<address@hidden<address@hidden>
wrote:

This is probably more of a mathematical question than an Octave question,
per se, but someone here might be interested...

Here's the problem, in a simple form:  I have a square symmetrix matrix,
M, with all elements either zero or one.  All diagonal elements are zero.
I want to find a set of indices, I, such that M(I,I) is a zero matrix of
the largest possible size.  There will usually be more than one I that
maximizes size(M), but they are all equally good for my purposes, so I
would choose one arbitrarily.  Maybe if I always delete one of the
rows/cols with the largest number of ones, and repeat, that will do it,
but I'm not sure of that.  Any ideas?


Background:  I have a bunch of subjects, some of whom are genetically
related.  The rows and columns of matrix M correspond to subjects and a 1
in cell i,j means that subjects i and j are genetically related strongly
enough that I don't want to include both of them in a certain analysis I'm
doing.  So I want to find the largest set of subjects who are not too
closely genetically related.

We ascertained families, so we know about a lot of genetic relatives, but
it turns out that when we collect thousands of nuclear families, some of
the independently-ascertained families are very closely related (e.g., the
parents are siblings).  These relationships are discovered, in our data,
by comparing 550,000 genotypes of every possible pair of 8,000 subjects
(which would take forever but I run 120 processes concurrently and then
it's about 4 hours).

Mike
___


Interesting question!

If you do a count or sum of each row and of each col.
then sort it so that the col. with the most 1/s is on the left etc.
Then sort it so that the row with the biggest count is on top etc.
Then the bottom right should have the largest square matrix that is all
zeros.

I don't know if I am making any sense or not but with what I comprehend of
your problem I hope it helps  :-)

Doug Stewart


That's an excellent idea, Doug, and it makes perfect sense. Thanks for the suggestion. I'm also thinking that I can save memory by using just the lower triangle, but doing the same manipulation of columns that I do of rows and identifying the biggest square of zeros, just as in the symmetric case.

Mike

--
Michael B. Miller, Ph.D.
Bioinformatics Specialist
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota


reply via email to

[Prev in Thread] Current Thread [Next in Thread]