igraph-help
From: Gábor Csárdi
Subject: Re: [igraph] fastgreedy.community in R, iterative function to divide huge communities
Date: Tue, 7 Apr 2009 11:33:48 +0200

Janos,

here is a function that can do it; 'maxsize' must be less than the
number of vertices in the graph.

cut.dend <- function(graph, merges, maxsize) {

  vc <- vcount(graph)
  steps <- 0
  ## the first 'vc' entries are the single-vertex communities, the
  ## remaining 'vc-1' entries are filled in as we replay the merges
  sizes <- c(rep(1, vc), rep(0, vc-1))
  msize <- 1
  ## replay the merges until one creates a community larger than 'maxsize'
  while (msize <= maxsize) {
    act <- merges[steps+1,]
    ## merge ids are zero-based, hence the '+1'
    sizes[steps+vc+1] <- sizes[act[1]+1] + sizes[act[2]+1]
    if (sizes[steps+vc+1] > msize) { msize <- sizes[steps+vc+1] }
    steps <- steps + 1
  }

  ## drop the last merge, the one that went over the limit
  steps-1
}

## test it on a random graph
test.g <- simplify(ba.game(100, m=2, dir=FALSE))
comm <- fastgreedy.community(test.g)
steps <- cut.dend(test.g, comm$merges, 20)

## this is the right number of steps
community.to.membership(test.g, comm$merges, steps)$csize

## this is a bit too many already
community.to.membership(test.g, comm$merges, steps+1)$csize
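As an aside, the membership-merging recipe from my earlier mail (quoted
below) can be checked on a small hand-made example; the vectors here are
made up just for illustration:

```r
## two toy membership vectors; ids start at 0, as in igraph
memb1 <- c(0, 0, 1, 1, 1, 2, 2)   # cluster 1 is the one we re-divide
to.rep <- 1
memb2 <- c(0, 0, 1)               # sub-clustering of the three vertices in cluster 1

## shift the new ids past the existing ones ('+1' because ids start at 0)
memb2 <- memb2 + max(memb1) + 1
## splice the sub-clustering into the full vector
memb1[ memb1 == to.rep ] <- memb2
## renumber so the ids are again consecutive and start at 0
memb1 <- as.integer(as.factor(memb1)) - 1
memb1                              # 0 0 2 2 3 1 1
```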

Best,
G.

2009/4/2 Janos Moldvay <address@hidden>:
> Thanks a lot for your answer, Gábor.
>
> One thing that's unclear to me is: what's the best way to cut back the
> dendrogram to reach a desired maximum cluster size? I got these
> cluster sizes and would like to cut back the dendrogram so that I do
> not have clusters larger than 100 nodes.
>
> $csize
>   [1]  286  379  291  304  501 1294  872  229  771   92  401  108  210  133  790
>  [16]   28    7  215   16    6    7   69   53   11    2    2   55    7    3    3
>  [31]    2    2    2  275  194    3    2   13    6    3   11   55    2    2   65
>  [46]   12   44    3   29    2    2    2    2    3    2   19    2    2    3    3
>  [61]    2    8   20   37    2   16    2    6   37    3    4   12    2    2    2
>  [76]    2    3    2    2    2    2    2    2    2    7    3    3    4    2    5
>  [91]    3    4    4    6    7    2    3    2    3    7    2    2    2    2    2
> [106]    4    2    2    2    4    2    3    5    3    3    2    2    5    5    2
> [121]    2    2    4    5    2    2    2    4    2    2    2    2    2    4    2
> [136]    3    2    2    2    2    7    4    8    3    2    3    2    3    2    2
> [151]    2    2    2    2    2    2   12    2    2    2    2    2    2    2    2
> [166]    2    2    2    2    2    3    2    3    2    3    4
>
> Best,
>
> János
>
> 2009/4/2 Gábor Csárdi <address@hidden>
>>
>> Janos,
>>
>> you can certainly do this, but I am not sure that it is a good
>> solution. If you divide the communities recursively, without taking
>> the rest of the graph into account, then you can get suboptimal
>> results.
>>
>> I mean, if you get six clusters with the recursive division, and you
>> also cut the original dendrogram output by the greedy algorithm into
>> six clusters, then the two partitions will differ, and nothing
>> ensures that the recursive division will be better in terms of
>> modularity. Off the top of my head I would guess the opposite, at
>> least most of the time.
>>
>> So I would just cut the dendrogram into the desired number of
>> clusters. Or at least compare the modularity of the recursive division
>> and the 'simple cut' approach.
>>
>> Merging the membership vectors is easy, e.g. assuming you have two of
>> them: 'memb1' is the longer one, and 'memb2' is the replacement for
>> the 'to.rep' elements in 'memb1', you can do something like:
>>
>> ## shift the new ids past all existing ones to keep them unique
>> ## (the '+1' is needed because the ids start at 0)
>> memb2 <- memb2 + max(memb1) + 1
>> ## and then merge them
>> memb1[ memb1==to.rep ] <- memb2
>>
>> Now they are merged into 'memb1', but there might be gaps in the ids
>> in 'memb1'; these can be eliminated by treating 'memb1' as a factor
>> temporarily:
>>
>> memb1 <- as.integer(as.factor(memb1))-1
>>
>> Now they start at 0, and they are consecutive.
>>
>> Best,
>> Gabor
>>
>> 2009/4/1 Janos Moldvay <address@hidden>:
>> > Hello,
>> >
>> > while trying out the fastgreedy.community implementation in R, I
>> > wondered if anyone has tried to build a function around it that
>> > would further divide huge communities into smaller ones. It would
>> > maybe look something like this:
>> >
>> >
>> > g <- read.graph("c:\\data\\sample.ncol", format="ncol")
>> > com <- fastgreedy.community(g)
>> > memb <- community.to.membership(g, com$merges,
>> > steps=(which.max(com$modularity)-1))
>> >
>> > Now I extract the nodes belonging to the biggest cluster, to
>> > further divide it into smaller communities:
>> >
>> > tmp <- which(memb$membership == which.max(memb$csize)-1)-1
>> > g2 <- subgraph(g, tmp)
>> > com2 <- fastgreedy.community(g2)
>> > memb2 <- community.to.membership(g2, com2$merges,
>> > steps=(which.max(com2$modularity)-1))
>> > Since I am not really familiar with R, I am having trouble merging
>> > the two (respectively n) community membership vectors (memb and
>> > memb2) so that I have one community membership per node.
>> > Furthermore, I would continue to iteratively divide large
>> > communities into smaller ones until I either reach a modularity
>> > threshold or all communities are smaller than a defined maximum
>> > cluster size threshold.
>> > I'd appreciate any kind of help very much!
>> >
>> > Thanks in advance
>> > János Moldvay
>> >
>> >
>> > _______________________________________________
>> > igraph-help mailing list
>> > address@hidden
>> > http://lists.nongnu.org/mailman/listinfo/igraph-help
>> >
>> >
>>
>>
>>
>> --
>> Gabor Csardi <address@hidden>     UNIL DGM
>
>



-- 
Gabor Csardi <address@hidden>     UNIL DGM



