Re: [igraph] Choosing between different methods of detecting communities

igraph-help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] Choosing between different methods of detecting communities

From:	Gábor Csárdi
Subject:	Re: [igraph] Choosing between different methods of detecting communities
Date:	Thu, 4 Oct 2012 10:22:07 -0400

On Wed, Oct 3, 2012 at 11:01 PM, 凌琛 <address@hidden> wrote:
> Hi Gabor,
>
> Many thanks. I got it. So if I want to use unweighted version to cluster the
> graph into communities. It is better to sort the edges by weights, and only
> use the top weighted edges, and discard the remaining. Then it is not a full
> graph, and the unweighted method can detect different communities.

Yes, that is one approach if you really want to use the unweighted
method. The problem with it is that it is usually hard to come up with
a sensible threshold to discard the edges. I think it is probably
better to use the weighted version of the algorithm, or another
community structure detection algorithm that also supports weights.

But it all depends on your application.

Gabor

> Best Regards,
> chen
>
>
> On Thu, Oct 4, 2012 at 10:44 AM, Gábor Csárdi <address@hidden> wrote:
>>
>> Hi Chen,
>>
>> you did not do anything wrong. But igraph is not wrong, either. Your
>> graph is a full graph. In other words, it contains all possible edges.
>> For this graph, zero modularity is actually really the maximum you can
>> get. If you split the graph into two (or more) groups, the modularity
>> always gets negative. The best is to keep them in a single community.
>> This is very easy to show analytically, and you can try it using the
>> modularity() function as well, see below. Also, 'ec' contains all
>> modularity values for the different splits along the edge-betweenness
>> splits and they are all negative:
>>
>> graph.isomorphic(graph.full(vcount(G)), G)
>> # TRUE
>>
>> modularity(G, membership=sample(1:2, vcount(G), rep=TRUE), weight=NULL)
>> # [1] -0.01278846
>> modularity(G, membership=sample(1:2, vcount(G), rep=TRUE), weight=NULL)
>> # [1] -0.01278846
>> modularity(G, membership=sample(1:2, vcount(G), rep=TRUE), weight=NULL)
>> # [1] -0.01269231
>>
>> ec$modularity
>> #  [1] -0.025000000 -0.024967949 -0.024903846 -0.024807692 -0.024679487
>> #  [6] -0.024519231 -0.024326923 -0.024102564 -0.023846154 -0.023557692
>> # [11] -0.023237179 -0.022884615 -0.022500000 -0.022083333 -0.021634615
>> # [16] -0.021153846 -0.020641026 -0.020096154 -0.019519231 -0.018910256
>> # [21] -0.018269231 -0.017596154 -0.016891026 -0.016153846 -0.015384615
>> # [26] -0.014583333 -0.013750000 -0.012884615 -0.011987179 -0.011057692
>> # [31] -0.010096154 -0.009102564 -0.008076923 -0.007019231 -0.005929487
>> # [36] -0.004807692 -0.003653846 -0.002467949 -0.001250000  0.000000000
>>
>> If you use edge weights, then things are different, of course, as some
>> parts in your graph will be denser (unless all the edge weighs are
>> equal, but this is not true for your graph), and you'll get "real"
>> communities.
>>
>> Best,
>> Gabor
>>
>> On Wed, Oct 3, 2012 at 10:24 PM, 凌琛 <address@hidden> wrote:
>> > Hi Gabor,
>> >
>> > Sorry, I went to sleep last night.
>> >
>> > I attached the codes and results here.
>> >
>> >> G = read.graph("D:/text mining project/term lists/40 core term
>> >> list/GraphML/50w_lift.graphml",  "graphml")
>> >> ec <- edge.betweenness.community(G, E(G)$weight, FALSE)
>> >> ec
>> > Graph community structure calculated with the edge betweenness algorithm
>> > Number of communities (best split): 15
>> > Modularity (best split): 0.02788969
>> > Membership vector:
>> >  [1]  1  2  3  4  5  2  2  2  6  7  2  8  2  2  2  2  2  2  2  9  2  2
>> > 2  2
>> > 2  2  2 10 11  2 12  2 13  2  2 14  2  2 15  2
>> >> ec <- edge.betweenness.community(G, NULL, FALSE)
>> >> ec
>> > Graph community structure calculated with the edge betweenness algorithm
>> > Number of communities (best split): 1
>> > Modularity (best split): 0
>> > Membership vector:
>> >  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>> > 1 1
>> > 1 1 1 1
>> >>
>> >
>> > Did I do something wrong here? The source data is in the attachment.
>> > Thanks.
>> >
>> > Best Regards,
>> > chen
>> >
>> > On Wed, Oct 3, 2012 at 11:44 PM, Gábor Csárdi <address@hidden>
>> > wrote:
>> >>
>> >> If you get just one community, then you are surely doing something
>> >> wrong (or there is a bug in igraph).
>> >>
>> >> edge.betweenness.community is working fine for our test cases, so
>> >> you'll need to show us a reproducible example that does not work the
>> >> way you expect it to.
>> >>
>> >> Gabor
>> >>
>> >> On Wed, Oct 3, 2012 at 11:34 AM, 凌琛 <address@hidden> wrote:
>> >> > yes, maximun the modularity.
>> >> >
>> >> > Actually when the result is only one community, the modularity is 0.
>> >> >
>> >> > SNAP is a small library developed by Stanford. You can take a look
>> >> > when
>> >> > you
>> >> > have time. Igraph is much more complete, thanks for the development
>> >> > and
>> >> > sharing.
>> >> >
>> >> > Regards,
>> >> > chen
>> >> >
>> >> > On Oct 3, 2012 11:14 PM, "Tamás Nepusz" <address@hidden> wrote:
>> >> >>
>> >> >> > I know that the algorithm is not deterministic. In my case, the
>> >> >> > graph
>> >> >> > only have tens of nodes, while the results are very different.
>> >> >> > In igraph, the unweighted version results in only one community;
>> >> >> > in
>> >> >> > SNAP, there are quite a few communities.
>> >> >>
>> >> >> How does SNAP select the number of communities for the edge
>> >> >> betweenness
>> >> >> method? This is not specified in the original algorithm; igraph just
>> >> >> cuts
>> >> >> the community dendrogram at the point where the modularity is
>> >> >> maximal.
>> >> >> Is it
>> >> >> the same for SNAP?
>> >> >>
>> >> >> Cheers,
>> >> >> T.
>> >> >>
>> >> >>
>> >> >>
>> >> >> _______________________________________________
>> >> >> igraph-help mailing list
>> >> >> address@hidden
>> >> >> https://lists.nongnu.org/mailman/listinfo/igraph-help
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > igraph-help mailing list
>> >> > address@hidden
>> >> > https://lists.nongnu.org/mailman/listinfo/igraph-help
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Gabor Csardi <address@hidden>     MTA KFKI RMKI
>> >>
>> >> _______________________________________________
>> >> igraph-help mailing list
>> >> address@hidden
>> >> https://lists.nongnu.org/mailman/listinfo/igraph-help
>> >
>> >
>> >
>> > _______________________________________________
>> > igraph-help mailing list
>> > address@hidden
>> > https://lists.nongnu.org/mailman/listinfo/igraph-help
>> >
>>
>>
>>
>> --
>> Gabor Csardi <address@hidden>     MTA KFKI RMKI
>>
>> _______________________________________________
>> igraph-help mailing list
>> address@hidden
>> https://lists.nongnu.org/mailman/listinfo/igraph-help
>
>
>
> _______________________________________________
> igraph-help mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/igraph-help
>



-- 
Gabor Csardi <address@hidden>     MTA KFKI RMKI

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [igraph] Choosing between different methods of detecting communities, (continued)
- Re: [igraph] Choosing between different methods of detecting communities, Roey Angel, 2012/10/08
  - Message not available
    - Re: [igraph] Choosing between different methods of detecting communities, Roey Angel, 2012/10/09
    - Re: [igraph] Choosing between different methods of detecting communities, Gábor Csárdi, 2012/10/09
    - Re: [igraph] Choosing between different methods of detecting communities, Roey Angel, 2012/10/09

Prev by Date: Re: [igraph] help with fruchterman reingold output
Next by Date: Re: [igraph] igraph-help Digest, Vol 74, Issue 22
Previous by thread: Re: [igraph] Choosing between different methods of detecting communities
Next by thread: Re: [igraph] Choosing between different methods of detecting communities
Index(es):
- Date
- Thread