igraph-help
Re: [igraph] Choosing between different methods of detecting communities


From: Roey Angel
Subject: Re: [igraph] Choosing between different methods of detecting communities
Date: Tue, 09 Oct 2012 12:51:18 +0200

Hi Tamas,
Thanks again. This is exactly my problem. I know about membership(), but that produces a numeric vector, not a list. Even coercing it with as.list() doesn't do the trick, and the function doesn't work.
The problem is to convert the membership vector to a list of communities, each containing its respective nodes, but I can't figure out how to do that. I'm just not familiar enough with these objects to even understand where exactly the node names are stored.

Roey


On 10/08/2012 11:56 PM, Tamás Nepusz wrote:
Hi Roey, 

1. This is rather technical and probably stems from my lack of proper familiarity with igraph. In your minimal examples you generate a mock membership list mcs, which you then pass into the function along with the graph object. I was unable to generate a similar list from the community object one gets from the community detection algorithm.
Use the membership() function on the result of the algorithm:

library(igraph)
g <- grg.game(100, 0.2)
cl <- fastgreedy.community(g)
membership(cl)
 [1] 4 2 2 2 2 4 2 2 4 4 2 4 4 2 2 2 4 2 2 4 4 4 4 4 4 4 4 4 4 4 4 2 2 4 4 2 4
[38] 4 4 2 4 4 2 2 4 2 3 2 4 2 2 4 3 2 1 3 3 3 3 3 3 3 1 3 3 3 1 3 3 3 1 1 3 3
[75] 3 3 1 3 3 3 3 1 1 3 1 3 1 3 3 3 1 1 1 1 1 1 3 1 3 3

See also ?communities in R.
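
If what you need is a list with one element per community, one possible approach is to split the vertex ids by the membership vector. The following is only a rough sketch in base R: memb is a made-up stand-in for the result of membership(cl), and vertex_ids and comms are names invented for this example. It assumes the vector carries vertex names; in igraph these live in the "name" vertex attribute (V(g)$name) when the graph has one, otherwise vertices are identified by their index.

```r
# Hypothetical membership vector standing in for membership(cl):
# names are vertex names, values are community ids.
memb <- c(a = 1, b = 2, c = 1, d = 3, e = 2, f = 1)

# Use vertex names when present, plain vertex indices otherwise.
vertex_ids <- if (is.null(names(memb))) seq_along(memb) else names(memb)

# One list element per community, holding the vertices that belong to it.
# as.vector() strips any extra class or names from the membership object.
comms <- split(vertex_ids, as.vector(memb))

str(comms)
```

Each element of comms can then be handed to a function that expects the vertices of a single community, e.g. via lapply(comms, ...). Depending on your igraph version, communities(cl) may already return such a list directly.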

2. If I understood it correctly, your test tests each community individually and reports its statistic and p value.
Yes, exactly.
 
Following this logic, I guess one should: 1. only report significant communities and 2. choose the community detection method which yields the highest ratio of (sig. communities / total communities). Would you agree?
Well, unfortunately it depends on many other factors. First of all, statistical tests tend to work best with large samples. In other words, if your community is large, you can probably trust the result of the test. On the other hand, if your community is small, you might be better off keeping the community even if it has a large p-value (i.e. lower significance), because statistical tests tend to be conservative on small samples: they may report a higher p-value only because the community is too small to draw conclusions from. I'd rather use the p-values as a rough guideline than as a strict criterion.

(As an example, the p-value of the Mann-Whitney U test is often computed from a normal approximation of its test statistic, and the test statistic can be well approximated by a normal distribution only if the sample size is large.)
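
To make the sample size effect concrete, here is a small sketch using wilcox.test() from base R on made-up numbers (small_p and large_p are names chosen for this example; this is not the community test itself). Both comparisons involve two perfectly separated groups, so only the sample size differs. With three values per group, the exact two-sided test cannot report anything below 2/choose(6, 3) = 0.1, no matter how clear the separation is.

```r
# Two perfectly separated groups with only three values each (made-up data).
small_p <- wilcox.test(c(10, 11, 12), c(1, 2, 3))$p.value

# The same kind of perfect separation, but with ten values per group.
large_p <- wilcox.test(11:20, 1:10)$p.value

# Compare the two p-values side by side.
print(c(small = small_p, large = large_p))
```

So a small community can fail a 0.05 cutoff even when its internal and external values do not overlap at all, which is one reason to treat the p-values as a guideline only.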

Best,
Tamas

