Re: [igraph] "reasonable" size graphs for community detection

igraph-help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] "reasonable" size graphs for community detection

From:	Minh Nguyen
Subject:	Re: [igraph] "reasonable" size graphs for community detection
Date:	Mon, 30 Jul 2012 12:13:11 +1000

Hi Ross,

On Mon, Jul 30, 2012 at 11:44 AM, Ross Gayler <address@hidden> wrote:
> I want to find communities in a graph of ~1M vertices and will be
> running the software on a desktop PC with 16GB RAM and running 64-bit
> linux.  Is it reasonable to expect I could perform that kind of
> analysis on that kind of hardware?  What order of magnitude run-time
> should I expect (seconds, hours, days, weeks)?

The algorithm of (Blondel et al. 2008)

http://arxiv.org/abs/arXiv:0803.0476

is very fast.  Its R interface is

http://igraph.sourceforge.net/doc/R/multilevel.community.html

and its C interface is

http://igraph.sourceforge.net/doc/html/ch22s06.html#igraph_community_multilevel

I haven't used the R interface.  But from my experience with its C
interface, I was able to extract communities from the whole DBLP
dataset

http://www.informatik.uni-trier.de/~ley/db/

in about one hour. (I can't remember exactly the time, but I'm sure it
was less than 24 hours.)  The DBLP dataset has over 1,058,000 nodes.
Using the R interface would take a little longer I suspect.  Other
algorithms that are implemented in igraph have required over a day to
extract communities from DBLP.  In particular, I don't recommend that
you use the following to handle networks with over one million nodes:

* http://igraph.sourceforge.net/doc/R/fastgreedy.community.html
* http://igraph.sourceforge.net/doc/R/label.propagation.community.html
* http://igraph.sourceforge.net/doc/R/leading.eigenvector.html

> The above case of one big graph is probably the worst case scenario.
> I suspect that the edge density is rather low and that the 1M vertices
> can probably be partitioned into thousands of totally disjoint
> subgraphs (which may consitute a reasonable definition of communities
> - but I want to allow for the possibility that there may be
> communities within large, but loosely connected subgraphs).  Would
> that partitioning have a large impact on the runtime of community
> detection?

I don't know.  It can depend on which algorithm you use to extract
communities.  If you want a quick and dirty answer, use (Blondel et
al. 2008).

-- 
Regards,
Minh Van Nguyen
http://bit.ly/mvngu

[Prev in Thread]

Current Thread

[Next in Thread]

[igraph] "reasonable" size graphs for community detection, Ross Gayler, 2012/07/29
- Re: [igraph] "reasonable" size graphs for community detection, Minh Nguyen <=

Prev by Date: [igraph] "reasonable" size graphs for community detection
Next by Date: Re: [igraph] OS X installation - Mountain Lion day
Previous by thread: [igraph] "reasonable" size graphs for community detection
Next by thread: [igraph] Keeping track of reassigned IDs in C
Index(es):
- Date
- Thread