From: Tamas Nepusz
Subject: Re: [igraph] problem with large graph
Date: Mon, 2 Nov 2009 19:32:43 +0000
Dear Uri,
I'm not an R expert (I mostly use igraph from C or Python), which is
why I haven't answered your message until now. The size of the graph
you are working with (215K nodes, 3M edges) should be no problem for
igraph's core, at least when it only has to store the edges (without
any additional metadata). For example, the following works for me in R
(version 2.8.1 on openSuSE 11.1; I know that this R version is
outdated, but it is the default in openSuSE):
> xs <- floor(runif(3000000)*215000)
> ys <- floor(runif(3000000)*215000)
> df <- data.frame(xs=xs, ys=ys)
> g <- graph.data.frame(df)
> ecount(g)
[1] 3e+06
> vcount(g)
[1] 215000
> dd <- degree.distribution(g)
> dd
[1] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
[6] 0.000000e+00 0.000000e+00 0.000000e+00 9.302326e-06 2.325581e-05
[...lines omitted here...]
[56] 4.651163e-06 0.000000e+00 0.000000e+00 4.651163e-06
However, it may be that the edge data together with the metadata
attached to each edge is too large for R or for igraph itself. (Note
that at one point the whole graph is stored twice in memory: once as a
data frame and once as an igraph graph, which holds its own copy of
each edge attribute.) So I would first try to construct an igraph
graph without the attributes, i.e. create a data frame with only
profile_id_a and profile_id_b and check whether that works. If it
does, add the attributes back to the data frame one by one to find
where the process breaks down. I'm not familiar with R's internal
details, so you may also be hitting an internal limit somewhere;
maybe Gabor can tell you more about this.
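
A rough sketch of that procedure in R, in case it helps. Only
profile_id_a and profile_id_b are names from your data; the small
random table and the "weight" and "label" columns are made-up
stand-ins for your real attributes. I use graph.data.frame as above
(newer igraph releases also offer it as graph_from_data_frame).

```r
library(igraph)

# Stand-in for the real edge table. "weight" and "label" are hypothetical
# attribute columns; substitute the actual ones.
set.seed(1)
n_edges <- 1000
edges <- data.frame(
  profile_id_a = floor(runif(n_edges) * 100),
  profile_id_b = floor(runif(n_edges) * 100),
  weight = runif(n_edges),
  label = sample(letters, n_edges, replace = TRUE),
  stringsAsFactors = FALSE
)

id_cols <- c("profile_id_a", "profile_id_b")
attr_cols <- setdiff(names(edges), id_cols)

# How much memory the data frame alone takes; the graph needs its own copy.
print(object.size(edges), units = "Kb")

# Step 1: build the graph from the endpoint columns only.
g <- graph.data.frame(edges[, id_cols])
cat("endpoints only:", ecount(g), "edges\n")

# Step 2: add the attribute columns back one at a time and rebuild.
built <- character(0)
for (k in seq_along(attr_cols)) {
  cols <- c(id_cols, attr_cols[seq_len(k)])
  ok <- tryCatch({
    g <- graph.data.frame(edges[, cols])
    TRUE
  }, error = function(e) {
    cat("failed after adding", attr_cols[k], ":", conditionMessage(e), "\n")
    FALSE
  })
  if (!ok) break
  built <- c(built, attr_cols[k])
  cat("ok with", attr_cols[k], "\n")
}
```

tryCatch only sees ordinary R errors (an allocation failure, for
example); if R itself crashes, the last "ok with" line printed at
least tells you which column was added most recently.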
--
Tamas