[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [igraph] working with bipartite graphs
From: |
Gábor Csárdi |
Subject: |
Re: [igraph] working with bipartite graphs |
Date: |
Mon, 16 Mar 2009 16:53:47 +0100 |
On Mon, Mar 16, 2009 at 4:13 PM, Jose Quesada <address@hidden> wrote:
[...]
>> summary(dg)
> Vertices: 2901905
> Edges: 10962889
> Directed: FALSE
> No graph attributes.
> Vertex attributes: type.
> No edge attributes.
>
> max degree:
>
> max(stats$degree)
> [1] 5405
I have just committed a function that calculates only the size of the
bipartite projection, without the projection itself. Here is a little
test on a graph of the same size:
############################
library(igraph)
V1 <- 901905
V2 <- 2901905-V1
E <- 10962889
sam1 <- sample(V1, E, rep=TRUE)-1
sam2 <- sample(V2, E, rep=TRUE)-1+V1
g <- graph.bipartite(c(rep(TRUE, V1), rep(FALSE, V2)), rbind(sam1,sam2))
system.time(bp.size <- bipartite.projection.size(g))
user system elapsed
17.527 0.172 17.705
bp.size
$vcount1
[1] 2e+06
$ecount1
[1] 66624273
$vcount2
[1] 901905
$ecount2
[1] 30050183
max(degree(g))
[1] 31
####################################
So, even for this graph, that has low-degree vertices only, altogether
you end up with almost 100M edges in the two projections. This
requires about 3GB of memory, just to store the graphs, actually a bit
more to create them. I suspect that your projections have more edges
because of the high degree, but you can try yourself and then have a
rough estimate about the amount of RAM you need.
[...]
> I agree. I know that this could be a monumental task, but maybe
> listing which functions do not work or the method doesn't make sense,
> for bipartite graphs could be very useful. Having constructors,
> checkers etc for bipartite graphs, but an arsenal of functions that
> are not necessarily working for them is problematic.
If the manual does not say that the function supports bipartite
graphs, then it doesn't.
> Of course, one
> could always look at the code and think carefully if the results make
> sense or not, but most users may not have the skill, time or raw
> need. The warning does sound like a good solution.
[...]
> Yes, but in my case it took me a while to understand that. In other
> circumstances (i.e., had I gotten results according to my hypothesis),
> I would have run with them, not realizing they were not right.
Again, apart from page.rank, what is not right?
> This might be a matter of taste, but a warning of an error is
> preferable to returning results anyway. And slowly incorporating
> functions that do take into account bipartite networks would be the
> way to go.
Which functions? Now that I think about it, I cannot really find any.
Apart from page.rank and maybe modularity what are the functions that
should work differently for bipartite graphs? Even for page rank,
nobody has defined it on a bipartite graph I think.
> So, the rule of thumb is 'it won't work for bipartite, unless stated
> otherwise'. For example, I still don't know if clusters() provides
> sensible results, but following that rule, I should expect it doesn't.
> Is that right?
I don't know. What are sensible results? Why would the connected
components be different if you have a bipartite graph?
G.
[...]
--
Gabor Csardi <address@hidden> UNIL DGM