Re: [igraph] graph representation in memory

On Mon, Mar 23, 2009 at 1:47 PM, Gábor Csárdi <address@hidden> wrote:

Hi Mati,

you are completely right, it is better to use (long) int instead of
double, this was a design mistake on my part.

As for the work, it should not be very difficult. In theory, all
functions that you need to rewrite are in type_indexededgelist.c.
Plus, there are three macros in igraph.h that you need to update or
rewrite as a function: IGRAPH_FROM, IGRAPH_TO and IGRAPH_OTHER.

This is theoretically enough for the C library and probably also for
Python. For R you need a bit more, you need to write some conversion
functions and rewrite iterators.R as well.

Many C functions transform the igraph_t into a plain adjacency list,
or one containing edge ids. For better performance you would need to
rewrite these too, they are igraph_adjlist_t and igraph_adjedgelist,
plus lazy versions of these. You can rewrite these just to use the
normal operations of your new data structure.

Maybe I forgot some minor things, but I thing roughly this is it.

Of course I can help you if you decide to work on this, assuming you
need any help. :)

May I ask you what kind of data structure you have in mind? Do you
intend to put it into igraph, or the goal is just to play with it a
bit?

Btw. I am not sure what you can actually do with a graph with 100
million nodes and a general graph library. You cannot calculate too
many things and perhaps you're better off coding them without using
igraph at all. But I understand that this is a school project. :)

Best,
Gabor

On Mon, Mar 23, 2009 at 6:15 PM, Mati Vait <address@hidden> wrote:
> Hello,
> as a school project we are investigating scalability of graph algorithms
> on very large undirected graphs. Graphs under question have hundreds of
> millions of nodes, billions of edges), but we have run into some memory
> issues with igraph because of it's way of holding graph in memory
> exhausts ram really fast. We reached this conclusion by looking at
> comments on igraph_t in include/igraph.h and testing on cluster with
> 32GB ram for each node. At the moment we are able to hold in memory a
> graph with 100 million nodes and 300 million edges which uses about 27.9
> GB of ram (after tweaking igraph_real_t and igraph_integer_t types to
> use int instead of double).
> Questions:
> How difficult it is to implement another way of graph representation
> instead of current one? Some loss in work time could be tolerated.
>
> What code must be reimplemented in order to keep algorithms running?
>
>
>
> Some insight and references to specific documentation would be really
> helpful.
>
> Thank you for your time,
> Mati
>
>
>
> _______________________________________________
> igraph-help mailing list
> address@hidden
> http://lists.nongnu.org/mailman/listinfo/igraph-help
>

--
Gabor Csardi <address@hidden> UNIL DGM

_______________________________________________
igraph-help mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/igraph-help

From:	Chris Wj
Subject:	Re: [igraph] graph representation in memory
Date:	Mon, 23 Mar 2009 18:35:57 -0400