igraph-help
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] graph representation in memory


From: Manu
Subject: Re: [igraph] graph representation in memory
Date: Wed, 25 Mar 2009 16:32:47 +0100

On Wed, Mar 25, 2009 at 1:13 PM, Gábor Csárdi <address@hidden> wrote:
> Emmanuel,
>
> this is great, thanks for sharing it! Nice to see that not too many
> modifications were needed to achieve it.

Thanks of igraph layered design !

> If you want a new branch on
> Launchpad, that is of course possible, just register on bazaar, push
> the branch, tell me its name and then I will add it to the list of
> igraph branches.

Great ! I will do that.

> We can potentially add it to igraph as well, and then you could choose
> between different data representations at compile time. We'll see, it
> would be a nice feature.

That could be nice, but it is also useful to choose data
representation at execution time or at least when compiling the "user"
software. Currently you can choose just by changing compilation flag
(either "igraph" or "igraphdb") or/and the include line (either
#include <igraphdb/igraph.h> or #include <igraph/igraph.h>).
For python layer, I renamed python module from igraph to igraphdb thus
you can in a same python file use igraph and igraphdb. I'll push
so-hacked python interface on launchpad too.

> I don't think that it solves the problem of
> _really_ _very_ superlarge graphs (those probably require specialized
> algorithms and not igraph), but it might be still useful to some
> people.

It always possible to develop such special algorithms over igraph.
Particularly, with database implementation it's become easier to
parallelize programs. Separated instance of igraph can query the same
graph shared by the dadabase. Each instance can also load in memory
the local working on part of the graph. It's just a vague design
idea... but it looks possible.

Anyway database implementation make possible local computation on
_really_ _very_ superlarge graphs.

And at least for many algorithms (such clustering coefficient
computation) only adj_list structure is used, but igraph always need
igraph_t structure, so 6*|E| + 2*|V| memory space is needed
(4*|E|+2*|V| for igraph_t and  2*|E| in worth case for adj_list).
With database implementation, only 2*|E| is needed for adj_list since
igraph_t "is" in the database. Furthermore performance degradation is
only caused by adj_list creation (done in O(|E|)) so it doesn't
depends of algorithms complexity.
But ok It's just permit about 3 times bigger graphs.

> Porting it to 0.6 should not be very difficult.

I'll try to do it in same time that launchpad branch creation.

> The R interface surely does not work with it, unfortunately, but
> that's a problem with the R interface, not with your code.

> May I ask you at which department/faculty/class you're doing this?

We are student from computer science and applied mathematics of
Enseeiht, an engineer school in Toulouse
(http://www.enseeiht.fr/en/index.html). At same time I do a Master 2
in IRIT (the main computer science lab of Toulouse:
http://www.irit.fr/sommaire.php3?lang=en). I work with Bruno Gaume by
who this project was tutored.

Best Regards
Emmanuel N.

> Best Regards,
> Gabor
>
> On Wed, Mar 25, 2009 at 11:18 AM, E. Navarro <address@hidden> wrote:
>> Hello,
>>
>> I was about to send a email presenting a database implementation of
>> igraph basic layer we write with 4 schoolmates as school project
>> (also...). Perhaps it might be helpful for you Mati.
>>
>> We started modifications from 0.5.1. You can find the last -working-
>> source package of our work here :
>> https://wwwsecu.irit.fr/FILEX/get?k=aXR2nQQGgzdvgcbfhf4 (open the link
>> and just wait 5sec)
>> Changes are mainly in igraph.h and basicinterface_db.c which replace
>> type_indexededgelist.c. The Makefile configuration also changes a bit
>> to renamed the compiled library.
>>
>> Currently we work on our own subversion repository. But it could be
>> better to open a new branch in launchpad bazaar, if some people are
>> interested in and If it's ok for you Gabor and Tamas.
>>
>> In few words :
>>  - it use unixOdbc,
>>  - database (mysql) slow down performances from 20 to 100 times,
>>  - we didn't test R layer,
>>  - python layer work fine...
>>
>> Also I plan to improve this implementation, mainly :
>>  - merge the code with igraph 0.6
>>  - store attributes in the database
>>  - create a cache system in physical memory
>>
>> Best regards,
>> Emmanuel N.
>>
>> 2009/3/23 Chris Wj <address@hidden>
>>>
>>> I would be interested in following your progress, learning a bit, and 
>>> making the same modifications. Please post to the mailing list as you make 
>>> updates and the such.
>>>
>>> Question, have you started modifications from the 0.6 main launchpad bazaar 
>>> branch? Or are you modifying 0.5.1/0.5.2?
>>>
>>> Thanks,
>>> Chris
>>>
>>> On Mon, Mar 23, 2009 at 1:47 PM, Gábor Csárdi <address@hidden> wrote:
>>>>
>>>> Hi Mati,
>>>>
>>>> you are completely right, it is better to use (long) int instead of
>>>> double, this was a design mistake on my part.
>>>>
>>>> As for the work, it should not be very difficult. In theory, all
>>>> functions that you need to rewrite are in type_indexededgelist.c.
>>>> Plus, there are three macros in igraph.h that you need to update or
>>>> rewrite as a function: IGRAPH_FROM, IGRAPH_TO and IGRAPH_OTHER.
>>>>
>>>> This is theoretically enough for the C library and probably also for
>>>> Python. For R you need a bit more, you need to write some conversion
>>>> functions and rewrite iterators.R as well.
>>>>
>>>> Many C functions transform the igraph_t into a plain adjacency list,
>>>> or one containing edge ids. For better performance you would need to
>>>> rewrite these too, they are igraph_adjlist_t and igraph_adjedgelist,
>>>> plus lazy versions of these. You can rewrite these just to use the
>>>> normal operations of your new data structure.
>>>>
>>>> Maybe I forgot some minor things, but I thing roughly this is it.
>>>>
>>>> Of course I can help you if you decide to work on this, assuming you
>>>> need any help. :)
>>>>
>>>> May I ask you what kind of data structure you have in mind? Do you
>>>> intend to put it into igraph, or the goal is just to play with it a
>>>> bit?
>>>>
>>>> Btw. I am not sure what you can actually do with a graph with 100
>>>> million nodes and a general graph library. You cannot calculate too
>>>> many things and perhaps you're better off coding them without using
>>>> igraph at all. But I understand that this is a school project. :)
>>>>
>>>> Best,
>>>> Gabor
>>>>
>>>> On Mon, Mar 23, 2009 at 6:15 PM, Mati Vait <address@hidden> wrote:
>>>> > Hello,
>>>> > as a school project we are investigating scalability of graph algorithms
>>>> > on very large undirected graphs. Graphs under question have hundreds of
>>>> > millions of nodes, billions of edges), but we have run into some memory
>>>> > issues with igraph because of it's way of holding graph in memory
>>>> > exhausts ram really fast. We reached this conclusion by looking at
>>>> > comments on igraph_t in include/igraph.h and testing on cluster with
>>>> > 32GB ram for each node. At the moment we are able to hold in memory a
>>>> > graph with 100 million nodes and 300 million edges which uses about 27.9
>>>> > GB of ram (after tweaking igraph_real_t and igraph_integer_t types to
>>>> > use int instead of double).
>>>> > Questions:
>>>> > How difficult it is to implement another way of graph representation
>>>> > instead of current one? Some loss in work time could be tolerated.
>>>> >
>>>> > What code must be reimplemented in order to keep algorithms running?
>>>> >
>>>> >
>>>> >
>>>> > Some insight and references to specific documentation would be really
>>>> > helpful.
>>>> >
>>>> > Thank you for your time,
>>>> > Mati
>>>> >
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > igraph-help mailing list
>>>> > address@hidden
>>>> > http://lists.nongnu.org/mailman/listinfo/igraph-help
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Gabor Csardi <address@hidden>     UNIL DGM
>>>>
>>>>
>>>> _______________________________________________
>>>> igraph-help mailing list
>>>> address@hidden
>>>> http://lists.nongnu.org/mailman/listinfo/igraph-help
>>>
>>>
>>> _______________________________________________
>>> igraph-help mailing list
>>> address@hidden
>>> http://lists.nongnu.org/mailman/listinfo/igraph-help
>>>
>>
>>
>> _______________________________________________
>> igraph-help mailing list
>> address@hidden
>> http://lists.nongnu.org/mailman/listinfo/igraph-help
>>
>
>
>
> --
> Gabor Csardi <address@hidden>     UNIL DGM
>
>
> _______________________________________________
> igraph-help mailing list
> address@hidden
> http://lists.nongnu.org/mailman/listinfo/igraph-help
>




reply via email to

[Prev in Thread] Current Thread [Next in Thread]