igraph-help

Re: [igraph] How to read in a large graph (and output a sparse matrix)


From: Tamas Nepusz
Subject: Re: [igraph] How to read in a large graph (and output a sparse matrix)
Date: Mon, 1 Aug 2016 15:57:22 +0200

Yes, it's probably best if you do the relabeling externally. Let
us know if it still doesn't work after using Read_Edgelist() with a
relabeled file.
T.
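
The external relabeling suggested above could be sketched as follows. This is a minimal illustration, not the method anyone in the thread actually used: the function names and the output filename are hypothetical, and it assumes every line holds two whitespace-separated integer IDs. It maps each ID to a consecutive index in order of first appearance, so the result satisfies the [0; |V|-1] range that Read_Edgelist() expects.

```python
def relabel_edges(lines):
    """Yield (src, dst) pairs with vertex IDs remapped to 0..n-1.

    IDs are numbered in order of first appearance. Assumes each
    non-blank line contains two whitespace-separated integer IDs.
    """
    ids = {}  # original ID -> consecutive index
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # len(ids) is evaluated before insertion, so a new ID
        # receives the next unused index.
        src = ids.setdefault(int(parts[0]), len(ids))
        dst = ids.setdefault(int(parts[1]), len(ids))
        yield src, dst


def relabel_file(src_path, dst_path):
    """Write a relabeled copy of an edge list file (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for a, b in relabel_edges(src):
            dst.write("%d %d\n" % (a, b))


# Intended use (not run here; the output filename is an assumption):
#   relabel_file("edges.txt", "edges_relabeled.txt")
#   g = Graph.Read_Edgelist("edges_relabeled.txt")
```

Note that the dictionary holding the mapping is itself not free for tens of millions of vertices, so for a file of this size a sort-based or array-based relabeling may be the leaner choice.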


On Mon, Aug 1, 2016 at 2:37 PM, Raphael C <address@hidden> wrote:
> Thank you for the quick reply. My system is certainly 64 bit. The
> problem is just the amount of RAM
>
> g = Graph.Read_Ncol('edges.txt')
>
> uses it seems.
>
> Here is some code to produce a fake edge list that reproduces my problem.
>
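> import random
>
> # Number of edges and vertices
> m = 62500000
> n = m // 2  # integer division, so randint() gets an integer bound
>
> for i in range(m):
>     # Zero-pad the IDs to nine digits so every line has the same width
>     fromnode = str(random.randint(0, n - 1)).zfill(9)
>     tonode = str(random.randint(0, n - 1)).zfill(9)
>     print(fromnode, tonode)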
>
> If I produce a file edges.txt using this code and then run
>
> from igraph import Graph
> g = Graph.Read_Ncol('edges.txt')
>
> it runs out of RAM.
>
> To get a better picture of the RAM usage I ran the same test with m =
> 20000000 (that is about one third of the edges and vertices).
>
> /usr/bin/time -v python ./test.py
>
> shows
>
> Maximum resident set size (kbytes): 3172988
>
> With m = 30000000 I see Maximum resident set size (kbytes): 4750440
>
> Maybe one solution is to relabel the nodes myself externally so I can
> avoid the overhead of Ncol?
>
> Raphael
>
>
>
>
> On 1 August 2016 at 10:23, Tamas Nepusz <address@hidden> wrote:
>> Hello,
>>
>> Read_Edgelist() won't work because that assumes that the vertex IDs
>> are in the range [0; |V|-1] so it would create lots of isolated
>> vertices if your vertex ID range has "gaps" in it. Read_Ncol() is the
>> way to go, but it has an additional space penalty as it has to
>> maintain a mapping from the numeric IDs in the file to the range [0;
>> |V|-1].
>>
>> igraph requires 32 bytes per edge and 16 bytes per vertex to store the
>> graph itself, plus additional data structures to store the vertex/edge
>> attributes. Therefore, a graph of your size would require ~2.5 GB of
>> memory plus the attributes. 8 GB of RAM should therefore be enough --
>> however, note that Python might not be able to utilize all that
>> memory. In particular, 32-bit Python on Windows is limited to 2 or 3
>> GB of memory (see
>> https://msdn.microsoft.com/en-us/library/aa366778(v=vs.85).aspx#memory_limits
>> ). If you happen to use a 32-bit Python on a 64-bit machine, you will
>> need to install a 64-bit Python with a corresponding igraph package
>> that is also built for 64-bit, and then try again.
>>
>> Best,
>> T.
>>
>>
>> On Mon, Aug 1, 2016 at 9:52 AM, Raphael C <address@hidden> wrote:
>>> I have 8GB of RAM and I have a simple edge list text file of size
>>> 1.2GB. It has 62500000 edges and about half that many vertices. Each
>>> line looks like
>>>
>>>      287111206 357850135
>>>
>>> I would like to read in the graph and output a sparse adjacency
>>> matrix. I am failing on all counts. I have tried
>>>
>>>
>>> g = Graph.Read_Edgelist('edges.txt')
>>>
>>> but this fails immediately with
>>>
>>> MemoryError: Error at vector.pmt:439: cannot reserve space for vector,
>>> Out of memory
>>>
>>> This seems unrelated to the size of the graph and is just a function of
>>> the node ids being large.
>>>
>>> So instead I tried
>>>
>>> g = Graph.Read_Ncol('edges.txt')
>>>
>>> This eats up all the RAM in my PC forcing me to kill the code.
>>>
>>> In fact I tested g = Graph.Read_Ncol('edges.txt') with the first 1/5 of
>>> the edges and have the same memory problem.
>>>
>>> Each node id is a 32 bit integer so the graph should fit easily in 8GB of
>>> RAM.
>>>
>>> What can I do?
>>>
>>> Thanks very much for any help.
>>> Raphael
>>>
>>> _______________________________________________
>>> igraph-help mailing list
>>> address@hidden
>>> https://lists.nongnu.org/mailman/listinfo/igraph-help
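
As a rough check on the figures quoted in this thread, the sketch below recomputes the ~2.5 GB storage estimate from the stated 32 bytes per edge and 16 bytes per vertex, and extrapolates the two measured Read_Ncol() peaks (m = 20000000 and m = 30000000) to the full 62500000 edges. The function names are illustrative, and linear growth of peak memory with the edge count is an assumption of this sketch, not something the thread confirms.

```python
BYTES_PER_EDGE = 32    # figure quoted in the thread
BYTES_PER_VERTEX = 16  # figure quoted in the thread


def graph_storage_bytes(num_edges, num_vertices):
    """Memory for the graph structure itself, excluding attributes."""
    return BYTES_PER_EDGE * num_edges + BYTES_PER_VERTEX * num_vertices


def extrapolate_peak_kb(sample_a, sample_b, target_edges):
    """Linearly extrapolate peak RSS (in kbytes) from two (edges, kbytes) samples.

    Assumes peak memory grows linearly with the number of edges.
    """
    (m_a, kb_a), (m_b, kb_b) = sample_a, sample_b
    kb_per_edge = (kb_b - kb_a) / (m_b - m_a)
    return kb_b + kb_per_edge * (target_edges - m_b)


if __name__ == "__main__":
    m = 62500000
    n = m // 2
    print(graph_storage_bytes(m, n) / 1e9, "GB for the graph structure")
    peak_kb = extrapolate_peak_kb((20000000, 3172988), (30000000, 4750440), m)
    print(round(peak_kb / 1024 ** 2, 1), "GiB projected peak for Read_Ncol()")
```

Under that linearity assumption the projected peak lies above 8 GB, which is consistent with the reported out-of-memory behaviour even though the graph structure alone would fit.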


