Re: [igraph] How to read in a large graph (and output a sparse matrix)


From: Tamas Nepusz
Subject: Re: [igraph] How to read in a large graph (and output a sparse matrix)
Date: Mon, 1 Aug 2016 11:23:28 +0200

Hello,

Read_Edgelist() won't work because it assumes that the vertex IDs
are in the range [0; |V|-1], so it creates a vertex for every ID up to
the largest one in the file and you end up with lots of isolated
vertices if your vertex ID range has "gaps" in it. Read_Ncol() is the
way to go, but it has an additional space penalty as it has to
maintain a mapping from the numeric IDs in the file to the range [0;
|V|-1].
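
To illustrate the mapping mentioned above, here is a minimal
pure-Python sketch that renumbers arbitrary vertex IDs into the
contiguous range [0; |V|-1]. This is not igraph's internal
implementation; the function name remap_edges and the sample lines
are hypothetical and only show the idea.

```python
def remap_edges(lines):
    """Map arbitrary integer vertex IDs to 0..n-1 in order of first appearance.

    Hypothetical helper for illustration; not part of igraph.
    """
    index_of = {}   # original ID -> contiguous index
    edges = []      # edge list expressed in contiguous indices
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        # setdefault assigns the next free index the first time an ID is seen
        ui = index_of.setdefault(u, len(index_of))
        vi = index_of.setdefault(v, len(index_of))
        edges.append((ui, vi))
    return index_of, edges


# Made-up sample in the same format as the file described in the thread
sample = ["287111206 357850135", "357850135 12", "12 287111206"]
index_of, edges = remap_edges(sample)
print(len(index_of), edges)
```

The dictionary is where the extra space goes: it needs one entry per
distinct vertex ID on top of the edge list itself.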

igraph requires 32 bytes per edge and 16 bytes per vertex to store the
graph itself, plus additional data structures to store the vertex/edge
attributes. Therefore, a graph of your size would require ~2.5 GB of
memory (62.5M edges x 32 bytes = 2 GB, plus ~31M vertices x 16 bytes =
0.5 GB), not counting the attributes. 8 GB of RAM should therefore be
enough -- however, note that Python might not be able to utilize all
that memory. In particular, 32-bit Python on Windows is limited to 2
or 3 GB of memory (see
https://msdn.microsoft.com/en-us/library/aa366778(v=vs.85).aspx#memory_limits
). If you happen to use a 32-bit Python on a 64-bit machine, you will
need to install a 64-bit Python with a corresponding igraph package
that is also built for 64-bit, and then try again.
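
As a rough check of the numbers above, the following sketch computes
the estimate from the per-edge and per-vertex costs quoted in this
message and reports whether the running interpreter is a 32-bit or
64-bit build. The function names are hypothetical, and the estimate
ignores attribute storage and any temporary memory used while reading
the file.

```python
import struct

BYTES_PER_EDGE = 32    # per-edge cost quoted in this message
BYTES_PER_VERTEX = 16  # per-vertex cost quoted in this message


def estimate_graph_bytes(n_vertices, n_edges):
    """Estimate memory for the graph structure only (no attributes)."""
    return n_edges * BYTES_PER_EDGE + n_vertices * BYTES_PER_VERTEX


def python_bits():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


# 62,500,000 edges and about half that many vertices, as in the question
needed = estimate_graph_bytes(31_250_000, 62_500_000)
print(f"estimated graph size: {needed / 1e9:.1f} GB")
print(f"this Python is a {python_bits()}-bit build")
```

If the second line reports 32, the interpreter cannot address enough
memory for a graph of this size regardless of how much RAM is
installed.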

Best,
T.


On Mon, Aug 1, 2016 at 9:52 AM, Raphael C <address@hidden> wrote:
> I have 8GB of RAM and I have a simple edge list text file of size
> 1.2GB. It has 62500000 edges and about half that many vertices. Each
> line looks like
>
>      287111206 357850135
>
> I would like to read in the graph and output a sparse adjacency
> matrix. I am failing on all counts.  I have tried
>
>
> g = Graph.Read_Edgelist('edges.txt')
>
> but this fails immediately with
>
> MemoryError: Error at vector.pmt:439: cannot reserve space for vector,
> Out of memory
>
> This seems unrelated to the size of the graph and is just a function
> of the node ids being large.
>
> So instead I tried
>
> g = Graph.Read_Ncol('edges.txt')
>
> This eats up all the RAM in my PC forcing me to kill the code.
>
> In fact I tested g = Graph.Read_Ncol('edges.txt') with the first 1/5 of
> the edges and have the same memory problem.
>
> Each node id is a 32 bit integer so the graph should fit easily in 8GB of RAM.
>
> What can I do?
>
> Thanks very much for any help.
> Raphael
>
> _______________________________________________
> igraph-help mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/igraph-help


