
[igraph] Fwd: V(g) slow in a loop


From: Tony Larson
Subject: [igraph] Fwd: V(g) slow in a loop
Date: Thu, 18 Feb 2016 15:13:57 +0000


---------- Forwarded message ----------
From: Tony Larson <address@hidden>
Date: 15 February 2016 at 22:13
Subject: Re: [igraph] V(g) slow in a loop
To: Help for igraph users <address@hidden>


Sorry, I was previously trying to explain from a smartphone!

Here's a toy example that shows an approximately 7x slowdown when the V() accessor is used inside a loop. I know that using a loop in this way is pretty nonsensical, but with my real data a loop is required, as I make multiple logical comparisons between several V() attributes and other external data. In the second example below, the speed increase comes at the expense of first creating a new vector, vx. I want to avoid this if at all possible, as it seems inefficient to create copies of all the necessary V(g) attributes in R memory:

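library(igraph)

n <- 100000
## random edge list: each vertex id appears once as a source and once as a target
edges <- data.frame(from = sample(n), to = sample(n))
g <- graph_from_data_frame(edges, directed = TRUE)
## numeric vertex attribute x with values in {1, 2, 3}
V(g)$x <- floor(runif(vcount(g), 1, 4))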

## extract V(g)$x before loop (fast)
make.vector <- system.time(vx <- V(g)$x)
> make.vector
user  system elapsed
  0.007   0.000   0.00

y <- floor(runif(5000, 1, 4))

## directly query x in the loop using the V(g) accessor (slow)
res1 <- integer(length(y))
out1 <- system.time(for (i in seq_along(y)) {
  res1[i] <- which(V(g)$x == y[i])[1]
})

> out1
user  system elapsed
 43.893   0.736  44.587


## use previously vectorized V(g)$x instead (fast)
res2 <- integer(length(y))
out2 <- system.time(for (i in seq_along(y)) {
  res2[i] <- which(vx == y[i])[1]
})

> out2
 user  system elapsed
 6.412   0.000   6.407

> all(res1 == res2)
[1] TRUE
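
For the equality lookup in this toy example specifically, the loop can probably be dropped altogether. The sketch below uses base R only, so vx here is a plain numeric vector standing in for an attribute extracted once with V(g)$x; the vector sizes and the seed are illustration choices, not taken from the real data. It checks that match() returns the same first-match positions as the which(...)[1] loop. It does not cover the case of several combined logical comparisons described above.

```r
set.seed(1)

# Stand-ins for the toy data: vx plays the role of V(g)$x extracted once.
vx <- floor(runif(100000, 1, 4))
y  <- floor(runif(5000, 1, 4))

# Loop version, as in the message: first index in vx equal to each y[i].
res_loop <- integer(length(y))
for (i in seq_along(y)) {
  res_loop[i] <- which(vx == y[i])[1]
}

# Vectorised version: match() returns the position of the first match
# of each element of y in vx (NA where there is none).
res_match <- match(y, vx)

stopifnot(identical(res_loop, res_match))
```

Because match() works on an ordinary vector, it still needs one extracted copy of the attribute, but that copy is made once rather than on every iteration.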


> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Fedora 20 (Heisenbug)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] CAMERA_1.26.0       igraph_1.0.1        xcms_1.46.0
[4] Biobase_2.30.0      ProtGenerics_1.2.1  BiocGenerics_0.16.1
[7] mzR_2.4.0           Rcpp_0.12.2

loaded via a namespace (and not attached):
 [1] graph_1.48.0        Formula_1.2-1       cluster_2.0.3
 [4] magrittr_1.5        MASS_7.3-45         splines_3.2.2
 [7] munsell_0.4.2       colorspace_1.2-6    lattice_0.20-33
[10] stringr_1.0.0       plyr_1.8.3          tools_3.2.2
[13] nnet_7.3-11         grid_3.2.2          gtable_0.1.2
[16] latticeExtra_0.6-26 survival_2.38-3     RBGL_1.46.0
[19] digest_0.6.8        gridExtra_2.0.0     RColorBrewer_1.1-2
[22] reshape2_1.4.1      ggplot2_1.0.1       acepack_1.3-3.3
[25] codetools_0.2-14    rpart_4.1-10        stringi_1.0-1
[28] scales_0.3.0        Hmisc_3.17-0        stats4_3.2.2
[31] foreign_0.8-66      proto_0.3-10






>
> vx <- V(g)$x
>
>
> out <- system.time(for(i in 1:5000)
+ {
+ res <- which(V(g)$x == 2)
+ })
>
>
> out2 <- system.time(for(i in 1:5000)
+ {
+ res <- which(vx == 2)
+ })
>
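
One way to narrow down where the time goes is to time the pieces of V(g)$x separately. The following is only a measurement sketch under stated assumptions: the ring graph from make_ring(), the graph size and the repetition count reps are arbitrary illustration choices, and vertex_attr() is used as the function form of attribute lookup in igraph. Nothing in this thread establishes which variant is faster on a given igraph version, so no timings are shown.

```r
library(igraph)

set.seed(1)
n <- 10000                       # assumed size, smaller than the real graph
g <- make_ring(n)                # any graph will do for this measurement
V(g)$x <- floor(runif(vcount(g), 1, 4))

reps <- 200                      # assumed repetition count

# Cost of building the vertex sequence alone.
t_vs <- system.time(for (i in seq_len(reps)) vs <- V(g))

# Cost of building the vertex sequence and then reading the attribute.
t_vx <- system.time(for (i in seq_len(reps)) a1 <- V(g)$x)

# Cost of reading the attribute through vertex_attr().
t_va <- system.time(for (i in seq_len(reps)) a2 <- vertex_attr(g, "x"))

print(c(V           = t_vs[["elapsed"]],
        V_dollar_x  = t_vx[["elapsed"]],
        vertex_attr = t_va[["elapsed"]]))

# Both access paths should return the same attribute values.
stopifnot(all(a1 == a2))
```

If the first two timings turn out to be close, most of the per-iteration cost would be in constructing V(g) rather than in extracting x.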



On 15 February 2016 at 19:34, Gábor Csárdi <address@hidden> wrote:
Hi, can you send a reproducible example? See
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Gabor

On Mon, Feb 15, 2016 at 6:28 PM, Tony Larson <address@hidden> wrote:
>
> Hi,
> I'm accessing a vertex attribute in R using V(g)$x,  where x is a named
> numeric attribute. If I do this for the whole graph (about 10e5 vertices),
> it takes a few ms to get a vector of x values,
>
> vx <- V(g)$x
>
> If I then use vx as a target vector  in an R loop to search through about
> 10e3 candidate y values for x, it takes maybe 100 ms,
>
> for(i in 1:length(y))
> {
> z <- which(vx > y[i])
> }
> However,  if I substitute V(g)$x for vx INSIDE the loop,  it takes about 5s
> - more than 50x slower. Why is this?
>
> Thanks
> Tony
>
> Dr. Tony R. Larson
> CNAP
> Department of Biology, Area 7
> University of York
> Wentworth Way
> Heslington
> York YO10 5DD
> UK
>
> Tel: +44(0)1904 328 826 (office)
> Tel: +44(0)7833 471 685 (mobile)
>
> address@hidden
>
> http://scholar.google.com/citations?user=9hLFka4AAAAJ
>
>
>
>
> _______________________________________________
> igraph-help mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/igraph-help
>





