Re: [igraph] Intersect graphs with different vertices

igraph-help

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] Intersect graphs with different vertices

From:	Magnus Torfason
Subject:	Re: [igraph] Intersect graphs with different vertices
Date:	Fri, 30 Oct 2009 12:05:56 -0400
User-agent:	Thunderbird 2.0.0.23 (Windows/20090812)

Gábor Csárdi wrote:

Magnus Torfason wrote:
If I end up having to write the function, I'd be willing to share it(it seems pretty general), but it would be written in R, not in C,so I'm not sure if it would be suitable for inclusion in igraphitself (Gábor/Tamas?).
I think we can put it in the R version. If we end up having a C
version later, then we will just replace it with keeping the
interface, so the users don't recognize anything.


OK, here goes. I've attached the code I wrote for this. I separated this
from the igraph functions by adding ".by.name" to the function names.

Some of the code is self-evident, but some of the decisions require
judgment calls. Most important is the fact that one does not always
want to perform the same operation on vertices as edges. If an edge
is in the result, the vertices must be in the result as well, but
the converse is not true. So one may want to intersect the edges,
but keep all vertices from one graph (currently supported) or from
both (currently not supported). Function by function:

* graph.intersection.by.name()
    * Defaults to intersecting both edges and vertices
    * Supports keeping all vertices from x
    * Support keeping all vertex attributes from x
    * Supports keeping all edge attributes from x
    * Does NOT support unioning the vertices. The reason
      is that deciding what to do with the attributes in
      the case of union operations is a bit more complex

* graph.union.by.name()
    * Defaults to unioning both edges and vertices.
    * Intersecting vertices while unioning edges makes
      no sense and so is not supported
    * Does not support keeping attributes, since deciding
      how to handle attributes in a union operation is
      a bit more complex

* graph.difference.by.name()
    * Defaults to differencing edges, but keeping x vertices
    * Differencing or intersecting vertices makes no sense,
      so it is not supported
    * Unioning vertices makes some sense, but is not
      supported, since deciding how to handle attributes
      in a union operation is a bit more complex

* get.vertices.as.data.frame()
* get.edges.as.data.frame()
    * Support functions needed for the set functions
    * Can be configured to preserve attributes
    * For undirected graphs, all vertex names in
      the edge data.frame will fulfill (V1<V2)
    * The idea is that
        g == graph.data.frame(
          get.edges.as.data.frame(g,TRUE),
          is.directed(g),
          get.vertices.as.data.frame(g,TRUE)
        )
     * Could be private/hidden, but also seem useful enough
       to be public/exposed to users.

* safer.merge()
     * 'Better' merge function.
     * Needed for the difference function
     * Could be replaced with regular merge and some extra code
     * Should probably be hidden



Full code and examples are attached to the email. The run
of the example code is also included below:

Best regards,
Magnus

############


> # Load library and set parameters for better printing
> library(igraph)
> igraph.par("print.vertex.attributes", TRUE)
> igraph.par("print.edge.attributes", TRUE)
>
> # Define input graphs
> g1 <- graph.formula(a-b-c)
> V(g1)$v.attr=c(1,2,3)
> E(g1)$e.attr=c(5,7)
> g2 <- graph.formula(b-c-d)
>
> # Test the functions
> graph.intersection.by.name(g1,g2) # Vertices are intersected as well
Vertices: 2
Edges: 1
Directed: FALSE
Vertex attributes:
    name
[0]    b
[1]    c
Edges and their attributes:
    e
e [0] 'b' -- 'c'
> graph.union.by.name(g1,g2)        # Vertices are unioned as well
Vertices: 4
Edges: 3
Directed: FALSE
Vertex attributes:
    name
[0]    a
[1]    b
[2]    c
[3]    d
Edges and their attributes:

[0] 'a' -- 'b'
[1] 'b' -- 'c'
[2] 'c' -- 'd'
> graph.difference.by.name(g1,g2)   # Vertices from x (g1) are used
Vertices: 3
Edges: 1
Directed: FALSE
Vertex attributes:
    name
[0]    a
[1]    b
[2]    c
Edges and their attributes:
    e
e [0] 'a' -- 'b'
>
> # graph.intersection.by.name() has some extra parameters
> graph.intersection.by.name(g1,g2,

+ keep.x.vertices = TRUE) # Keep all x vertices (onlyintersect edges)

Vertices: 3
Edges: 1
Directed: FALSE
Vertex attributes:
    name
[0]    a
[1]    b
[2]    c
Edges and their attributes:
    e
e [0] 'b' -- 'c'
>
> graph.intersection.by.name(g1,g2,

+ keep.x.vertices = FALSE, # Keep all x vertices (onlyintersect edges)

+     keep.x.vertex.attributes = TRUE, # Don't throw away V(g1) attributes
+     keep.x.edge.attributes   = TRUE) # Don't throw away E(g1) attributes
Vertices: 2
Edges: 1
Directed: FALSE
Vertex attributes:
    name v.attr
[0]    b      2
[1]    c      3
Edges and their attributes:
    e            e.attr
e [0] 'b' -- 'c'      7
>
> # graph.difference.by.name() has some extra parameters
> graph.difference.by.name(g1,g2,
+     keep.x.vertex.attributes = TRUE, # Don't throw away V(g1) attributes
+     keep.x.edge.attributes   = TRUE) # Don't throw away E(g1) attributes
Vertices: 3
Edges: 1
Directed: FALSE
Vertex attributes:
    name v.attr
[0]    a      1
[1]    b      2
[2]    c      3
Edges and their attributes:
    e            e.attr
e [0] 'a' -- 'b'      5
>


############

# This function checks some very basic pitfalls that can cause
# incorrect merge results, and then merges using the merge function.
#
# Additionally, if indicate.origin=TRUE is specified, it will add 
# variables called was.in.x and was.in.y indicating if an entry was
# in the x/y part of the merge.
safer.merge = function(
        x, y, by = intersect(names(x), names(y)),
        by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
        sort = TRUE, suffixes = c(".x",".y"), incomparables = NULL, 
        indicate.origin=FALSE, allow.duplicates=FALSE, 
        allow.duplicates.x=allow.duplicates, 
allow.duplicates.y=allow.duplicates, 
        ...)
{
    if ( !allow.duplicates.x & (sum(duplicated(x[,by.x]))!=0) )
    {
        stop("safer.merge: Duplicates found in x, but allow.duplicates.x is 
FALSE" )
    }
    if ( !allow.duplicates.y & (sum(duplicated(y[,by.y]))!=0) )
    {
        stop("safer.merge: Duplicates found in y, but allow.duplicates.y is 
FALSE" )
    }
    stopifnot( sum(is.na(x[,by.x])) == 0 )
    stopifnot( sum(is.na(y[,by.y])) == 0 )
    
    if ( indicate.origin )
    {
        stopifnot( !("was.in" %in% names(x) ) )
        stopifnot( !("was.in" %in% names(y) ) )
        x$was.in=TRUE
        y$was.in=TRUE
    }
    
    if ( !sort )
    {
        x$.safer.merge.x.order = sequence(nrow(x))
        y$.safer.merge.y.order = sequence(nrow(y))
    }
    
    result = merge(x, y, by=by, 
        by.x=by.x, by.y=by.y, all=all, all.x=all.x, all.y=all.y,
        sort=sort, suffixes=suffixes, incomparables=incomparables, ...)
    
    if ( !sort )
    {
        result = st.sort(result, .safer.merge.x.order, .safer.merge.y.order)
        result = st.drop(result, .safer.merge.x.order, .safer.merge.y.order)
    }
      
    
    
    if ( indicate.origin )
    {
        was.in.x = paste("was.in", suffixes[1], sep="")
        was.in.y = paste("was.in", suffixes[2], sep="")
        result[ is.na(result[,was.in.x]) , was.in.x] = FALSE
        result[ is.na(result[,was.in.y]) , was.in.y] = FALSE
    }
    
    return(result)
}
    
# Return graph vertices as a data.frame.
# If keep.attributes == TRUE, the data.frame contains all vertex attributes.
get.vertices.as.data.frame = function(graph, keep.attributes=FALSE)
{
    stopifnot( length(V(graph)$name) == length(unique(V(graph)$name))) # Vertex 
names must be unique
    d = data.frame(V=V(graph)$name)
    if ( keep.attributes )
    {
        # Bind attributes if requested
        attr.names = list.vertex.attributes(graph) 
        d.a = as.data.frame(lapply( attr.names, 
                      function(attr, g){get.vertex.attribute(g,attr)},
                      graph))
        names(d.a) = attr.names
        d = cbind(d,d.a)
    }
    return(d)
}

# Return graph edges as a mergable data.frame
# If the graph is undirected, the result will have all V1 <= V2
# If keep.attributes == TRUE, the data.frame contains all edge attributes.
get.edges.as.data.frame = function(graph, keep.attributes=FALSE)
{
    stopifnot( length(V(graph)$name) == length(unique(V(graph)$name))) # Vertex 
names must be unique
    m = get.edgelist(graph)
    if ( !is.directed(graph) )
    {
        ix.rev        = m[,1] > m[,2]       # Choose which pairs to switch
        m[ix.rev,1:2] = m[ix.rev,2:1]       # Switch
    }
    d = as.data.frame(m)
    if ( keep.attributes )
    {
        # Bind attributes if requested
        attr.names = list.edge.attributes(graph) 
        d.a = as.data.frame(lapply( attr.names, 
                      function(attr, g){get.edge.attribute(g,attr)},
                      graph))
        names(d.a) = attr.names
        d = cbind(d,d.a)
    }
    return(d)
}

# Create an intersection of two graphs.
# This function correctly intersects the graphs based
# on the name attributes, such that:
#   Any vertex that is in g1 and g2 is in the result.
#   Any edge that is in g1 and g2 is in the result.
# This function preserves some attributes, but not all.
graph.intersection.by.name = function(g1, g2, 
        keep.x.vertices          = FALSE,
        keep.x.vertex.attributes = FALSE,
        keep.x.edge.attributes   = FALSE)
{
    # Only undirected graphs are supported
    stopifnot(!is.directed(g1) & !is.directed(g2))

    # Construct data.frames of nodes and edges
    dv1 = get.vertices.as.data.frame(g1, 
keep.attributes=keep.x.vertex.attributes)
    dv2 = get.vertices.as.data.frame(g2)
    de1 = get.edges.as.data.frame(g1, keep.attributes=keep.x.edge.attributes)
    de2 = get.edges.as.data.frame(g2)

    # Merge data.frames and construct result
    if ( keep.x.vertices )
        dv = dv1
    else
        dv = safer.merge(dv1, dv2)
    de = safer.merge(de1, de2)
    g  = graph.data.frame(de, directed=FALSE, vertices=dv)
    return(g)
}

# Create union of two graphs.
# This function correctly unions the graphs based
# on the name attributes, such that:
#   Any vertex that is in g1 or g2 is in the result.
#   Any edge that is in g1 or g2 is in the result.
# However, this function does NOT preserve other attributes.
graph.union.by.name = function(g1, g2)
{
    # Only undirected graphs are supported
    stopifnot(!is.directed(g1) & !is.directed(g2))

    # Construct data.frames of nodes and edges
    dv1 = get.vertices.as.data.frame(g1)
    dv2 = get.vertices.as.data.frame(g2)
    de1 = get.edges.as.data.frame(g1)
    de2 = get.edges.as.data.frame(g2)

    # Merge data.frames and construct result
    dv = safer.merge(dv1, dv2, all=TRUE)
    de = safer.merge(de1, de2, all=TRUE)
    g  = graph.data.frame(de, directed=FALSE, vertices=dv)
    return(g)
}

# Create difference of two graphs.
# This function correctly diffs the graphs based
# on the name attributes. Note that this difference applies
# only to the edges, not the vertices, so that:
#   Any vertex that is in g1 is in the result.
#   Any edge that is in g1 but not in g2 is in the result.
# However, this function does NOT preserve other attributes.
graph.difference.by.name = function(g1, g2,
            keep.x.vertex.attributes = FALSE,
            keep.x.edge.attributes   = FALSE)
{
    # Only undirected graphs are supported
    stopifnot(!is.directed(g1) & !is.directed(g2))

    # Construct data.frames of nodes and edges
    dv = get.vertices.as.data.frame(g1, 
keep.attributes=keep.x.vertex.attributes)
    de1 = get.edges.as.data.frame(g1, keep.attributes=keep.x.edge.attributes)
    de2 = get.edges.as.data.frame(g2)

    # Merge data.frames and construct result
    de = safer.merge(de1, de2, all=TRUE, indicate.origin=TRUE) # All edges
    de = de[ (de$was.in.x & !de$was.in.y) ,                    # Remove g2 edges
             setdiff(names(de),c("was.in.x","was.in.y")) ]     # Keep 
attributes from g1     
    g  = graph.data.frame(de, directed=FALSE, vertices=dv)     # The new graph
    return(g)
}


# Test/demonstration code
if (FALSE)
{

# Load library and set parameters for better printing
library(igraph)
igraph.par("print.vertex.attributes", TRUE)
igraph.par("print.edge.attributes", TRUE)

# Define input graphs
g1 <- graph.formula(a-b-c)
V(g1)$v.attr=c(1,2,3)
E(g1)$e.attr=c(5,7)
g2 <- graph.formula(b-c-d)

# Test the functions
graph.intersection.by.name(g1,g2) # Vertices are intersected as well
graph.union.by.name(g1,g2)        # Vertices are unioned as well
graph.difference.by.name(g1,g2)   # Vertices from x (g1) are used

# graph.intersection.by.name() has some extra parameters
graph.intersection.by.name(g1,g2,
    keep.x.vertices          = TRUE) # Keep all x vertices (only intersect 
edges)

graph.intersection.by.name(g1,g2,
    keep.x.vertices          = FALSE, # Keep all x vertices (only intersect 
edges)
    keep.x.vertex.attributes = TRUE, # Don't throw away V(g1) attributes
    keep.x.edge.attributes   = TRUE) # Don't throw away E(g1) attributes

# graph.difference.by.name() has some extra parameters
graph.difference.by.name(g1,g2,
    keep.x.vertex.attributes = TRUE, # Don't throw away V(g1) attributes
    keep.x.edge.attributes   = TRUE) # Don't throw away E(g1) attributes

}

[Prev in Thread]

Current Thread

[Next in Thread]

[igraph] Intersect graphs with different vertices, Magnus Torfason, 2009/10/28
- Re: [igraph] Intersect graphs with different vertices, Gábor Csárdi, 2009/10/30
  - Re: [igraph] Intersect graphs with different vertices, Magnus Torfason <=

Prev by Date: Re: [igraph] Igraph different performances
Next by Date: Re: [igraph] Igraph different performances
Previous by thread: Re: [igraph] Intersect graphs with different vertices
Next by thread: [igraph] community detection algorithms
Index(es):
- Date
- Thread