axiom-math
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Axiom-math] address@hidden: Re: [Axiom-developer] RE: dependency visual


From: root
Subject: [Axiom-math] address@hidden: Re: [Axiom-developer] RE: dependency visualization with VCG]
Date: Sat, 30 Aug 2003 16:00:50 -0400

------- Start of forwarded message -------
X-Original-To: address@hidden
X-MessageWall-Score: 0 (mail-gw.bsdwebsolutions.com)
X-MessageWall-Score: 0 (spam-filter-less.bsdwebsolutions.com)
Date: Sat, 30 Aug 2003 00:17:05 -0400
From: William Sit <address@hidden>
Reply-To: address@hidden
Organization: City College of New York
X-Accept-Language: zh,en
To: address@hidden
Cc: address@hidden, address@hidden
Subject: Re: [Axiom-developer] RE: dependency visualization with VCG
Content-Type: text/plain; charset=us-ascii
X-Authentication-Info: Submitted using SMTP AUTH at out003.verizon.net from 
[162.83.137.57] at Fri, 29 Aug 2003 23:17:05 -0500
X-UIDL: XPb"!,hI"!p~#"!=CU!!

Tim:

As I understand this, the problem is to find a linear ordering (for
compilation) on the set of Axiom "domains" (here in this paragraph,
"domain" means either a category or domain constructor, or a package)
such that if domain A precedes domain B in the linear order, then the
compiling of A does not depend on B and only on what precedes A. You
propose a heuristic method to generate this linear order using the
idea of a kernel (set of bootstrap domains) and then the most
frequently depended-on domain among the rest of the domains.

To formulate the problem in terms of graph theory, a few
clarifications are needed. Here (in first paragraph), the set V of
vertices are the set of category constructors, domain constructors and
packages. It is not at all clear one should just limit vertices to
these three objects; it can also be argued that the set of vertices
include the set of functions as defined in these three objects; but
from your email, I'll assume the above is your intention.  For
simplicity, let's refer to elements of V as a vertex rather than a
domain so as not to cause confusion; and at the same time, if the
meaning of vertex changes, the discussion below remains valid when
suitably interpreted.

First, let's specify precisely the binary relation "depend on", that
is, the set E of directed edges (also called arcs). When we say there
is a directed edge from vertex A to vertex B if B "depends on" A
(note: a more intuitive term would be A "comes before" B, with the
understanding that cycles are allowed), do we mean some function or
reference (such as using the "with" clause) in B calls, with some
appropriate set C of parameters for A, either the construction of A
(if A is a category or domain constructor) and/or, in all cases, calls
a function from A?  This seems to Bill Page's working definition.
If so, by this definition, it should be true that
B also "depends on" any vertex in the set C AND the vertices to which
any member of C belongs (that is, a parameter in C need not be a
vertex; for example, it could be an integer).

I believe that Bill Page's algorithm may miss such a dependence as
well as adding dependences that are not real dependences (of course,
there is no harm in the latter case; but it may affect the solution to
the graph-theoretic problem).  But I have to study this more to be
sure.  Bill himself is aware of a number of problems.  In general, I
do not agree with ONLY using the source file with text matching to
decide dependencies (even though that may be a good first
approximation).  One should rely on the compiler itself (assuming it
has no bugs) to discover the dependencies.  But this creates a
catch-22 situation.  An example of a function that the compiler will
catch, but not easily caught by simple pattern matching from the spad
source is something like a + b.  With all the notation embeded in a
pamphlet file, signs like + or - appear everywhere in all sorts of
context.  Even in pure spad files, one cannot easily decide the
meaning of say the minus sign without understanding the context
where it appears.  Observe that the appearance of a declaration for
ELEMENTS in a domain, or DOMAIN in a category is enough to decide a
dependence because operations, functions, or domain constructor must
all involve operands or parameters that belong to domains or
categories.  As far as the problem of is dependence among vertices in
V, exactly which function demonstrates the dependence is not
important.  Thus, it may be reasonable to revise the "depends on" to
mean:  a vertex B depends on a vertex A if an element of A is declared
in the code for B.  Here a category is considered as the set of
domains belonging to the category; a domain is the mathematical set
defined by the construction, and a package is the set of its exported
AND private functions.

At this point, this discussion is a digression but related to how the
multi-digraph of the "depends on" relation is computed and the
subsequent efficiency.  With that in mind, let me continue with the
theoretical version.

Once this multi-digraph G = (V, E) is defined, because of possible
cycles (the word "loop" in digraph is reserved to mean
"self-dependency" in our context, that is, a directed edge from a
vertex to itself), it need not be anti-symmetric and so it is not a
partial ordering. One needs to "break" the cycles. We can easily
simplify the multi-digraph to just a di-graph (that is, we can
eliminate multiple directed edges from A to B for all pairs of
vertices (A, B).  (Bill Page used sorting to accomplish this).
We may put weights on the parallel arcs to reflect some measure of
dependency, so that we can choose the one with the least weight.

Now let's ease up a bit and consider only undirected graphs. To break
the cycles means finding a spanning forest (a forest is a disjoint
union of trees). The depth-first search algorithm accomplishes this as
well as decomposing the graph into its connected components.  If the
graph is weighted, then we can even find minimum spanning forests by
finding a minimum spanning tree for each component.  For example
Kruskal's or Prim's algorithm will do that.  Here, a minimum spanning
tree is one that minimizes the sum of all weights in the tree.

Now the depth-first search algorithm can be modified for digraphs.
The definitions for a few terms will be recalled.  A digraph is
asymmetric if there is at most one arc between any two vertices.  A
directed tree is an asymmetric digraph whose underlying graph (where
arcs become edges) is a tree.  A directed tree that contains exactly
one vertex with indegree 0 is called an out-tree.  A directed forest
is a digraph each of whose components is a directed tree.  A theorem
then states that an out-tree is a rooted-tree whose root is the unique
vertex with indegree 0.  Using an adjacency list for vertices, the
depth-first search produces a spanning directed forest each of whose
components is an out-tree. It is easy to draw the graphs of out-trees and then
to add in the remaining arcs of the digraph.

Once we have the out-trees forming components of the digraph, it is an
easy matter to embed each out-tree (as a relation on vertices) into
a linear order (say list the vertices by the distances from the root,
starting with the root (distance 0); vertices the same distances from
the root may be listed in any order.)

The bootstrap set of vertices are then the roots of the out-trees.
Note that this bootstrap set may change if the digraph grows as Axiom
adds more vertices. However, most of the original structure should
remain unless a very fundamental domain or category is created.

All the algorithms as well as concepts on digraphs may be found in:
Gary Chartrand and Ortrud R. Oellermann, "Applied and Algorithmic
Graph Theory", McGraw Hill, 1993, New York. The depth-first search for digraph
is Algorithm 11.1 on pp. 317-318.

William


root wrote:
> 
> With respect to the notion of optimizing the domain we choose for
> the bootstrap list:
> 
> Consider that we have, say 30 loops, each between 1 (some files
> self-refer) and 20 long. If we write each loop as a list then the goal
> is to choose domains out of the lists so we minimize the total number
> of bootstrap domains. Clearly the best first choice is "the most
> frequent" domain mentioned across all of the lists. We can then
> eliminate all of those lists since we now have a bootstrap file. Next
> we choose the most frequent domain from the remaining lists. This may
> not be optimal but it would be a better algorithm than the one I used
> which was to just choose any domain.
> 
> I'm sure there is a wonderful mathematical name for the bootstrap
> list (e.g. the "kernel"?) but I haven't yet cast it into either
> graph theory or any other theory yet.
> 
> Tim
> address@hidden
> address@hidden
> 
> _______________________________________________
> Axiom-developer mailing list
> address@hidden
> http://mail.nongnu.org/mailman/listinfo/axiom-developer

- -- 
William Sit
Department of Mathematics..............Email: address@hidden
City College of New York..........................Tel: 212-650-5179
Convent Ave at West 138th Street..................Fax: 212-862-0004
New York, NY 10031.....Asian Symposium on Computer Mathematics 2003
USA..........................http://www.mmrc.iss.ac.cn/~ascm/ascm03
------- End of forwarded message -------




reply via email to

[Prev in Thread] Current Thread [Next in Thread]