
[Treebind] Support of RDF


From: Eric van der Vlist
Subject: [Treebind] Support of RDF
Date: Wed, 29 Jun 2005 12:24:32 +0200

Hi,

I am currently using TreeBind on an RDF/XML vocabulary.

Of course, an RDF/XML document is a well-formed XML document and I could
use the current XML bindings to read and write RDF/XML documents.

However, these bindings focus on the actual XML syntax used to serialize
the document. They don't see the RDF graph behind that syntax and are
sensitive to the "style" used in the XML document.

For instance, these two documents produce very similar triples:   

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://ns.treebind.org/example/">
    <book>
        <title>RELAX NG</title>
        <written-by>
            <author>
                <fname>Eric</fname>
                <lname>van der Vlist</lname>
            </author>
        </written-by>
    </book>
</rdf:RDF>

and

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://ns.treebind.org/example/">
    <book>
        <title>RELAX NG</title>
        <written-by rdf:resource="#vdv"/>
    </book>
    <author rdf:ID="vdv">
        <fname>Eric</fname>
        <lname>van der Vlist</lname>
    </author>
</rdf:RDF>

but the XML bindings will generate a quite different set of objects.

The solution to this problem is to create RDF bindings that will sit on
top of an RDF parser to pour the content of the RDF model into a set of
objects.

The overall architecture of TreeBind has been designed with this kind of
extension in mind, so this should be easy enough.

That being said, design decisions need to be made to define these RDF
bindings and I'd like to discuss them in this forum.

RDF/XML isn't so much an XML vocabulary in the common meaning of this
term but rather a set of binding rules to bind an XML tree into a graph.

These binding rules introduce some conventions that are sometimes
different from what we are used to doing in "raw" XML documents.

In raw XML, we would probably have written the previous example as:

<?xml version="1.0" encoding="UTF-8"?>
<book xmlns="http://ns.treebind.org/example/">
    <title>RELAX NG</title>
    <author>
        <fname>Eric</fname>
        <lname>van der Vlist</lname>
    </author>
</book>

The XML bindings would pour that content into a set of objects using the
following algorithm:

      * find a class that matches the XML expanded name
        {http://ns.treebind.org/example/}book and create an object from
        that class.
      * try to find a method such as addTitle or setTitle with a string
        parameter on this book object and call that method with the
        string "RELAX NG".
      * find a class that matches the XML expanded name
        {http://ns.treebind.org/example/}author and create an object
        from that class.
      * try to find a method such as addFname or setFname with a string
        parameter on this author object and call that method with the
        string "Eric".
      * try to find a method such as addLname or setLname with a string
        parameter on this author object and call that method with the
        string "van der Vlist".
      * try to find a method such as addAuthor or setAuthor with an
        author parameter on the book object and call that method with
        the author object.
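
As an illustration, the steps above can be sketched in a few lines of
Python. This is only a sketch of the idea, not TreeBind's actual
implementation: the Book and Author classes, the CLASSES registry and
the find_method and bind helpers are hypothetical names made up for the
example, and "simple type" is approximated as "has no child elements".

```python
import xml.etree.ElementTree as ET

NS = "{http://ns.treebind.org/example/}"


class Author:
    def setFname(self, value):
        self.fname = value

    def setLname(self, value):
        self.lname = value


class Book:
    def __init__(self):
        self.authors = []

    def setTitle(self, value):
        self.title = value

    def addAuthor(self, author):
        self.authors.append(author)


# Hypothetical registry mapping XML expanded names to classes.
CLASSES = {NS + "book": Book, NS + "author": Author}


def find_method(obj, local_name):
    """Return obj.addXxx or obj.setXxx for an element local name."""
    suffix = "".join(part.capitalize() for part in local_name.split("-"))
    for prefix in ("add", "set"):
        method = getattr(obj, prefix + suffix, None)
        if method is not None:
            return method
    raise AttributeError("no add/set method for %r" % local_name)


def bind(element):
    """Create an object for a complex element and fill it from its children."""
    obj = CLASSES[element.tag]()
    for child in element:
        local_name = child.tag.split("}", 1)[1]
        if len(child) == 0:
            # Simple type: the name selects the method, the value is a string.
            find_method(obj, local_name)(child.text)
        else:
            # Complex type: the name selects both the class and the method.
            find_method(obj, local_name)(bind(child))
    return obj


DOC = """<book xmlns="http://ns.treebind.org/example/">
    <title>RELAX NG</title>
    <author>
        <fname>Eric</fname>
        <lname>van der Vlist</lname>
    </author>
</book>"""

book = bind(ET.fromstring(DOC))
```

After this runs, book.title holds the title string and book.authors
holds a single Author object, which is the object tree described by the
list above.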

We see that there is a difference between the way simple type and
complex type elements are treated.

For a simple type element (such as "title", "fname" and "lname"), the
name of the element is used to determine the method to call and the
parameter type is always string.

For a complex type element (such as author), the name of the element is
used both to determine the method to call and the class of the object
that needs to be created. The parameter type is this class.

This is because when we write in XML <book><author/></book> there is an
implicit expectation that "author" is used both as a complex object and
as a verb.

Unless instructed otherwise, RDF doesn't allow these implicit shortcuts
and an XML element is either a predicate or an object. That's why we
have added a "written-by" element in our RDF example:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://ns.treebind.org/example/">
    <book>
        <title>RELAX NG</title>
        <written-by>
            <author>
                <fname>Eric</fname>
                <lname>van der Vlist</lname>
            </author>
        </written-by>
    </book>
</rdf:RDF>

The first design decision we have to make is how to treat that
"written-by" element.

To have everything in hand to make that decision, let's also look at
the triples for that example:

rapper: Parsing file book1.rdf
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://ns.treebind.org/example/book> .
_:genid1 <http://ns.treebind.org/example/title> "RELAX NG" .
_:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://ns.treebind.org/example/author> .
_:genid2 <http://ns.treebind.org/example/fname> "Eric" .
_:genid2 <http://ns.treebind.org/example/lname> "van der Vlist" .
_:genid1 <http://ns.treebind.org/example/written-by> _:genid2 .
rapper: Parsing returned 6 statements

Two of these triples define resource types:

_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://ns.treebind.org/example/book> .
and
_:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://ns.treebind.org/example/author> .

I propose to use these statements to determine which classes must be
used to create the objects. So far, that's pretty similar to what we're
doing in XML.

Then, we have triples that assign literals to our objects:

_:genid1 <http://ns.treebind.org/example/title> "RELAX NG" .
_:genid2 <http://ns.treebind.org/example/fname> "Eric" .
_:genid2 <http://ns.treebind.org/example/lname> "van der Vlist" .

We can use the predicates of these triples
(<http://ns.treebind.org/example/title>,
<http://ns.treebind.org/example/fname>,
<http://ns.treebind.org/example/lname>) to determine the names of the
setter methods to use to add the corresponding information to the
object. Again, that's exactly what we're doing in XML.

Finally, we have a statement that links two objects together:

_:genid1 <http://ns.treebind.org/example/written-by> _:genid2 .

I think that it is quite natural to use the predicate
(<http://ns.treebind.org/example/written-by>) to determine the setter
method that needs to be called on the book object to set the author
object.
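
To make the proposal concrete, here is a rough Python sketch of the
three rules applied to the six triples above. It is an illustration
under assumptions, not TreeBind code: the triples are copied by hand
from the rapper output instead of coming from an RDF parser, and the
Literal marker class, the CLASSES registry, the setWrittenBy method name
and the find_method and bind_triples helpers are all hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
NS = "http://ns.treebind.org/example/"


class Literal(str):
    """Hypothetical marker: a triple object that is a literal, not a resource."""


class Author:
    def setFname(self, value):
        self.fname = value

    def setLname(self, value):
        self.lname = value


class Book:
    def setTitle(self, value):
        self.title = value

    def setWrittenBy(self, author):
        self.written_by = author


# Hypothetical registry mapping rdf:type URIs to classes.
CLASSES = {NS + "book": Book, NS + "author": Author}

# The six statements reported by rapper for book1.rdf, copied by hand.
TRIPLES = [
    ("_:genid1", RDF_TYPE, NS + "book"),
    ("_:genid1", NS + "title", Literal("RELAX NG")),
    ("_:genid2", RDF_TYPE, NS + "author"),
    ("_:genid2", NS + "fname", Literal("Eric")),
    ("_:genid2", NS + "lname", Literal("van der Vlist")),
    ("_:genid1", NS + "written-by", "_:genid2"),
]


def find_method(obj, local_name):
    """Return obj.addXxx or obj.setXxx for a predicate local name."""
    suffix = "".join(part.capitalize() for part in local_name.split("-"))
    for prefix in ("add", "set"):
        method = getattr(obj, prefix + suffix, None)
        if method is not None:
            return method
    raise AttributeError("no add/set method for %r" % local_name)


def bind_triples(triples):
    # Rule 1: rdf:type statements decide which class to instantiate.
    objects = {s: CLASSES[o]() for s, p, o in triples if p == RDF_TYPE}
    # Rules 2 and 3: any other predicate only names a setter on the subject;
    # the argument is a string for literals, an already created object otherwise.
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        value = str(o) if isinstance(o, Literal) else objects[o]
        # Assumes every predicate lives in the example namespace.
        find_method(objects[s], p[len(NS):])(value)
    return objects


objects = bind_triples(TRIPLES)
book = objects["_:genid1"]
```

Note that no "written-by" object is ever created here: the predicate is
only used to find setWrittenBy on the book, and the author object is
passed to it directly.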

This is different from what we would have done in XML: in XML,
since there is a written-by element, we would have created a
"written-by" object, added the author object to the written-by object
and added the written-by object to the book object.

Does that difference make sense?

I think it does, but the downside is that a simple document like this
one:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://ns.treebind.org/example/">
    <book>
        <title>RELAX NG</title>
        <written-by>
            <author>
                <fname>Eric</fname>
                <lname>van der Vlist</lname>
            </author>
        </written-by>
    </book>
</rdf:RDF>

will give a quite different set of objects depending on which binding
(XML or RDF) is used.

That seems to be the price to pay to try to get as close as possible to
the RDF model.

What do you think?

Earlier, I mentioned that RDF can be told to accept documents with
"shortcuts". What I had in mind is:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://ns.treebind.org/example/">
    <book>
        <title>RELAX NG</title>
        <author rdf:parseType="Resource">
            <fname>Eric</fname>
            <lname>van der Vlist</lname>
        </author>
    </book>
</rdf:RDF>

Here, we have used an attribute rdf:parseType="Resource" to specify that
the author element is a resource.

The triples generated from this document are:

rapper: Parsing file book3.rdf
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://ns.treebind.org/example/book> .
_:genid1 <http://ns.treebind.org/example/title> "RELAX NG" .
_:genid2 <http://ns.treebind.org/example/fname> "Eric" .
_:genid2 <http://ns.treebind.org/example/lname> "van der Vlist" .
_:genid1 <http://ns.treebind.org/example/author> _:genid2 .
rapper: Parsing returned 5 statements

The model is pretty similar except that there is a triple missing (we
now have 5 triples instead of 6).

The triple that is missing is the one that gave the type of the author
element:

_:genid2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://ns.treebind.org/example/author> .

The other difference is that <http://ns.treebind.org/example/author> is
now a predicate.

In this situation, where a predicate links to a resource that has no
type, I propose that we follow the rule we use in XML and use the
predicate to determine both the setter method and the class of the
object to create for the resource.
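
A rough Python sketch of this fallback rule, applied to the five
triples above, could look like the following. Again this is only an
illustration under assumptions: the triples are copied by hand from the
rapper output for book3.rdf, and the Literal marker class, the CLASSES
registry, the setAuthor method name and the find_method and bind_triples
helpers are hypothetical names, not part of TreeBind.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
NS = "http://ns.treebind.org/example/"


class Literal(str):
    """Hypothetical marker: a triple object that is a literal, not a resource."""


class Author:
    def setFname(self, value):
        self.fname = value

    def setLname(self, value):
        self.lname = value


class Book:
    def setTitle(self, value):
        self.title = value

    def setAuthor(self, author):
        self.author = author


# Hypothetical registry, used for rdf:type URIs and for predicate URIs alike.
CLASSES = {NS + "book": Book, NS + "author": Author}

# The five statements reported by rapper for book3.rdf, copied by hand.
TRIPLES = [
    ("_:genid1", RDF_TYPE, NS + "book"),
    ("_:genid1", NS + "title", Literal("RELAX NG")),
    ("_:genid2", NS + "fname", Literal("Eric")),
    ("_:genid2", NS + "lname", Literal("van der Vlist")),
    ("_:genid1", NS + "author", "_:genid2"),
]


def find_method(obj, local_name):
    """Return obj.addXxx or obj.setXxx for a predicate local name."""
    suffix = "".join(part.capitalize() for part in local_name.split("-"))
    for prefix in ("add", "set"):
        method = getattr(obj, prefix + suffix, None)
        if method is not None:
            return method
    raise AttributeError("no add/set method for %r" % local_name)


def bind_triples(triples):
    # Typed resources: rdf:type selects the class.
    objects = {s: CLASSES[o]() for s, p, o in triples if p == RDF_TYPE}
    # Untyped resources: fall back to the predicate that links to them,
    # which then selects both the class and the setter (the XML rule).
    for s, p, o in triples:
        if p != RDF_TYPE and not isinstance(o, Literal) and o not in objects:
            objects[o] = CLASSES[p]()
    # Every non-type predicate names a setter on the subject's object.
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        value = str(o) if isinstance(o, Literal) else objects[o]
        # Assumes every predicate lives in the example namespace.
        find_method(objects[s], p[len(NS):])(value)
    return objects


objects = bind_triples(TRIPLES)
book = objects["_:genid1"]
```

With this rule, _:genid2 becomes an Author object even though no
rdf:type statement says so, because the author predicate is the only
hint available.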

What do you think? Does that make sense?

Thanks,

Eric

-- 
The first 100% XML directory of beekeepers!
                                                http://apiculteurs.info/
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(ISO) RELAX NG   ISBN:0-596-00421-4 http://oreilly.com/catalog/relax
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------




