Re: neon: git for structured data [Was: Functional database]
From: amirouche
Subject: Re: neon: git for structured data [Was: Functional database]
Date: Wed, 21 Feb 2018 19:41:20 +0100
Héllo Roel,
On Wed, 21 Feb 2018 at 17:02, Roel Janssen <address@hidden> wrote:
Dear Amirouche,
I'm not exactly sure if this fits in with your plans, but nevertheless
I'd like to share this code with you.
Thanks for the input.
I recently looked into using triple stores (actually quad stores)
and wrote an interface to Redland librdf for Guile.
Indeed, quad stores. Triple stores hold only:
subject predicate object
whereas quad stores are:
graph subject predicate object
I did not grasp the difference between triple stores and quad stores
until recently. See the W3C definition [0].
[0] https://www.w3.org/TR/rdf11-concepts/#section-rdf-graph
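To make the distinction concrete, here is a minimal sketch in Guile. It uses plain lists as stand-ins for statements; the names `triple`, `quad`, `quad-graph` and `quad->triple` are made up for illustration and belong to neither neon nor librdf.

```scheme
;; Illustration only: plain lists standing in for statements.
;; A triple is (subject predicate object).
(define triple '("Test" "TestPredicate" "TestObject"))

;; A quad is the same statement with a graph name in front:
;; (graph subject predicate object).
(define quad (cons "my-graph" triple))

;; Hypothetical accessors.
(define (quad-graph q) (car q))     ; the graph the statement belongs to
(define (quad->triple q) (cdr q))   ; the statement without its graph
```

Dropping the graph name from a quad gives back the triple, which is why a quad store can be seen as a set of named triple stores.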
I looked at librdf briefly before. In particular, this is interesting:
Storage for graphs in memory and persistently with Oracle Berkeley DB,
MySQL 3-5, PostgreSQL, OpenLink Virtuoso, SQLite, files or URIs.
http://librdf.org/
This is definitely a feature that should be baked into neon.
By the way, WiredTiger is the successor of Oracle Berkeley DB.
It was created by some of the same developers.
The differences between neon and librdf are the following:
- Quads can be versioned in branches without copying (implemented, but
  only on triples so far), making it effectively a quintuple store.
- You can pull / push graphs (called 'world' in librdf, I think),
  i.e. you can neon clone part of the remote data repository, the
  equivalent of git cloning a particular directory (not implemented yet).
- The use of IRIs (or URIs) as 'graph name', 'subject' or 'predicate'
  is not enforced; this doesn't break compatibility with existing
  systems. That said, right now, I will implement 'object' as literals
  as the specification describes them [1] to allow compatibility with
  existing systems.
[1] https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal
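The idea of versioning quads in branches without copying can be sketched as follows. This is only a hypothetical in-memory model, not neon's storage layout: the `parents` table, the `statements` list and the procedures `lineage` and `visible-quads` are assumptions made for the example. Each statement carries a branch as a fifth element, and a branch sees its own statements plus those of its ancestors, so creating a branch copies nothing.

```scheme
(use-modules (srfi srfi-1))   ; for `filter'

;; Hypothetical model: each branch maps to its parent (#f for the root).
(define parents '((master . #f) (feature . master)))

;; Statements are quintuples: (branch graph subject predicate object).
(define statements
  '((master  "g" "Test" "TestPredicate" "TestObject")
    (feature "g" "Test" "OtherPredicate" "OtherObject")))

;; Return BRANCH followed by all of its ancestors.
(define (lineage branch)
  (if branch
      (cons branch (lineage (assq-ref parents branch)))
      '()))

;; Quads visible from BRANCH: those recorded on it or on an ancestor.
(define (visible-quads branch)
  (let ((line (lineage branch)))
    (map cdr
         (filter (lambda (quintuple) (memq (car quintuple) line))
                 statements))))
```

With this model, `(visible-quads 'master)` yields only the quad recorded on `master`, while `(visible-quads 'feature)` yields both quads. Deletions are left out of the sketch; they would need some form of tombstone per branch.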
Also, the API is, I think, simpler in neon:
I attached the source code of the interface.
With this interface, you can write something like this:
--8<---------------cut here---------------start------------->8---
(use-modules (redland rdf)   ; The attached module.
             (system foreign))

(define world (rdf-world-new))
(rdf-world-open world)

(define store (rdf-storage-new
               world
               "hashes"
               "redland"
               "new=true,hash-type='bdb',dir='path/to/triplestore'"))
(define model (rdf-model-new world store %null-pointer))
(define local-uri (rdf-uri-new world "http://localhost:5000/Redland/"))

(define s (rdf-node-new-from-uri-local-name world local-uri "Test"))
(define p (rdf-node-new-from-uri-local-name world local-uri "TestPredicate"))
(define o (rdf-node-new-from-uri-local-name world local-uri "TestObject"))

(define statement (rdf-statement-new-from-nodes world s p o))
(rdf-model-add-statement model statement)
The equivalent of this in neon is basically:
(add context "Test" "TestPredicate" "TestObject")
Where 'context' is the database context, somewhat equivalent to a
'cursor' in PostgreSQL parlance.
The strings are mapped to 64-bit unsigned integers in the underlying
storage to save space and ease comparisons. Subjects and predicates are
each stored in their own table, whose hot parts stay in RAM. That makes
the string-to-integer resolution fast. Basically, I rely on the database
layer to cache the integer value associated with subjects and
predicates, for the time being.
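The string-to-integer mapping can be illustrated with a small interning table. This is a sketch under assumptions: neon keeps the mapping in database tables, whereas the example uses an in-memory Guile hash table, and the names `string->id-table`, `next-id` and `intern!` are hypothetical.

```scheme
;; Hypothetical in-memory stand-in for the subject/predicate tables.
(define string->id-table (make-hash-table))
(define next-id 0)

;; Return the integer associated with STRING, allocating a fresh one
;; the first time the string is seen.
(define (intern! string)
  (or (hash-ref string->id-table string)
      (let ((id next-id))
        (set! next-id (+ next-id 1))
        (hash-set! string->id-table string id)
        id)))
```

Interning the same string twice returns the same integer, so statements can be stored and compared as small tuples of integers rather than as strings.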
Similarly, retrieving a triple can currently be done as follows:
(ref context "Test" "TestPredicate")
It's a minor difference, and the librdf API has the advantage of letting
users do the caching themselves.
(rdf-statement-free statement)
(rdf-model-size model)
(rdf-storage-size store)
;; Example mime-type: application/rdf+xml
(define serializer (rdf-serializer-new world %null-pointer
                                       "text/turtle" %null-pointer))
(define serialized (rdf-serializer-serialize-model-to-string
                    serializer local-uri model))
(format #t "Serialized: ~s~%" (pointer->string serialized))
There is no Turtle support yet.
(rdf-uri-free local-uri)
(rdf-model-free model)
(rdf-storage-free store)
(rdf-world-free world)
--8<---------------cut here---------------end--------------->8---
Kind regards,
Roel Janssen
Thanks Roel!