From: Oleksandr Gavenko
Subject: Re: Look for data serialisation format to implement communication between Emacs and external program.
Date: Mon, 07 Jan 2013 15:53:57 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

On 2013-01-07, Helmut Eller wrote:
> On Sun, Jan 06 2013, Oleksandr Gavenko wrote:
>
>> Is that right to use ASN.1 BER as serialisation data format for communication
>> between Emacs and external program?
>
> S-expressions are the only format that Emacs can write and parse quickly
> because the printer and reader are implemented in C. This is likely 10
> times faster than any parser that you write in Emacs Lisp. The downside
> is that the external program needs to be able to do the same. Not such
> a bad tradeoff as S-expressions are fairly easy to parse.
>
> For communication with an external format I recommend a "framed" format:
> a frame is a fixed sized header followed by a variable length payload.
> The header describes the length of the frame. The length should be in
> bytes (not characters, as counting characters in UTF-8 strings is
> unnecessarily complicated). Knowing the length of the frame is very
> useful because that makes it easy to wait for a complete frame. After
> you received a complete frame, parsing is simpler because you don't have
> to worry about incomplete input.
>
> I also recommend to limit the frame length to 24 bits (not 32 bit)
> because Emacs fixnums are limited to 29 bits on 32 bit machines.
>
> The payload can then be an S-expression printed with the Emacs prin1 and
> parsed back with the read function. The encoding of the payload can be
> utf-8. But use the Emacs 'binary coding system for communication with
> the external process and unibyte buffers for parsing. For the
> binary-to-utf8 conversion of the payload use something like
> decode-coding-string (which is written in C and should be fast).
>
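
A rough sketch of the framing described above, in Emacs Lisp. The my-frame-* names and the header layout (six ASCII hex digits, which caps the length at 24 bits) are my own assumptions for illustration; the quoted text only asks for a fixed-size header that holds a byte length.

```elisp
;; Hypothetical helper names; the six-hex-digit header is an assumed layout.

(defconst my-frame-header-size 6
  "Header size in bytes: six hex digits, so at most 24 bits of length.")

(defun my-frame-encode (object)
  "Return a frame for OBJECT: hex byte-length header plus UTF-8 payload."
  (let ((payload (encode-coding-string (prin1-to-string object) 'utf-8)))
    (when (> (length payload) #xffffff)
      (error "Payload too large for a 24-bit length: %d bytes"
             (length payload)))
    ;; `length' of the encoded (unibyte) string counts bytes, not characters.
    (concat (format "%06x" (length payload)) payload)))

(defun my-frame-decode (bytes)
  "Try to take one frame off the front of the unibyte string BYTES.
Return (OBJECT . REST) if a complete frame is present, otherwise nil."
  (when (>= (length bytes) my-frame-header-size)
    (let* ((len (string-to-number
                 (substring bytes 0 my-frame-header-size) 16))
           (end (+ my-frame-header-size len)))
      ;; Only parse once the whole payload has arrived.
      (when (>= (length bytes) end)
        (let ((text (decode-coding-string
                     (substring bytes my-frame-header-size end) 'utf-8)))
          (cons (read text) (substring bytes end)))))))
```

With this layout, (my-frame-encode '(:title "Hello")) gives "000010(:title \"Hello\")", because the payload is 16 bytes long, and (my-frame-decode "00001") gives nil because the header is still incomplete. On a real connection the bytes would presumably come from a process filter, with the process coding system set to binary (for example with set-process-coding-system).
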
This seems like a good solution in the case of Emacs:

(assoc ':title (read "((:type blog-entry) (:title \"Hello\") (:article \"world!\"))"))
;; ==> (:title "Hello")

Data validation:
(read ")") ;; ==> invalid-read-syntax
or a check for when assoc returns an unknown ":type", etc.
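
The two validation steps above could be combined roughly like this. It is only a sketch: my-known-types and my-read-message are hypothetical names, and the message shape (an alist whose :type entry holds a symbol) is taken from the example above.

```elisp
(defconst my-known-types '(blog-entry)
  "Hypothetical list of message types the receiver accepts.")

(defun my-read-message (text)
  "Parse TEXT as an alist message.
Return the alist, or nil if TEXT is malformed or has an unknown :type."
  (condition-case nil
      (let* ((msg (read text))
             (type (cadr (assq :type msg))))
        ;; An absent or unrecognised :type rejects the message.
        (and (memq type my-known-types) msg))
    ;; Covers invalid-read-syntax, end-of-file, and type errors raised
    ;; by assq/cadr when the data is not shaped like an alist.
    (error nil)))
```

Here (my-read-message ")") and (my-read-message "((:type unknown))") both return nil, while a well-formed blog-entry message is returned as parsed.
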
The only annoying thing is escaping (like &lt;div&gt;hello&lt;/div&gt; for
<div>hello</div> in XML, or in PPP's HDLC-like framing, where 0x7e is escaped
as 0x7d 0x5e and the escape character 0x7d as 0x7d 0x5d).
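
As far as I can tell, with S-expressions the escaping inside strings is at least handled by the printer and reader themselves, so nothing has to be escaped by hand on the Emacs side. A small round-trip check (the sample string is made up):

```elisp
(let* ((original "<div class=\"x\">back\\slash</div>")
       ;; prin1-to-string adds the surrounding quotes and escapes
       ;; embedded double quotes and backslashes.
       (printed (prin1-to-string original))
       ;; read undoes that escaping.
       (parsed (read printed)))
  (list printed (equal parsed original)))
```

The second element of the result is t, i.e. the string survives the round trip unchanged. With a length-prefixed frame there is also no need for byte-level escaping of the payload, because the receiver never scans for a delimiter.
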
> If you like, you can also use extra bits in the header to indicate the
> format of the payload. E.g. it might be useful to have frames that
> contain only plain strings (not encoded as S-expr).
>
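
One possible reading of this, as a sketch only: I assume a one-byte tag after the six-hex-digit length, "S" for an S-expression payload and "T" for a plain string. The tag values, the header layout and the my-typed-frame-* names are all hypothetical.

```elisp
(defun my-typed-frame-encode (kind value)
  "Encode VALUE as a frame; KIND is the symbol `sexp' or `text'.
For `text', VALUE must be a string and is sent without S-expr quoting."
  (let* ((text (if (eq kind 'text) value (prin1-to-string value)))
         (payload (encode-coding-string text 'utf-8)))
    ;; Header: six hex digits of byte length, then one tag character.
    (concat (format "%06x%c" (length payload) (if (eq kind 'text) ?T ?S))
            payload)))

(defun my-typed-frame-decode (frame)
  "Decode one complete FRAME produced by `my-typed-frame-encode'."
  (let* ((len (string-to-number (substring frame 0 6) 16))
         (tag (aref frame 6))
         (text (decode-coding-string (substring frame 7 (+ 7 len)) 'utf-8)))
    ;; Plain-text frames skip the reader entirely.
    (if (eq tag ?T) text (read text))))
```

For example, (my-typed-frame-encode 'text "hello") gives "000005Thello", and decoding that frame gives back "hello" without going through read.
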
I started with a custom TLV data format, but the parsing and validation code
is hand-written, so I decided to ask for suggestions...
--
Best regards!