emacs-devel

Re: libnettle/libhogweed WIP


From: Ted Zlatanov
Subject: Re: libnettle/libhogweed WIP
Date: Mon, 15 May 2017 17:55:34 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)

On Fri, 21 Apr 2017 09:14:02 +0300 Eli Zaretskii <address@hidden> wrote: 

EZ> Something like this at the beginning of the node:
>> 
EZ> All of the functions described here accept argument lists of the
EZ> form @code{(BUFFER-OR-STRING START END CODING-SYSTEM NOERROR)},
EZ> where BUFFER-OR-STRING ...
>> 
>> OK, but I want to link to that from the C function docs too, since they
>> also will reference that format. Would I link to the manual node? Is
>> that OK for C functions, which are kind of standalone as far as docs go?

EZ> If you mean primitives implemented in C that are exposed to Lisp,
EZ> yes.  For internal C functions inaccessible from Lisp, just describe
EZ> that in a comment preceding the functions.

OK, I'll add those links instead of the full text.

EZ> I think both this and the other sub-thread arrived at a point where it
EZ> is important to present a list of use cases we envision and would like
EZ> to support, because these kinds of decisions are prone to errors if we
EZ> don't hold the use cases in our minds.

EZ> So could you please present such a list, describing the source of the
EZ> text to be encrypted/decrypted/hashed, the purpose of the operation
EZ> (i.e. some higher-level context), and the destination where the
EZ> encrypted/decrypted/hashed text will go?  The list doesn't have to be
EZ> exhaustive, but it should include the most important use cases.

Right now, I am mirroring the GnuTLS API. That API is in wide use. So I
think the use cases are a large subset of the general GnuTLS use cases;
we're enabling the same things.

On Fri, 21 Apr 2017 09:21:14 +0300 Eli Zaretskii <address@hidden> wrote: 

>> From: Ted Zlatanov <address@hidden>
>> Date: Thu, 20 Apr 2017 17:54:32 -0400
>> 
>> The KEY is secret and ideally would come from a file and never be
>> seen at the Lisp level. But tests and other use cases may need it from a
>> buffer (more secure but still accessible to Lisp) or a string (visible
>> to all as a function parameter).

EZ> For testing, we could always write the key to a file before using it.
EZ> What other use cases would need the key from other sources?

I think if the key is not on disk, we shouldn't force it to disk just to
fit an always-a-file usage model. OTOH if it's already on disk, we
shouldn't slurp it into memory to fit the always-in-memory usage model.
Both of those situations expose the key data by making copies.
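
To illustrate the no-extra-copies point, here is a rough C sketch, not
code from the patch. `struct source' and `source_next' are hypothetical
names invented for this example. A memory source hands out pointers into
the caller's own data, and a file source reads into a small scratch
buffer, so neither form has to be converted into the other first.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical input descriptor: bytes already in memory, or an open file. */
enum source_kind { SOURCE_MEMORY, SOURCE_FILE };

struct source
{
  enum source_kind kind;
  const unsigned char *data;  /* SOURCE_MEMORY: caller-owned bytes */
  size_t len;                 /* SOURCE_MEMORY: bytes not yet handed out */
  FILE *fp;                   /* SOURCE_FILE: stream to read from */
};

/* Fetch the next block of at most CAP bytes.  For a memory source, *OUT
   points straight into the caller's data (no copy).  For a file source,
   the bytes are read into SCRATCH and *OUT points there.  Return the
   block length, or 0 at end of input or on a read error.  */
static size_t
source_next (struct source *src, unsigned char *scratch, size_t cap,
             const unsigned char **out)
{
  if (src->kind == SOURCE_MEMORY)
    {
      size_t n = src->len < cap ? src->len : cap;
      *out = src->data;
      src->data += n;
      src->len -= n;
      return n;
    }
  *out = scratch;
  return fread (scratch, 1, cap, src->fp);
}
```

The consumer sees the same block-at-a-time interface either way, so the
choice of source stays a detail of the caller.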

>> Getting the INPUT from a file enables large files (not in the first
>> version probably) and other interesting use cases.

EZ> What other cases?  Large files is only theoretically useful, since
EZ> generally Emacs cannot do useful things on files larger than
EZ> most-positive-fixnum, and on 64-bit machines that is far enough to not
EZ> care.

I think anything over 1 MB is pretty big. There are also pipes (pretty
easy to fit the file model) and network streams (probably a separate spec).

EZ> I think we need to weigh flexibility against the complexity, and find
EZ> the optimal balance.  So making the interfaces too complicated for use
EZ> cases that will happen only very rarely, if at all, should be avoided.

I agree in general, BUT these are not end-user APIs. They are for
application developers. They have to be flexible so we don't have to
bolt these things on later. Users will get something much simpler or
they won't even see these interfaces (the application will hide them).

On Fri, 21 Apr 2017 20:45:58 +0200 Lars Ingebrigtsen <address@hidden> wrote: 

LI> Ted Zlatanov <address@hidden> writes:

LI> Hm...  Having a file that just has a passphrase in it sounds like an
LI> unusual use case.  I think in Emacs these tokens would normally come
LI> from auth-source in most applications.  At least that's what I see when I
LI> salivate at use cases.  :-)

Private SSH keys are a good example; see
https://github.com/jschauma/jass for instance. But generally as I said
above, we shouldn't force a copy file->string or string->file if the
private data is already in one form.

LI> Emacs buffers are surprisingly efficient at handling large files:
LI> They're basically just (sort of) contiguous areas of memory with some
LI> structs describing their contents.

OK, but buffers are a copy of the file data. I'd rather not make a copy.

LI> If I understand the code correctly (and I may definitely not be doing
LI> that; I've just skimmed it very, very briefly), you may be able to point
LI> the encryption code at the Emacs buffer contents directly without
LI> copying it anywhere beforehand, and then (since the results are usually
LI> of very similar length) back to the same Emacs buffer afterwards.

LI> 4GB Emacs buffer -> encrypted to 4GB GnuTLS buffer -> 4GB Emacs buffer

LI> instead of

LI> 4GB Emacs buffer -> copy to 4GB gnutls.c buffer -> encrypted to 4GB
LI> GnuTLS buffer -> made into Emacs string or something

Yes, definitely possible. But it's more secure, I think, to read chunks
from the file and process them (possibly overwriting the data) in a
small loop, narrowing the scope and risk of the data exposure. The
GnuTLS APIs are designed for that usage. It's faster but less
interruptible as well. So it's not ideal for every situation, but I
would like to support it.
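
As a rough C sketch of that chunked loop (not code from the patch):
`process_chunk' below is a hypothetical stand-in that just folds bytes
into an FNV-1a value where a real streaming hash or cipher update call
would go, and `wipe', `process_stream' and CHUNK_SIZE are likewise names
made up for this example. Only one small block is in memory at a time,
and it is overwritten as soon as it has been consumed.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 4096

/* Hypothetical stand-in for a streaming update primitive: fold LEN bytes
   of BUF into a running 32-bit FNV-1a value and return the new state.  */
static uint32_t
process_chunk (uint32_t state, const unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      state ^= buf[i];
      state *= 16777619u;
    }
  return state;
}

/* Zero LEN bytes at P through a volatile pointer, so the compiler is
   less likely to drop the stores as dead code.  */
static void
wipe (void *p, size_t len)
{
  volatile unsigned char *vp = p;
  while (len--)
    *vp++ = 0;
}

/* Read FP in CHUNK_SIZE blocks, process each block, and overwrite it
   before reading the next one.  Store the final state in *RESULT and
   return 0, or return -1 on a read error.  */
static int
process_stream (FILE *fp, uint32_t *result)
{
  unsigned char buf[CHUNK_SIZE];
  uint32_t state = 2166136261u;   /* FNV-1a offset basis */
  size_t n;

  while ((n = fread (buf, 1, sizeof buf, fp)) > 0)
    {
      state = process_chunk (state, buf, n);
      wipe (buf, n);
    }
  wipe (buf, sizeof buf);
  if (ferror (fp))
    return -1;
  *result = state;
  return 0;
}
```

This only limits the copies the loop itself makes. Anything the C
library or the OS buffers underneath is outside its control.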

On Fri, 21 Apr 2017 22:15:16 +0300 Eli Zaretskii <address@hidden> wrote: 

EZ> The data will always leave traces, because doing the above involves
EZ> reallocation of memory, so you are likely to leave traces in the page
EZ> file and in memory.  But I don't think you can avoid that, whatever
EZ> you do: as long as data needs to be read into memory to process it, it
EZ> will always leave traces.

Would you agree the tight loop that overwrites the read block will leave
fewer traces and offer fewer exposure opportunities?

LI> The other problem with having a special file handler in the GnuTLS code
LI> is that users will expect to be able to encrypt all files that they see
LI> visible from Emacs, including the ones from Tramp, and application
LI> writers will also have differing opinions on whether encrypting a .gz
LI> file means encrypting the contents of the file or the file itself: That
LI> is, Emacs has a very rich file handler jungle that it would be nice if
LI> still works when you ask Emacs to encrypt something.

LI> You'd have to handle

LI> (file "~/foo")
LI> (file "c:/foo/bar")
LI> (file "Héllo") ; in iso-8859-1
LI> (file "/ssh:host:/tmp/foo")

Right, I understand. I am OK with restricting the API to local files
only but agree the name handling needs to be done carefully.

Lars, I think your read-into-buffer macro would work nicely in a wrapper
API (something like EPA's contexts). We can modify the macro if the
`(file "foo")' spec becomes available, and end users won't know the
difference.

So how about a compromise for now: I can leave the `(file "foo")'
capability out. I'll adjust the docs to remove mentions of it. Then,
after the main patch is done, I can propose a followup patch to
implement `(file "foo")' and we can decide if it's good or bad.

That will get the GnuTLS API integration working, and we can have a
separate discussion about `(file "foo")' later. Lars, Eli, would that be
acceptable?

Thanks!
Ted



