[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 4/6] crypto: convert xts_tweak_encdec to use xts
From: |
Alberto Garcia |
Subject: |
Re: [Qemu-devel] [PATCH 4/6] crypto: convert xts_tweak_encdec to use xts_uint128 type |
Date: |
Thu, 11 Oct 2018 14:16:24 +0200 |
User-agent: |
Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu) |
On Tue 09 Oct 2018 05:31:32 PM CEST, Daniel P. Berrangé wrote:
> On Tue, Oct 09, 2018 at 05:30:25PM +0200, Alberto Garcia wrote:
>> >> > for (i = 0; i < lim; i++) {
>> >> > - xts_tweak_encdec(datactx, decfunc, src, dst, (uint8_t *)&T);
>> >> > + xts_uint128 S, D;
>> >> > +
>> >> > + memcpy(&S, src, XTS_BLOCK_SIZE);
>> >> > + xts_tweak_encdec(datactx, decfunc, &S, &D, &T);
>> >> > + memcpy(dst, &D, XTS_BLOCK_SIZE);
>> >>
>> >> Why do you need S and D?
>> >
>> > I think src & dst pointers can't be guaranteed to be aligned
>> > sufficiently for int64 operations, if we just cast from uint8t*.
>>
>> I see. I did a quick test without the memcpy() calls and it doesn't
>> seem to have a visible effect on performance, but if it turns out
>> that it does then maybe this is worth investigating further. I
>> suspect all buffers received by this code are allocated with
>> qemu_try_blockalign() anyway, so it should be safe.
>
> The extra memcpy() calls certainly had a perf impact when I added
> them, so if we can determine that we can safely do without, that would
> be desirable.
So I was having a look at this. From the block layer everything comes
properly aligned. Then there's VirtioCrypto, which seems to allow XTS
mode but I couldn't quite tell from virtio_crypto_sym_op_helper() if all
buffers are guaranteed to be aligned.
What you could do is a runtime check (with QEMU_PTR_IS_ALIGNED()) and
decide what implementation to use.
A couple of additional thoughts:
- x86_64 (and others) allow unaligned memory accesses, and that might be
faster than copying the buffer using memcpy(). I haven't measured it
however.
- qcrypto_block_{encrypt,decrypt}_helper() (used for encrypted block
I/O) use the same buffer for input and output, so maybe it's worth
exploring if this fact allows for additional optimization if you still
need to use memcpy().
Berto
- Re: [Qemu-devel] [PATCH 5/6] crypto: convert xts_mult_x to use xts_uint128 type, (continued)
[Qemu-devel] [PATCH 6/6] crypto: annotate xts_tweak_encdec as inlineable, Daniel P . Berrangé, 2018/10/09
[Qemu-devel] [PATCH 4/6] crypto: convert xts_tweak_encdec to use xts_uint128 type, Daniel P . Berrangé, 2018/10/09
Re: [Qemu-devel] [PATCH 0/6] crypto: improve performance of XTS cipher mode, Marc-André Lureau, 2018/10/09