qemu-s390x



From: Richard Henderson
Subject: Re: [qemu-s390x] [Qemu-devel] [PATCH v1 03/33] s390x: Add one temporary vector register in CPU state for TCG
Date: Tue, 26 Feb 2019 10:36:18 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

On 2/26/19 3:38 AM, David Hildenbrand wrote:
> We sometimes want to work on a temporary vector register instead of the
> actual destination, because source and destination might overlap. An
> alternative would be loading the vector into two i64 variables, but then
> separate handling for accessing the vector elements would be needed.
> This is easier. Add one for now as that seems to be enough.

Hmm, I'll reserve judgment until I see how this is used.

For ARM SVE, I would allocate this temporary on the stack within the helper,
and move one of the operands out of the way.  E.g.

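void helper(foo)(void *vd, void *vx, void *vy)
{
    VectorReg tmp;
    TYPE *d = vd, *x, *y;

    if (vx == vd || vy == vd) {
        /* Save the old destination so overlapping sources stay readable. */
        tmp = *(VectorReg *)vd;
        if (vx == vd) {
            vx = &tmp;
        }
        if (vy == vd) {
            vy = &tmp;
        }
    }
    /* Take the element pointers only after any redirection to tmp. */
    x = vx;
    y = vy;

    /* process d, x, y as normal. */
}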

This minimized the amount of code inline.  However, SVE vectors are quite a bit
larger, at 256 bytes, so the copy itself was out of line most of the time 
anyway.
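
To make the overlap case concrete, here is a minimal self-contained sketch of
the same pattern. Everything in it is hypothetical and for illustration only:
the 128-bit VectorReg layout, the helper name helper_add_reversed, and the
operation itself (d[i] = x[i] + y[3 - i]) are assumptions, not anything from
the patch series. The operation reads y in reverse, so writing d in place
would clobber source elements that are still needed whenever vd == vy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit vector register viewed as four 32-bit elements. */
typedef struct VectorReg {
    uint32_t e[4];
} VectorReg;

/*
 * Hypothetical helper: d[i] = x[i] + y[3 - i].
 * If a source aliases the destination, copy the destination to a stack
 * temporary first and read the source from the copy instead.
 */
void helper_add_reversed(void *vd, void *vx, void *vy)
{
    VectorReg tmp;
    VectorReg *d = vd;
    const VectorReg *x, *y;
    int i;

    if (vx == vd || vy == vd) {
        tmp = *(VectorReg *)vd;
        if (vx == vd) {
            vx = &tmp;
        }
        if (vy == vd) {
            vy = &tmp;
        }
    }
    /* Take the element pointers only after any redirection to tmp. */
    x = vx;
    y = vy;

    for (i = 0; i < 4; i++) {
        d->e[i] = x->e[i] + y->e[3 - i];
    }
}

/* Returns 1 if the in-place result equals the non-overlapping result. */
int overlap_matches_reference(void)
{
    VectorReg x = { { 1, 2, 3, 4 } };
    VectorReg y = { { 10, 20, 30, 40 } };
    VectorReg ref;
    VectorReg inplace = y;

    helper_add_reversed(&ref, &x, &y);             /* no overlap */
    helper_add_reversed(&inplace, &x, &inplace);   /* vd == vy */
    return memcmp(&ref, &inplace, sizeof(ref)) == 0;
}
```

The only cost over the non-overlapping path is one register-sized copy, which
is cheap for a 16-byte vector and only taken when an alias is detected.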

Provisionally,
Reviewed-by: Richard Henderson <address@hidden>


r~


