emacs-devel

[Po Lu] Re: [Various] HAVE_FAST_UNALIGNED_ACCESS


From: Robert Pluim
Subject: [Po Lu] Re: [Various] HAVE_FAST_UNALIGNED_ACCESS
Date: Fri, 31 Mar 2023 11:43:19 +0200

--- Begin Message ---
Subject: Re: [Various] HAVE_FAST_UNALIGNED_ACCESS
Date: Fri, 31 Mar 2023 16:26:13 +0800

Thanks, Robert.  You also missed this one intended for Mattias:

From: Po Lu <luangruo@yahoo.com>
To: Mattias Engdegård <mattiase@acm.org>
Cc: Robert Pluim <rpluim@gmail.com>,  emacs-devel@gnu.org
Subject: Re: HAVE_FAST_UNALIGNED_ACCESS
In-Reply-To: <87A3D742-5B8B-4F6B-805B-67AB8B4ECFF5@acm.org> ("Mattias
        Engdegård"'s message of "Thu, 30 Mar 2023 12:28:30 +0200")
References: <87sfdmlgzx.fsf@gmail.com>
        <87A3D742-5B8B-4F6B-805B-67AB8B4ECFF5@acm.org>
X-Draft-From: ("nnimap+imap.mail.yahoo.com:Inbox" 259379)
Date: Thu, 30 Mar 2023 19:49:42 +0800
Message-ID: <877cuyv4q1.fsf@yahoo.com>
User-Agent: Gnus/5.13 (Gnus v5.13)

P.S. here's another example of what I mean, on x86_64:

void
alignment_test (char *c)
{
  register long i, *x;

  x = (long *) c;

  for (i = 0; i < 64; ++i)
    x[i] += 1;
}

with `gcc7 -O3', this generates:

alignment_test:
        movq    %rdi, %rax <------- x1 = x
        xorl    %ecx, %ecx
        shrq    $3, %rax <--------- x1 /= 8
        andl    $1, %eax <--------- x1 &= 1
        je      .L2      <--------- if x1 is 0, assume x is already 16 byte aligned
        addq    $1, (%rdi) <------- otherwise, assume it is only 8 byte aligned.
                                    Add 1 to the first long.
        movl    $1, %ecx
.L2:
        movl    $64, %r9d
        movdqa  .LC0(%rip), %xmm1 <----------- load vector of 1s
        subq    %rax, %r9
        leaq    (%rdi,%rax,8), %rsi <--------- x2 = x + (x1 * 8)
        xorl    %edx, %edx
        movq    %r9, %r8
        xorl    %eax, %eax          <--------- i = 0
        shrq    %r8
.L3:
        movdqa  (%rsi,%rax), %xmm0 <--------- Load 16 byte aligned x2[i]!!
        addq    $1, %rdx
        paddq   %xmm1, %xmm0
        movaps  %xmm0, (%rsi,%rax) <--------- Store 16 byte aligned x2[i]!!
        addq    $16, %rax          <--------- Next 16 bytes...
        cmpq    %r8, %rdx
        jb      .L3
        movq    %r9, %rdx
        andq    $-2, %rdx
        cmpq    %rdx, %r9
        leaq    (%rdx,%rcx), %rax
        je      .L4
        addq    $1, (%rdi,%rax,8)
.L4:
        rep ret
.LC0:
        .quad   1
        .quad   1
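
To restate the prologue arithmetic above in C (a sketch only: peel_count
and vector_start are made-up names, and it assumes 8 byte longs as on
x86_64):

```c
#include <stdint.h>

/* Hypothetical helper: number of longs handled one at a time before
   the vector loop.  Mirrors `shrq $3' followed by `andl $1'.  */
unsigned
peel_count (uintptr_t addr)
{
  return (addr >> 3) & 1;
}

/* Hypothetical helper: the address the vector loop starts at, i.e.
   x2 in the annotations.  Assumes sizeof (long) == 8.  */
uintptr_t
vector_start (uintptr_t addr)
{
  return addr + 8 * (uintptr_t) peel_count (addr);
}
```

For an address such as 0x1008, vector_start yields 0x1010, which is 16
byte aligned.  For 0x1004 or 0x100c the result is still 4 bytes past a
16 byte boundary, since the computation only ever looks at bit 3 of the
address and takes the low three bits to be zero.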

movdqa traps if its memory operand is not aligned to 16 bytes.  When
`alignment_test' is called with an argument that is not 8 byte aligned,
x2 is not 16 byte aligned either, so that is what will happen.
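
For comparison, here is a sketch of a variant that should be safe for
any alignment.  It uses the usual memcpy idiom, which is an assumption
on my part and not necessarily what Emacs does; alignment_test_safe is
a hypothetical name:

```c
#include <string.h>

/* Hypothetical variant of alignment_test: add 1 to each of the 64
   longs stored at C, without assuming that C is aligned.  */
void
alignment_test_safe (char *c)
{
  long i, tmp;

  for (i = 0; i < 64; ++i)
    {
      /* Copy through a properly aligned temporary instead of
         dereferencing a possibly misaligned `long *'.  */
      memcpy (&tmp, c + i * sizeof tmp, sizeof tmp);
      tmp += 1;
      memcpy (c + i * sizeof tmp, &tmp, sizeof tmp);
    }
}
```

Since no `long *' is ever formed from c, the compiler has nothing from
which to infer 8 byte alignment.  On x86_64, compilers typically turn
such fixed-size memcpy calls into plain unaligned loads and stores.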

--- End Message ---

Robert
-- 
