Re: [PATCH] Optimise memset on i386
From: Christian Franke
Subject: Re: [PATCH] Optimise memset on i386
Date: Fri, 23 Jul 2010 19:34:51 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.10) Gecko/20100504 SeaMonkey/2.0.5

richardvoigt wrote:
> might I suggest:
>
> unsigned long patternl = pattern8;
> patternl |= patternl << 8;
> patternl |= patternl << 16;
> patternl |= patternl << 32;
> patternl |= patternl << 64;
>
> O(lg N) instead of O(N), no loop, no branches, and the compiler should
> be smart enough to optimize away the last two lines on systems with
> narrower long.

The latter is unfortunately not the case. gcc 4.5.0, at least, prints a
warning but still emits code for the out-of-range shifts.

$ cat <<EOF >f.c
unsigned long f(unsigned long x)
{
  x |= x << 32;
  x |= x << 64;
  return x;
}
EOF
$ gcc -O3 -S f.c
x.c: In function ‘f’:
x.c:3: warning: left shift count >= width of type
x.c:4: warning: left shift count >= width of type
$ cat f.s
...
pushl %ebp
movl $32, %ecx
movl %esp, %ebp
movl 8(%ebp), %eax
popl %ebp
movl %eax, %edx
sall %cl, %edx
movl $64, %ecx
orl %eax, %edx
movl %edx, %eax
sall %cl, %eax
orl %edx, %eax
ret
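
One way to avoid the over-wide shifts altogether, as a sketch only: a
shift count greater than or equal to the width of the type is undefined
in C, so the result of the code above cannot be relied upon. Multiplying
the byte by ULONG_MAX / 0xFF replicates it into every byte of an
unsigned long, whatever its width. The name spread_byte is hypothetical
and not part of the patch, and the sketch assumes CHAR_BIT == 8 and an
unsigned long without padding bits.

```c
#include <limits.h>

/* Hypothetical helper, not taken from the patch under discussion:
   replicate one byte into every byte of an unsigned long.
   Assumes CHAR_BIT == 8 and no padding bits in unsigned long. */
static unsigned long spread_byte(unsigned char pattern8)
{
    /* ULONG_MAX / 0xFF is 0x01010101 for a 32-bit long and
       0x0101010101010101 for a 64-bit long, so the product places
       pattern8 in each byte without any shift. */
    return (ULONG_MAX / 0xFFu) * pattern8;
}
```

With this, spread_byte(0xAB) should give 0xABABABAB where long is 32 bits
wide and 0xABABABABABABABAB where it is 64 bits wide. It still has no
loop and no branches, at the cost of one multiplication.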
--
Regards,
Christian Franke