From: Björn Haase
Subject: [avr-libc-dev] ... Some (?funny?) experimental results when removing part of the monolithic SI/HI instruction patterns :-) ...
Date: Mon, 7 Mar 2005 23:02:20 +0100
User-agent: KMail/1.7.1

Hi,

I'd like to report on some experiments on the AVR back-end that led to
results I think are sufficiently interesting and sufficiently funny to be
shared and discussed. This mail is not an urgent request for support, so
continue reading only if you have time and if you are willing and prepared
to see weird things :-) :

1.) Background story: In order to start splitting the instruction patterns,
I followed Denis' advice and initially chose the logical operations
(and/ior, etc.). For this purpose, I first took head 4.1.0 and cleaned up
the machine description by removing all of the SI and HI mode patterns for
ior, since I planned to replace them with define_insn_and_split patterns in
the next step.

2.) Changes in the machine description: Concerning ior, the only remaining 
pattern in the .md now is

(define_insn "iorqi3"
  [(set (match_operand:QI 0 "register_operand" "=r,d,r")
        (ior:QI (match_operand:QI 1 "register_operand" "%0,0,0")
                (match_operand:QI 2 "nonmemory_operand" "r,i,L")))]
  ""
  "@
        or %0,%2
        ori %0,lo8(%2)
        ; ori with 0"
  [(set_attr "length" "1,1,0")
   (set_attr "cc" "set_zn,set_zn,none")])

Moreover, I have now defined zero extension with a define_expand instead of
a define_insn:

(define_expand "zero_extendhisi2"
  [(set (subreg:HI (match_operand:SI 0 "register_operand" "=r") 0)
        (match_operand:HI 1 "register_operand" "r"))
   (set (subreg:QI (match_dup 0) 2)
        (const_int 0))
   (set (subreg:QI (match_dup 0) 3)
        (const_int 0))]
  ""
  ""
)
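
For readers less used to subreg notation, here is a minimal C sketch of what
this expander describes. It assumes that an SI value is modelled as four QI
bytes with byte 0 least significant (as on AVR); the type si_reg and the
helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an SI-mode register: four QI bytes,
   b[0] being the least significant one. */
typedef struct { uint8_t b[4]; } si_reg;

/* Zero-extend a HI value into an SI register, one subreg at a time. */
static si_reg zero_extend_hi_to_si(uint16_t src)
{
    si_reg dst;
    dst.b[0] = (uint8_t)(src & 0xFF);  /* (subreg:HI dst 0), low byte  */
    dst.b[1] = (uint8_t)(src >> 8);    /* (subreg:HI dst 0), high byte */
    dst.b[2] = 0;                      /* (subreg:QI dst 2) <- 0       */
    dst.b[3] = 0;                      /* (subreg:QI dst 3) <- 0       */
    return dst;
}

/* Reassemble the four bytes into a 32-bit value. */
static uint32_t si_value(si_reg r)
{
    return (uint32_t)r.b[0]
         | ((uint32_t)r.b[1] << 8)
         | ((uint32_t)r.b[2] << 16)
         | ((uint32_t)r.b[3] << 24);
}

/* Self-check: the byte-wise result equals an ordinary widening cast. */
void check_zero_extend(void)
{
    assert(si_value(zero_extend_hi_to_si(0xABCD)) == (uint32_t)0xABCD);
}
```

Because each byte is written by its own set, later passes can see that the
two upper bytes are the constant 0.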

3.) Test function: I then compiled the following code with and without the
changes, using -Os.


#include <stdint.h>
#include <avr/io.h>

uint32_t 
foo (uint32_t a, uint16_t b)
{ a |= b;
  a |= 255L*256L+16+8*256L*65536L;
  return a;
}


What comes out is the following:

Results with present head 4.1.0, with ior patterns for QI, HI and SI defined
in the machine description and standard zero extension:

.global foo
        .type   foo, @function
foo:
/* prologue: frame size=0 */
/* prologue end (size=0) */
        movw r26,r24
        movw r24,r22
        clr r22
        clr r23
        or r24,r20
        or r25,r21
        or r26,r22
        or r27,r23
        ori r24,lo8(134283024)
        ori r25,hi8(134283024)
        ori r27,hhi8(134283024)
        movw r22,r24
        movw r24,r26
/* epilogue: frame size=0 */
        ret

Namely, the code makes use of the usual zero-extension mechanism and
implements two SI-mode or operations.
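
The second SI-mode or needs only three ori instructions because one byte of
the constant is zero. A small C sketch that splits the constant into its
four bytes makes this visible; byte_of, foo_mask and check_foo_mask are
hypothetical helper names, and the sketch assumes it is compiled on any
host with at least 32-bit long.

```c
#include <assert.h>
#include <stdint.h>

/* Byte n of a 32-bit value, n = 0 being the least significant byte.
   This mirrors what the assembler operators lo8/hi8/hlo8/hhi8 select. */
static uint8_t byte_of(uint32_t v, unsigned n)
{
    return (uint8_t)(v >> (8 * n));
}

/* The constant exactly as it is written in foo(). */
static uint32_t foo_mask(void)
{
    return (uint32_t)(255L*256L + 16 + 8*256L*65536L);
}

/* Self-check against the value printed in the listing. */
void check_foo_mask(void)
{
    assert(foo_mask() == 134283024UL);   /* = 0x0800FF10            */
    assert(byte_of(foo_mask(), 0) == 0x10);  /* lo8:  ori r24        */
    assert(byte_of(foo_mask(), 1) == 0xFF);  /* hi8:  ori r25        */
    assert(byte_of(foo_mask(), 2) == 0x00);  /* hlo8: no ori for r26 */
    assert(byte_of(foo_mask(), 3) == 0x08);  /* hhi8: ori r27        */
}
```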

Results with present head 4.1.0 with ior patterns after deleting all HI and SI 
patterns in the machine description and with defining zero_extend to be a 
define_expand:

foo:
/* prologue: frame size=0 */
        push r16
        push r17
/* prologue end (size=2) */
        ldi r18,lo8(0)
        or r22,r20
        or r24,r18
        ori r22,lo8(16)
        ldi r23,lo8(-1)
        ori r25,lo8(8)
/* epilogue: frame size=0 */
        pop r17
        pop r16
        ret

Here, the code is kind of weird. It is visibly tighter, but ...
The first nice thing is that gcc realizes that r23 will be filled with 0xff
anyway, and that it is thus useless to perform the first or operation on it.
Also, the zero extension is almost completely optimized away, except for a
remnant QI-mode register r18 that is loaded with 0 and or-ed into r24, which
has no effect. I do not quite understand how this r18 expression comes
about. :-) ... And the third weird thing is that gcc seemingly loses track
of the registers in use completely, since it pushes and pops r16/r17 although
neither is touched ...
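
To convince oneself that the shorter listing is still correct, it can be
modelled byte by byte in C and compared with the test function. This is only
a sketch: it assumes the register assignment visible in the listings (a in
r22..r25 with r22 least significant, b in r20/r21), and the names foo_ref,
foo_bytewise and check_foo_bytewise are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the test function from above. */
static uint32_t foo_ref(uint32_t a, uint16_t b)
{
    a |= b;
    a |= 255L*256L + 16 + 8*256L*65536L;
    return a;
}

/* Byte-wise model of the second listing, one statement per instruction. */
static uint32_t foo_bytewise(uint32_t a, uint16_t b)
{
    uint8_t r20 = (uint8_t)b;          /* low byte of b; r21 is never read */
    uint8_t r22 = (uint8_t)a;
    uint8_t r23;                       /* incoming value is overwritten    */
    uint8_t r24 = (uint8_t)(a >> 16);
    uint8_t r25 = (uint8_t)(a >> 24);
    uint8_t r18;

    r18 = 0;         /* ldi r18,lo8(0)                 */
    r22 |= r20;      /* or  r22,r20                    */
    r24 |= r18;      /* or  r24,r18  (no effect)       */
    r22 |= 16;       /* ori r22,lo8(16)                */
    r23 = 0xFF;      /* ldi r23,lo8(-1)                */
    r25 |= 8;        /* ori r25,lo8(8)                 */

    return (uint32_t)r22 | ((uint32_t)r23 << 8)
         | ((uint32_t)r24 << 16) | ((uint32_t)r25 << 24);
}

/* Self-check on a few inputs; not an exhaustive proof. */
void check_foo_bytewise(void)
{
    assert(foo_bytewise(0, 0) == foo_ref(0, 0));
    assert(foo_bytewise(0x12345678UL, 0xABCD) == foo_ref(0x12345678UL, 0xABCD));
    assert(foo_bytewise(0xFFFFFFFFUL, 0xFFFF) == foo_ref(0xFFFFFFFFUL, 0xFFFF));
}
```

The model also shows why r21 never appears in the listing: byte 1 of the
result is 0xff regardless of a and b.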

The first conclusions I draw are:

1.) Removing some of the patterns does not necessarily result in worse code.
2.) There seems to be a persistent problem in the mechanism that keeps track
of the used and unused registers. I doubt whether this problem is also
present for the usual back-end *with* the SI/HI mode patterns.

I am currently running the test suite with this change, and so far I have
observed only a small number of regressions, seemingly due to problems with
"simplify_subregs".

Since I am not too familiar with the history of the back-end:

Is there a specific original reason why it was necessary to define the SI
and HI mode patterns for the logic operations? (I do understand that it was
necessary to define them for plus and minus, but why also for and/or/xor?)

Yours,

Björn