AW: [avr-gcc-list] optimizer / Compiler patch
From: Haase Bjoern (PT-BEU/MKP5) *
Subject: AW: [avr-gcc-list] optimizer / Compiler patch
Date: Thu, 25 Nov 2004 11:06:28 +0100
Hi,
I have had another look at the specific case where the zero extension uses
more registers than the task requires. For this special case, it is fairly
easy to tell the compiler to use a more efficient pattern, such that

    #include <stdint.h>

    extern uint8_t returnUInt8function (void);
    uint32_t target32_bit_variable;

    void testUInt8_to_32 (void)
    {
      target32_bit_variable = returnUInt8function ();
    }

is compiled to

    testUInt8_to_32:
        rcall returnUInt8function
        sts target32_bit_variable,r24
        sts (target32_bit_variable)+1,__zero_reg__
        sts (target32_bit_variable)+2,__zero_reg__
        sts (target32_bit_variable)+3,__zero_reg__
        ret

instead of

    testUInt8_to_32:
        rcall returnUInt8function
        clr r25
        clr r26
        clr r27
        sts target32_bit_variable,r24
        sts (target32_bit_variable)+1,r25
        sts (target32_bit_variable)+2,r26
        sts (target32_bit_variable)+3,r27
        ret

To achieve this, it is only necessary to add an instruction pattern of the
following type to "avr.md":

    (define_insn "*mov_MEMint32_REGuint8"
      [(set (match_operand:SI 0 "memory_operand" "")
            (zero_extend:SI (match_operand:QI 1 "register_operand" "")))]
      "CONSTANT_ADDRESS_P (XEXP (operands[0], 0))"
      "sts %A0,%A1
        sts %B0,__zero_reg__
        sts %C0,__zero_reg__
        sts %D0,__zero_reg__"
      ; four sts instructions, two words each
      [(set_attr "length" "8")])

The attached file also contains the corresponding patterns for sign extension
and for 16-bit target variables.
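
As an illustration of what such patterns have to store, here is a minimal host-side C sketch. It is not AVR code and not part of the patch, and the attached file itself is not reproduced here. The helper names store_extended, load_le and check_all_byte_values are made up for this example. It assumes the little-endian byte order that the sts sequences above imply (low byte at the lowest address). For zero extension the upper bytes are always 0x00; for sign extension they are 0xFF whenever bit 7 of the source byte is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the attached patch: stores an 8-bit
   value into a `width`-byte little-endian object one byte at a time.
   The upper bytes are 0x00 for zero extension and, for sign extension,
   0xFF when bit 7 of the value is set. */
static void store_extended(uint8_t *dest, size_t width, uint8_t value,
                           int sign_extend)
{
    uint8_t fill = (sign_extend && (value & 0x80u)) ? 0xFFu : 0x00u;

    dest[0] = value;
    for (size_t i = 1; i < width; i++)
        dest[i] = fill;
}

/* Reassembles a little-endian byte sequence (at most 4 bytes). */
static uint32_t load_le(const uint8_t *src, size_t width)
{
    uint32_t result = 0;

    for (size_t i = 0; i < width; i++)
        result |= (uint32_t)src[i] << (8 * i);
    return result;
}

/* Checks every 8-bit value against the conversions C itself performs. */
static void check_all_byte_values(void)
{
    for (int v = 0; v < 256; v++) {
        uint8_t u = (uint8_t)v;
        /* Two's-complement value of the same bit pattern, computed
           without an implementation-defined conversion. */
        int32_t s = (v < 128) ? v : v - 256;
        uint8_t buf[4];

        store_extended(buf, 4, u, 0);   /* uint8 -> 32 bit */
        assert(load_le(buf, 4) == (uint32_t)u);

        store_extended(buf, 4, u, 1);   /* int8 -> 32 bit */
        assert(load_le(buf, 4) == (uint32_t)s);

        store_extended(buf, 2, u, 0);   /* uint8 -> 16 bit */
        assert(load_le(buf, 2) == (uint16_t)u);

        store_extended(buf, 2, u, 1);   /* int8 -> 16 bit */
        assert(load_le(buf, 2) == (uint16_t)s);
    }
}
```

Calling check_all_byte_values() from a test program exercises all 256 input values for the four combinations (zero or sign extension, 16-bit or 32-bit target).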
A similar method could possibly work for other operations (additions, shifts,
...) involving global variables that are used so seldom that it is not
worthwhile to hold them in registers.
Implementing this in the same way as in the example above would, however,
require a great many additional patterns in the machine description, one for
each special case.
I would be willing to implement it. I would, however, appreciate a comment
from a more experienced gcc expert on the proper way to do it (i.e. a huge
"avr.md", or rather an implementation within "avr.c").
Yours,
Björn
P.S.:
BTW: In my own application, the pattern above did not show up a single time
;-). So it might be justified to consider
"target32_bit_variable = returnUInt8function();" a fairly rare case.
-----Original Message-----
From: Ben Mann [mailto:address@hidden
Sent: Wednesday, 24 November 2004 15:03
To: Haase Bjoern (PT-BEU/MKP5) *; 'Bernard Fouché';
address@hidden
Subject: RE: [avr-gcc-list] optimizer
I can understand that making this sort of change to the RTL compiler is a
challenge. Nevertheless, it seems that for embedded work this sort of thing
is going to be quite important (speed and size are always an issue...)
I realise this is not very helpful, but the best I could dream up so far was
a little macro to replace the compiler's casting:

    // optimally cast a char to a long
    // (wrapped in do { } while (0) so it expands to a single statement)
    #define CAST_CHAR2LONG(dest,src) \
        do { \
            *((char*)&(dest))   = (src); \
            *((char*)&(dest)+1) = 0; \
            *((char*)&(dest)+2) = 0; \
            *((char*)&(dest)+3) = 0; \
        } while (0)

    long var;
    ...
    CAST_CHAR2LONG(var,eeprom_read_byte((char *)ADDR));
    // replaces var = eeprom_read_byte((char *)ADDR)

The code generated is (as you might imagine) substantially tighter and works
for local or global "var". However, the syntax sucks. I wonder if there's a
better way?
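
One possible alternative to the macro, sketched here under assumptions, is a static inline function that performs the same four byte stores behind ordinary call syntax. The name store_u8_as_u32 is hypothetical. The sketch assumes a 4-byte destination and a little-endian layout as on AVR, so that byte 0 is the least significant byte. Whether avr-gcc generates code as tight as for the macro has not been checked here.

```c
#include <stdint.h>

/* Hypothetical helper: stores `src` in byte 0 of *dest and clears
   bytes 1..3. On a little-endian target such as AVR this has the
   same effect as *dest = src with zero extension. */
static inline void store_u8_as_u32(uint32_t *dest, uint8_t src)
{
    /* Accessing an object through unsigned char * is permitted
       by the C aliasing rules. */
    unsigned char *p = (unsigned char *)dest;

    p[0] = src;
    p[1] = 0;
    p[2] = 0;
    p[3] = 0;
}
```

With "uint32_t var;" declared, the call would read store_u8_as_u32(&var, eeprom_read_byte(...)); unlike the macro, the arguments are type-checked.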
Ben Mann
-----Original Message-----
From: Haase Bjoern (PT-BEU/MKP5) * [mailto:address@hidden
Sent: Wednesday, 24 November 2004 8:56 PM
To: address@hidden; Bernard Fouché; address@hidden
Subject: AW: [avr-gcc-list] optimizer
Hi,
--snip!--
IMHO the possible benefit of a 32 -> 4x8 splitting at the RTL level does not
really justify the required amount of changes in the compiler.
Björn
-----Original Message-----
From: address@hidden [mailto:address@hidden
On Behalf Of Bernard Fouché
Sent: Wednesday, 24 November 2004 7:18 PM
To: address@hidden
Subject: [avr-gcc-list] optimizer
Hi.
I'm compiling with -Os for atmega64 with avr-gcc 3.4.2. When I have

    uint32_t var;
    var=(uint32_t)function_returning_a_uint8_t();

the generated code is, for instance:

    var=(uint32_t)eeprom_read_byte((uint8_t *)EEPROM_PARM);
        ldi r24, 0x36 ; 54
        ldi r25, 0x00 ; 0
        call 0xf9c0
        eor r25, r25
        eor r26, r26
        eor r27, r27
        sts 0x046B, r24
        sts 0x046C, r25
        sts 0x046D, r26
        sts 0x046E, r27

Could it be instead:

        ldi r24, 0x36 ; 54
        ldi r25, 0x00 ; 0
        call 0xf9c0
        sts 0x046B, r24
        sts 0x046C, r1
        sts 0x046D, r1
        sts 0x046E, r1

That would spare 6 bytes...
Bernard
_______________________________________________
avr-gcc-list mailing list
address@hidden http://www.avr1.org/mailman/listinfo/avr-gcc-list