Re: [avr-libc-dev] Interested in 64-bit printf support?
From: George Spelvin
Subject: Re: [avr-libc-dev] Interested in 64-bit printf support?
Date: 6 Dec 2016 12:14:29 -0500
> Again, we can save code size by slightly slowing things down, e.g.
>
> mod5 (uint8_t x)
> {
> #if __AVR_ARCH__
> asm ("0: $ subi %0,%1 $ brcc 0b $ subi %0,%n1" : "+d" (x) : "n" (35));
> asm ("0: $ subi %0,%1 $ brcc 0b $ subi %0,%n1" : "+d" (x) : "n" (5));
> return x;
> #else
> ...
>
> The intermediate step via 35 is not essential, it's just a speed-up.
More detailed measurements...
The reduction loop is 3 instructions, and takes 3 + 3*loops cycles,
where "loops" counts how many times the brcc branch is taken.
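For anyone who wants to check that cycle count off-target, here is a minimal C model of one subi/brcc/subi reduction loop. It is only a sketch: the names reduce_result and reduce_loop are made up for illustration, and the per-instruction costs (subi 1 cycle, brcc 2 cycles when taken and 1 when not) are assumed from the usual AVR timings rather than measured.

```c
#include <stdint.h>

/* Hypothetical result type: remainder plus modelled cycle count. */
typedef struct {
    uint8_t remainder;
    unsigned cycles;
} reduce_result;

/* Models "0: subi x,n $ brcc 0b $ subi x,-n" for n in 1..255.
 * Assumed costs: subi = 1 cycle, brcc = 2 taken / 1 not taken. */
reduce_result reduce_loop(uint8_t x, uint8_t n)
{
    reduce_result r = { x, 0 };
    for (;;) {
        int borrow = r.remainder < n;           /* subi sets carry on borrow */
        r.remainder = (uint8_t)(r.remainder - n);
        r.cycles += 1;                          /* subi */
        if (borrow) {
            r.cycles += 1;                      /* brcc falls through */
            break;
        }
        r.cycles += 2;                          /* brcc taken, loop again */
    }
    r.remainder = (uint8_t)(r.remainder + n);   /* subi x,-n undoes the last step */
    r.cycles += 1;
    return r;
}
```

In this model reduce_loop(12, 5) takes the branch twice, leaves a remainder of 2 and counts 9 cycles, which matches 3 + 3*loops.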
My code for reducing mod 15 is 7 instructions and 7 cycles:
    mov __tmp_reg__,digit
    swap __tmp_reg__
    cbr digit,15
    add digit,__tmp_reg__ /* Add high halves to get carry bit */
    cbr digit,15
    swap digit
    adc digit,zero /* End-around carry */
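A rough C model of the seven instructions above, for checking the arithmetic on a PC. This is a sketch under my reading of the assembly, not a replacement for it; the function name fold_mod15 is hypothetical. Note that the result is only congruent to the input mod 15 and lies in 0..15, so a value of 15 can come out where a full reduction would give 0. That is harmless when a mod-5 loop follows.

```c
#include <stdint.h>

/* Hypothetical C model of the nibble fold with end-around carry. */
uint8_t fold_mod15(uint8_t digit)
{
    /* mov + swap: tmp holds digit with its nibbles exchanged */
    uint8_t tmp = (uint8_t)((digit << 4) | (digit >> 4));

    digit &= 0xF0;                          /* cbr digit,15 */

    /* add: the two high nibbles are summed; bit 8 models the carry flag */
    unsigned sum = (unsigned)digit + tmp;
    uint8_t carry = (uint8_t)(sum >> 8);

    digit = (uint8_t)(sum & 0xF0);          /* cbr digit,15 (carry untouched) */
    digit = (uint8_t)(digit >> 4);          /* swap: low nibble was already 0 */

    return (uint8_t)(digit + carry);        /* adc digit,zero */
}
```

For example, 0x64 (100) folds to 6 + 4 = 10, and 0x1F (31) folds to 16, which wraps to 0 plus the end-around carry, giving 1.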
So we have four code options:
1) Above code + mod-5 loop: 10 instructions, 17.835 cycles average
2) Mod-35 + mod-5 loops: 6 instructions, 24.282 cycles average
3) Mod-70 + mod-20 + mod-5 loops: 9 instructions, 20.718 cycles average
4) Mod-5 loop only: 3 instructions, 78.600 cycles average (ouch!)
The third option makes very little sense (I just wanted to measure it),
and the fourth is a little dear for my taste, but your suggestion (option 2)
costs 6.45 cycles per output digit relative to option 1, and saves 4 instructions.
Inspired by you, I saved one more instruction rather sneakily.
Rather than
    clr lsbit
5:  lsr lsbit
    adc
    rol lsbit
    dec tlen
    brne 5b
    add lsbit,digit
    add lsbit,digit
    add lsbit,'0'
    st out+,lsbit
I started with "ldi lsbit,'0'" and deleted the final add.
All the intermediate fiddling doesn't modify the high 7 bits of
the "lsbit" register, so I can load it right up front.
(I should think about renaming those variables.)
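The preload trick can be sanity-checked with a small C model. This is a sketch with assumptions: the helper names lsr_then_rol and finish_digit are hypothetical, and because the operands of the adc in the loop are not shown above, the model simply takes "whatever carry the middle of the loop leaves" as a parameter instead of guessing at it.

```c
#include <stdint.h>

/* One lsr/rol pair around an unspecified carry update.
 * new_carry stands in for the carry left by the instruction
 * between lsr and rol (an assumption of this model). */
uint8_t lsr_then_rol(uint8_t reg, unsigned new_carry)
{
    reg = (uint8_t)(reg >> 1);                      /* lsr: bit 0 goes to carry */
    reg = (uint8_t)((reg << 1) | (new_carry & 1u)); /* rol: carry comes back as bit 0 */
    return reg;                                     /* bits 7..1 are unchanged */
}

/* The two trailing adds; no final "add '0'" because the
 * register is assumed to have been preloaded with '0'. */
char finish_digit(uint8_t lsbit, uint8_t digit)
{
    lsbit = (uint8_t)(lsbit + digit);   /* add lsbit,digit */
    lsbit = (uint8_t)(lsbit + digit);   /* add lsbit,digit */
    return (char)lsbit;
}
```

Starting from '0' (0x30, bit 0 clear), a final carry of 1 and a mod-5 digit of 4 give '0' + 1 + 8, which is '9'.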