avr-libc-dev

Re: [avr-libc-dev] User-manual/optimization.html


From: Georg-Johann Lay
Subject: Re: [avr-libc-dev] User-manual/optimization.html
Date: Fri, 19 Jun 2015 17:13:51 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

On 06/18/2015 at 02:58 PM, David Brown wrote:
Hi,

In the user manual:

<http://www.nongnu.org/avr-libc/user-manual/optimization.html>

there is a discussion about the unexpected code generation from:

#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
unsigned int ivar;
void test2( unsigned int val )
{
        val = 65535U / val;
        cli();
        ivar = val;
        sei();
}

This came up recently in a gcc-help mailing list question - the problem
is that the call to __udivmodhi4 may be generated after the cli
instruction, disabling interrupts for longer than necessary.  The web
page says there is no way to force the desired code generation (with
"val" being calculated before "cli").

If I recall correctly, -fno-tree-ter was a fix, as the code motion was performed by that pass (TER, temporary expression replacement).

Some technical background: the avr back end pretends to implement integer division and remainder by providing the corresponding insns, so the middle end assumes that a division can be performed with a few instructions.

The rationale is that avr-libgcc contains many hand-written, hand-optimized assembler routines, and many of these routines have a smaller register footprint than the ABI requires. avr-gcc uses this information to implement such features (like division) as a transparent library call: it marks all destroyed registers as clobbered and places the arguments in the required registers by hand.

This results in much smaller code, and many functions become leaf functions. Without that approach, any function using a feature as basic as integer multiplication would generate "proper" library calls, just like calls to ordinary functions.

If division were an ordinary library call, it wouldn't be moved across the memory clobber, but code size would increase considerably.


However, there /is/ a way to get the right results - using a fake
assembly input to force the calculation:

#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
unsigned int ivar;
void test2( unsigned int val )
{
     val = 65535U / val;
     asm volatile("" :: "" (val));
     cli();
     ivar = val;
     sei();
}

The memory clobber on cli() and sei() ensures that no memory operations
are moved before or after those statements.  But as already noted, the
memory clobber does not affect non-memory operations such as
calculations or register-only manipulation.
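
The fake-input trick above could be wrapped in a small helper. This is only a sketch: the macro name FORCE_EVAL is hypothetical, it uses an "r" constraint rather than the empty constraint quoted above, and the non-AVR fallback for cli()/sei() is an assumption added purely so the snippet also builds with a host GCC.

```c
#ifdef __AVR__
#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
#else
/* Assumption: on non-AVR hosts, plain compiler barriers stand in for cli/sei. */
#define cli() __asm volatile( "" ::: "memory" )
#define sei() __asm volatile( "" ::: "memory" )
#endif

/* Hypothetical helper: an empty volatile asm that takes x as a register
 * input, so the value of x has to be available at this point. */
#define FORCE_EVAL(x) __asm volatile( "" :: "r" (x) )

unsigned int ivar;

void test2( unsigned int val )
{
        val = 65535U / val;
        FORCE_EVAL(val);        /* division result is needed here, before cli */
        cli();
        ivar = val;
        sei();
}
```

Because both the helper and cli() are volatile asm statements, the compiler keeps them in order, and the input operand ties the division result to the first of them.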

The problem is that one has to know the respective dependencies, which is usually not the case. Just consider the case where the cli() is part of an inlined function and the division or multiplication is performed by the caller, or where the multiplication is part of an address computation like in val = list->next->next->next->val.
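
A minimal sketch of the inlined case, to make the concern concrete. The names store_ivar and test3 are hypothetical, and the non-AVR fallback for cli()/sei() is an assumption so that the snippet builds with a host GCC; whether the division actually ends up after the cli depends on the compiler version and options.

```c
#ifdef __AVR__
#define cli() __asm volatile( "cli" ::: "memory" )
#define sei() __asm volatile( "sei" ::: "memory" )
#else
/* Assumption: on non-AVR hosts, plain compiler barriers stand in for cli/sei. */
#define cli() __asm volatile( "" ::: "memory" )
#define sei() __asm volatile( "" ::: "memory" )
#endif

unsigned int ivar;

/* Hypothetical helper: the critical section is hidden in an inline function. */
static inline void store_ivar( unsigned int v )
{
        cli();
        ivar = v;
        sei();
}

void test3( unsigned int val )
{
        /* After inlining, the division is a register-only computation, so on
         * AVR it may be placed after the cli inside store_ivar(). Nothing at
         * this call site suggests that a dependency has to be forced. */
        store_ivar( 65535U / val );
}
```

The author of test3() sees neither the cli() nor any reason to add a fake asm input, which is why a per-call-site workaround does not scale.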


My recommendation is to try -fno-tree-ter before cluttering up code with ugly patterns.

Johann



