emacs-devel
Re: Native compilation - specific optimisation surely possible?


From: Andrea Corallo
Subject: Re: Native compilation - specific optimisation surely possible?
Date: Sun, 02 Jan 2022 22:27:37 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Alan Mackenzie <acm@muc.de> writes:

> Hello, Emacs.
>
> The following very short function:
>
>
> ;; -*- lexical-binding: t -*-
> (defun comp-test-55 (x)
>   (unless (integerp x)
>     x))
>
>
> byte compiles to:
>
>
> byte code for comp-test-55:
>   doc:   ...
>     args: (arg1)
>     0       dup
>     1       integerp
>     2       not
>     3       goto-if-nil-else-pop 1
>     6       dup
>     7:1     return
>
>
> , then on an amd-64 machine, native compiles to (annotation added by
> me):
>
>
>
> 00000000000012c0 <F636f6d702d746573742d3535_comp_test_55_0>:
> Setup of the function:
>     12c0:     55                      push   %rbp
>     12c1:     53                      push   %rbx
>     12c2:     48 89 fb                mov    %rdi,%rbx
>     12c5:     48 83 ec 08             sub    $0x8,%rsp
>     12c9:     48 8b 05 18 2d 00 00    mov    0x2d18(%rip),%rax        # 3fe8 <freloc_link_table@@Base-0x240>
>     12d0:     48 8b 28                mov    (%rax),%rbp
> fixnump:
>     12d3:     8d 47 fe                lea    -0x2(%rdi),%eax
>     12d6:     a8 03                   test   $0x3,%al
>     12d8:     75 26                   jne    1300 <F636f6d702d746573742d3535_comp_test_55_0+0x40>
>
>     12da:     48 8b 05 ff 2c 00 00    mov    0x2cff(%rip),%rax        # 3fe0 <d_reloc@@Base-0x220>
>     12e1:     48 8b 78 10             mov    0x10(%rax),%rdi
> Nil in %rdi?:
>     12e5:     31 f6                   xor    %esi,%esi
>     12e7:     ff 95 c0 27 00 00       call   *0x27c0(%rbp)      `eq' <========================
>     12ed:     48 85 c0                test   %rax,%rax
>     12f0:     48 0f 45 c3             cmovne %rbx,%rax
> Tear down of the function:
>     12f4:     48 83 c4 08             add    $0x8,%rsp
>     12f8:     5b                      pop    %rbx
>     12f9:     5d                      pop    %rbp
>     12fa:     c3                      ret    
>     12fb:     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> bignump:
>     1300:     8d 47 fb                lea    -0x5(%rdi),%eax
>     1303:     a8 07                   test   $0x7,%al
>     1305:     74 09                   je     1310 <F636f6d702d746573742d3535_comp_test_55_0+0x50>
>
>     1307:     31 ff                   xor    %edi,%edi
>     1309:     eb da                   jmp    12e5 <F636f6d702d746573742d3535_comp_test_55_0+0x25>
>     130b:     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> pseudovectorp:
>     1310:     be 02 00 00 00          mov    $0x2,%esi
>     1315:     ff 55 08                call   *0x8(%rbp)
>     1318:     84 c0                   test   %al,%al
>     131a:     75 be                   jne    12da <F636f6d702d746573742d3535_comp_test_55_0+0x1a>
>     131c:     31 ff                   xor    %edi,%edi
>     131e:     eb c5                   jmp    12e5 <F636f6d702d746573742d3535_comp_test_55_0+0x25>
>
> The input parameter x (or arg1) is passed into the function in the
> register %rdi.  integerp is coded successively as fixnump followed (if
> necessary) by bignump.  The fixnump is coded beautifully in three
> instructions.
>
> I don't understand what's happening at 12da.  It seems that the address
> of a stack pointer is being loaded into %rax, from which the result of
> `fixnump' (which was already in %rax) is loaded into %rdi.  
>
> But my main point is the compilation of the `not' instruction at 12e5.
> The operand to `not' is in %rdi.  It is coded up as (eq %rdi nil) by
> loading 0 (nil) into %rsi at 12e5, then making a function call to `eq'
> at 12e7.
>
> Surely the overhead of the function call for `eq' makes this a candidate
> for optimisation?  `not' could be coded up in two instructions (test
> %rdi,%rdi followed by a conditional jump or (faster) the cmovne which is
> already there).
>
> `not' is presumably a common opcode in byte compiled functions.  `eq'
> surely more so.  So why are we coding these up as function calls?
>
> Andrea?

Hi Alan,

Could you attach the .c file produced when compiling with
`native-comp-debug' set to 2 or higher?
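
In case it helps, here is a minimal sketch of one way to produce that
file.  The helper name `my/native-compile-with-debug' and the file name
in the usage comment are placeholders, and I am assuming the pseudo C
dump is written next to the resulting .eln with a .c extension, so
please check the directory if it is not there:

```elisp
;; -*- lexical-binding: t -*-
(require 'comp)  ; defines `native-comp-debug' as a special variable

(defun my/native-compile-with-debug (file)
  "Natively compile FILE with `native-comp-debug' bound to 2.
Return the name of the pseudo C file expected next to the .eln.
Hypothetical helper; the location of the dump is an assumption."
  (let* ((native-comp-debug 2)           ; >= 2 asks for the pseudo C dump
         (eln (native-compile file)))    ; for a file, returns the .eln name
    (concat (file-name-sans-extension eln) ".c")))

;; Usage (the file name is a placeholder):
;; (my/native-compile-with-debug "/path/to/comp-test-55.el")
```

Binding the variable with `let' keeps the debug level local to this one
compilation; setting it globally with `setq' before compiling should
work as well.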

Thanks

  Andrea

PS I might be a little slow answering mails for the coming week as I'm
on holiday :)


