From: GitHub
Subject: [Qemu-commits] [qemu/qemu] 1639a9: target/nios2: Fix 64-bit ilp32 compilation
Date: Tue, 06 Jun 2017 01:56:05 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 1639a965d30b45b10134b69bf49dd3e657d2ef09
      
https://github.com/qemu/qemu/commit/1639a965d30b45b10134b69bf49dd3e657d2ef09
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/nios2/translate.c

  Log Message:
  -----------
  target/nios2: Fix 64-bit ilp32 compilation

Avoid a "cast from pointer to integer of different size" warning
by using the proper host type.
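
As a generic illustration of this class of warning (this message does not show the nios2 change itself): on a 64-bit ILP32 host a pointer is 32 bits wide while a register-sized integer is 64 bits, so casting a pointer straight to that integer type draws the diagnostic. Casting through `uintptr_t`, which always has the host pointer width, avoids it. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * Pack a small index (0..3) into the low bits of a 4-byte-aligned pointer.
 * uintptr_t matches the host pointer width, so the cast is warning-free
 * on LP64 and ILP32 hosts alike.
 */
static inline uintptr_t ptr_with_index(const void *p, unsigned n)
{
    assert(((uintptr_t)p & 3u) == 0);   /* pointer must be 4-byte aligned */
    assert(n < 4);                      /* index must fit in two bits */
    return (uintptr_t)p + n;
}

/* Recover the original pointer by clearing the two index bits. */
static inline void *ptr_without_index(uintptr_t v)
{
    return (void *)(v & ~(uintptr_t)3);
}
```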

Reviewed-by: Philippe Mathieu-Daudé <address@hidden>
Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: f1079bb8f9a28495937408acc60a681a7b2536a8
      
https://github.com/qemu/qemu/commit/f1079bb8f9a28495937408acc60a681a7b2536a8
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M configure

  Log Message:
  -----------
  tcg/sparc: Use the proper compilation flags for 32-bit

We have required a v9 cpu since 9b9c37c36439ee0452632253dac7a31897f27f70.
However, the flags we were using did not reliably enable v8plus, which
meant that the compiler didn't know it could inline 64-bit atomics.
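
To make "inline 64-bit atomics" concrete, here is a minimal sketch using the GCC/Clang `__atomic` builtins. `__atomic_always_lock_free` is a compile-time query; when the selected target flags do not advertise 64-bit capability, the compiler answers false for 8-byte objects and may emit library calls instead of inline instructions. The counter and function names are hypothetical and are not taken from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter, for illustration only. */
static uint64_t counter;

/* True if the compiler can always inline 8-byte atomics for this target. */
static inline bool have_inline_atomic64(void)
{
    return __atomic_always_lock_free(sizeof(uint64_t), 0);
}

/* Atomically add n to the counter and return the previous value. */
static inline uint64_t counter_fetch_add(uint64_t n)
{
    return __atomic_fetch_add(&counter, n, __ATOMIC_SEQ_CST);
}

/* Example use: two additions leave the counter three above its start. */
static inline void counter_demo(void)
{
    uint64_t start = counter_fetch_add(1);
    counter_fetch_add(2);
    assert(__atomic_load_n(&counter, __ATOMIC_SEQ_CST) == start + 3);
}
```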

Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 374aae653499f4d405caf32b7fff0c8639113fe4
      
https://github.com/qemu/qemu/commit/374aae653499f4d405caf32b7fff0c8639113fe4
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M include/qemu/atomic.h

  Log Message:
  -----------
  qemu/atomic: Loosen restrictions for 64-bit ILP32 hosts

We need to coordinate with the TCG_OVERSIZED_GUEST test in cputlb.c,
and allow 64-bit atomics even though sizeof(void *) == 4.
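
One way to picture the restriction being loosened is a compile-time limit on the size of atomic operands. The sketch below is an assumption about the general shape, not the contents of atomic.h: the macro names are hypothetical, the list of host macros is illustrative, and Sparc is included only on the premise (from the previous commit) that a v9 cpu is always required. It relies on GCC/Clang statement expressions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical size limit for atomic operands (illustrative only).
 * Normally the pointer size, but a 64-bit host running an ILP32 ABI
 * still has 64-bit registers, so 8 bytes is allowed there even
 * though sizeof(void *) == 4.
 */
#if defined(__x86_64__) || defined(__sparc__)
# define MAX_ATOMIC_SIZE 8
#else
# define MAX_ATOMIC_SIZE sizeof(void *)
#endif

/* Reject oversized operands at compile time, then perform the load. */
#define CHECKED_ATOMIC_LOAD(ptr)                                      \
    ({                                                                \
        static_assert(sizeof(*(ptr)) <= MAX_ATOMIC_SIZE,              \
                      "atomic operand too large for this host");      \
        __atomic_load_n((ptr), __ATOMIC_RELAXED);                     \
    })
```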

Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: cedbcb01529cb6cf9a2289cdbebbc63f6149fc18
      
https://github.com/qemu/qemu/commit/cedbcb01529cb6cf9a2289cdbebbc63f6149fc18
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M cpu-exec.c
    M include/exec/exec-all.h
    M tcg-runtime.c
    M tcg/README
    M tcg/aarch64/tcg-target.h
    M tcg/arm/tcg-target.h
    M tcg/i386/tcg-target.h
    M tcg/ia64/tcg-target.h
    M tcg/mips/tcg-target.h
    M tcg/ppc/tcg-target.h
    M tcg/s390/tcg-target.h
    M tcg/sparc/tcg-target.h
    M tcg/tcg-op.c
    M tcg/tcg-op.h
    M tcg/tcg-opc.h
    M tcg/tcg-runtime.h
    M tcg/tcg.c
    M tcg/tcg.h
    M tcg/tci/tcg-target.h

  Log Message:
  -----------
  tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr

Instead of exporting goto_ptr directly to TCG frontends, export
tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
returned by the lookup_tb_ptr() helper. This is the only use case
we have for goto_ptr and lookup_tb_ptr, so having this function is
very convenient. Furthermore, it trivially allows us to avoid calling
the lookup helper if goto_ptr is not implemented by the backend.
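
The decision described above can be modelled in a few lines. This is a toy model, not QEMU code: the op names, the `Emitter` type and the `backend_has_goto_ptr` flag are hypothetical stand-ins for the TCG op stream and the per-backend capability, and the fallback of exiting to the exec loop is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical op kinds standing in for entries of the TCG op stream. */
typedef enum { OP_CALL_LOOKUP_TB_PTR, OP_GOTO_PTR, OP_EXIT_TB } Op;

/* Hypothetical fixed-size op buffer. */
typedef struct {
    Op ops[8];
    size_t n;
} Emitter;

static void emit(Emitter *e, Op op)
{
    assert(e->n < sizeof(e->ops) / sizeof(e->ops[0]));
    e->ops[e->n++] = op;
}

/*
 * Model of the combined operation: call the lookup helper and jump to
 * the pointer it returns, or (assumed fallback) exit to the exec loop
 * when the backend lacks goto_ptr, without paying for the lookup call.
 */
static void gen_lookup_and_goto_ptr(Emitter *e, bool backend_has_goto_ptr)
{
    if (backend_has_goto_ptr) {
        emit(e, OP_CALL_LOOKUP_TB_PTR);
        emit(e, OP_GOTO_PTR);
    } else {
        emit(e, OP_EXIT_TB);
    }
}
```

Keeping the capability check in one place means a frontend calls a single function and never needs to know whether the backend implements the opcode.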

Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
Message-Id: <address@hidden>
Message-Id: <address@hidden>
Message-Id: <address@hidden>
[rth: Squashed 4 related commits.]
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 5cb4ef80f65252dd85b86fa7f3c985015423d670
      
https://github.com/qemu/qemu/commit/5cb4ef80f65252dd85b86fa7f3c985015423d670
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/i386/tcg-target.h
    M tcg/i386/tcg-target.inc.c

  Log Message:
  -----------
  tcg/i386: implement goto_ptr

Suggested-by: Richard Henderson <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
[rth: Reuse goto_ptr epilogue for exit_tb 0.]
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 7ad55b4ffd982c80f26f7f3658138d94cdc678e8
      
https://github.com/qemu/qemu/commit/7ad55b4ffd982c80f26f7f3658138d94cdc678e8
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/arm/translate.c

  Log Message:
  -----------
  target/arm: optimize cross-page direct jumps in softmmu

Instead of unconditionally exiting to the exec loop, use the
lookup_and_goto_ptr helper to jump to the target if it is valid.
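
"Jump to the target if it is valid" can be sketched as a lookup that either returns the host code of a matching translated block or a stub that returns control to the exec loop. Everything below is a simplified, hypothetical model: the types, names and the direct-mapped cache are assumptions, and a real lookup would presumably compare more CPU state than the guest pc alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_CACHE_SIZE 16u   /* hypothetical size, power of two */

/* Hypothetical translated block: guest pc plus host code address. */
typedef struct {
    uint64_t pc;
    const void *host_code;
} Block;

static const Block *jmp_cache[JMP_CACHE_SIZE];
static const char exec_loop_stub = 0;   /* stands in for "return to exec loop" */

static void cache_insert(const Block *b)
{
    assert(b != NULL);
    jmp_cache[b->pc & (JMP_CACHE_SIZE - 1)] = b;
}

/*
 * Return where generated code should jump for guest address pc:
 * the cached block's host code if the entry really is for pc,
 * otherwise the stub that hands control back to the exec loop.
 */
static const void *lookup_target(uint64_t pc)
{
    const Block *b = jmp_cache[pc & (JMP_CACHE_SIZE - 1)];
    if (b != NULL && b->pc == pc) {
        return b->host_code;
    }
    return &exec_loop_stub;
}
```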

Perf impact: see next commit's log.

Reviewed-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 8a6b28c7b5104263344508df0f4bce97f22cfcaf
      
https://github.com/qemu/qemu/commit/8a6b28c7b5104263344508df0f4bce97f22cfcaf
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/arm/translate.c
    M target/arm/translate.h

  Log Message:
  -----------
  target/arm: optimize indirect branches

Speed up indirect branches by jumping to the target if it is valid.

Softmmu measurements (see later commit for user-mode results):

Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.

- Impact on Boot time

| setup  | ARM debian jessie boot+shutdown time | stddev |
|--------+--------------------------------------+--------|
| v2.9.0 |                                 8.84 |   0.07 |
| +cross |                                 8.85 |   0.03 |
| +jr    |                                 8.83 |   0.06 |

- NBench, arm-softmmu (debian jessie guest). Host: Intel i7-4790K @ 4.00GHz

  [Bar chart: speedup over QEMU v2.9.0 for ASSIGNMENT, BITFIELD, FOURIER,
  FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT,
  STRING SORT and hmean; series 'cross' and 'cross+jr'; y-axis from 0.95x
  to 1.3x.]
  png: http://imgur.com/eOLmZNR

NB. 'cross' represents the previous commit.

Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
[rth: Replace gen_jr global variable with DISAS_EXIT state.]
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 1ebb1af1b8068fca36f48f738eb7146ecdf03625
      
https://github.com/qemu/qemu/commit/1ebb1af1b8068fca36f48f738eb7146ecdf03625
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/i386/translate.c

  Log Message:
  -----------
  target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr

This helper will be used by subsequent changes.

Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: fe62089563ffc6a42f16ff28a6b6be34d2697766
      
https://github.com/qemu/qemu/commit/fe62089563ffc6a42f16ff28a6b6be34d2697766
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/i386/translate.c

  Log Message:
  -----------
  target/i386: optimize cross-page direct jumps in softmmu

Instead of unconditionally exiting to the exec loop, use the
gen_jr helper to jump to the target if it is valid.

Perf impact: see next commit's log.

Reviewed-by: Richard Henderson <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: b4aa297781ceddef79deb0e99da7817551fa89f8
      
https://github.com/qemu/qemu/commit/b4aa297781ceddef79deb0e99da7817551fa89f8
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/i386/translate.c

  Log Message:
  -----------
  target/i386: optimize indirect branches

Speed up indirect branches by jumping to the target if it is valid.

Softmmu measurements (see later commit for user-mode numbers):

Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.

- SPECint06 (test set), x86_64-softmmu (Ubuntu 16.04 guest). Host: Intel i7-4790K @ 4.00GHz

  [Bar chart: speedup over QEMU v2.9.0 for astar, bzip2, gcc, gobmk,
  h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk
  and hmean; series 'cross' and 'cross+jr'; y-axis from 0.8x to 2.4x.]
  png: http://imgur.com/DU36YFU

NB. 'cross' represents the previous commit.

Reviewed-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 6f1653180f5701c6a8f1b35b89a80b1e3260928e
      
https://github.com/qemu/qemu/commit/6f1653180f5701c6a8f1b35b89a80b1e3260928e
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M include/exec/tb-hash.h

  Log Message:
  -----------
  tb-hash: improve tb_jmp_cache hash function in user mode

Optimizations to cross-page chaining and indirect branches make
performance more sensitive to the hit rate of tb_jmp_cache.
The constraint of reserving some bits for the page number
lowers the achievable quality of the hashing function.

However, user-mode does not have this requirement. Thus,
with this change we use for user-mode a hashing function that
is both faster and of better quality than the previous one.
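
To illustrate the constraint, the sketch below contrasts a hash whose upper index bits are reserved for the page number with one that is free to mix all address bits. Both functions are simplified illustrations with assumed constants and hypothetical names; neither is claimed to be the code in tb-hash.h.

```c
#include <assert.h>
#include <stdint.h>

/* All constants and names here are illustrative assumptions. */
#define CACHE_BITS    12u
#define CACHE_SIZE    (1u << CACHE_BITS)
#define PAGE_BITS     12u                  /* assumed 4 KiB guest pages */
#define PAGE_IDX_BITS (CACHE_BITS / 2)     /* index bits taken from the page number */
#define ADDR_IDX_BITS (CACHE_BITS - PAGE_IDX_BITS)

static_assert(ADDR_IDX_BITS <= PAGE_BITS, "offset bits must fit in a page");

/* Constrained: upper index bits from the page number, lower from the offset. */
static inline unsigned hash_with_page_bits(uint64_t pc)
{
    unsigned page = (unsigned)(pc >> PAGE_BITS) & ((1u << PAGE_IDX_BITS) - 1);
    unsigned off  = (unsigned)pc & ((1u << ADDR_IDX_BITS) - 1);
    return (page << ADDR_IDX_BITS) | off;
}

/* Unconstrained: fold the high bits into the low bits, then mask. */
static inline unsigned hash_folded(uint64_t pc)
{
    return (unsigned)(pc ^ (pc >> CACHE_BITS)) & (CACHE_SIZE - 1);
}
```

With these assumed constants, the addresses 0x1004 and 0x1044 lie in the same page and collide under the constrained hash, because only six offset bits reach the index, while the folded hash gives them different slots.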

Measurements:

Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.

- SPECint06 (test set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz

  [Bar chart: speedup over QEMU v2.9.0 for astar, bzip2, gcc, gobmk,
  h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk
  and hmean; series 'jr', 'jr+multhash' and 'jr+hash'; y-axis from 0.8x
  to 2.2x.]
  png: http://imgur.com/4UXTrEc

Here I also tried the hash function suggested by Paolo ("multhash"):

  return ((uint64_t) (pc * 2654435761) >> 32) & (TB_JMP_CACHE_SIZE - 1);

As you can see it is just as good as the other new function ("hash"),
which is what I ended up going with.
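
For readers who want to experiment with "multhash", here is a self-contained version of the quoted expression. It assumes a 64-bit `pc` and a cache of 2^12 entries; both are assumptions made for the sketch, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TB_JMP_CACHE_BITS 12                        /* assumed value */
#define TB_JMP_CACHE_SIZE (1u << TB_JMP_CACHE_BITS)

static_assert(TB_JMP_CACHE_BITS <= 32, "index is taken from bits 32 and up");

/*
 * Multiplicative hash as quoted above, with pc assumed to be 64 bits wide.
 * The constant is close to 2^32 divided by the golden ratio; the product
 * is shifted right by 32 and masked down to a cache index.
 */
static inline unsigned multhash(uint64_t pc)
{
    return (unsigned)((pc * UINT64_C(2654435761)) >> 32)
           & (TB_JMP_CACHE_SIZE - 1);
}
```

Because the index is taken from bit 32 upward, a small input such as 1 maps to slot 0: the product stays below 2^32.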

- SPECint06 (train set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz

  [Bar chart: speedup over QEMU v2.9.0 for astar, bzip2, gcc, gobmk,
  h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk
  and hmean; series 'jr' and 'jr+hash'; y-axis from 1x to 2.6x.]
  png: http://imgur.com/ArCbHqo

- NBench, x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz

  [Bar chart: speedup over QEMU v2.9.0 for ASSIGNMENT, BITFIELD, FOURIER,
  FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT,
  STRING SORT and hmean; series 'jr' and 'jr+hash'; y-axis from 0.96x
  to 1.12x.]
  png: http://imgur.com/ZXFX0hJ

- NBench, arm-linux-user. Host: Intel i7-4790K @ 4.00GHz

  [Bar chart: speedup over QEMU v2.9.0 for ASSIGNMENT, BITFIELD, FOURIER,
  FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT,
  STRING SORT and hmean; series 'jr' and 'jr+hash'; y-axis from 0.9x
  to 1.3x.]
  png: http://imgur.com/FfD27ey

Reviewed-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 0c240785a843f70df5116cea2652c6c042ade36b
      
https://github.com/qemu/qemu/commit/0c240785a843f70df5116cea2652c6c042ade36b
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/ppc/tcg-target.h
    M tcg/ppc/tcg-target.inc.c

  Log Message:
  -----------
  tcg/ppc: Implement goto_ptr

Signed-off-by: Richard Henderson <address@hidden>


  Commit: b19f0c2e7d344d4d62daf554951acdb6c94a34b0
      
https://github.com/qemu/qemu/commit/b19f0c2e7d344d4d62daf554951acdb6c94a34b0
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/aarch64/tcg-target.h
    M tcg/aarch64/tcg-target.inc.c

  Log Message:
  -----------
  tcg/aarch64: Implement goto_ptr

Measurements:
SPECint06 (test set), x86_64-linux-user. Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz

  [Bar chart: speedup for astar, bzip2, gcc, gobmk, h264ref, hmmer,
  libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk and hmean;
  series '+goto-ptr'; y-axis from 1x to 1.45x.]
  png: http://imgur.com/en9HE8L

Tested-by: Emilio G. Cota <address@hidden>
Reviewed-by: Aurelien Jarno <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 38f81dc5938fb7025531c5ed602afd41fef799a7
      
https://github.com/qemu/qemu/commit/38f81dc5938fb7025531c5ed602afd41fef799a7
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/sparc/tcg-target.h
    M tcg/sparc/tcg-target.inc.c

  Log Message:
  -----------
  tcg/sparc: Implement goto_ptr

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 46644483cae978c734460131bb1d9071f813b287
      
https://github.com/qemu/qemu/commit/46644483cae978c734460131bb1d9071f813b287
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/s390/tcg-target.h
    M tcg/s390/tcg-target.inc.c

  Log Message:
  -----------
  tcg/s390: Implement goto_ptr

Tested-by: Aurelien Jarno <address@hidden>
Reviewed-by: Aurelien Jarno <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 702a947484eb3e615183dafc93de590ab0679f60
      
https://github.com/qemu/qemu/commit/702a947484eb3e615183dafc93de590ab0679f60
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/arm/tcg-target.inc.c

  Log Message:
  -----------
  tcg/arm: Clarify tcg_out_bx for arm4 host

In theory this would re-enable usage of QEMU on an armv4 host.
Whether this is worthwhile is debatable -- we've been unconditionally
issuing the armv5t BX instruction in the prologue since 2011 without
complaint.  Possibly we should simply require an armv6 host.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 085c648bef7301eabe7d4a3301c8d012ae4423b8
      
https://github.com/qemu/qemu/commit/085c648bef7301eabe7d4a3301c8d012ae4423b8
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/arm/tcg-target.h
    M tcg/arm/tcg-target.inc.c

  Log Message:
  -----------
  tcg/arm: Implement goto_ptr

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 5786e0683c4f8170dd05a550814b8809d8ae6d86
      
https://github.com/qemu/qemu/commit/5786e0683c4f8170dd05a550814b8809d8ae6d86
  Author: Aurelien Jarno <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M tcg/mips/tcg-target.h
    M tcg/mips/tcg-target.inc.c

  Log Message:
  -----------
  tcg/mips: implement goto_ptr

Reviewed-by: Philippe Mathieu-Daudé <address@hidden>
Signed-off-by: Aurelien Jarno <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 6350001e831defb5ede5337baaa7dc4a730c1508
      
https://github.com/qemu/qemu/commit/6350001e831defb5ede5337baaa7dc4a730c1508
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/s390x/translate.c

  Log Message:
  -----------
  target/s390: Use tcg_gen_lookup_and_goto_ptr

Tested-by: Aurelien Jarno <address@hidden>
Reviewed-by: Aurelien Jarno <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 4137cb83fa24eb81c4468eddf717d5257bfdfe5a
      
https://github.com/qemu/qemu/commit/4137cb83fa24eb81c4468eddf717d5257bfdfe5a
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/hppa/translate.c

  Log Message:
  -----------
  target/hppa: Use tcg_gen_lookup_and_goto_ptr

Signed-off-by: Richard Henderson <address@hidden>


  Commit: e78722368c721f3c5b8109ed525adac1653ae97b
      
https://github.com/qemu/qemu/commit/e78722368c721f3c5b8109ed525adac1653ae97b
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/arm/translate-a64.c

  Log Message:
  -----------
  target/aarch64: optimize cross-page direct jumps in softmmu

Perf numbers in next commit's log.

Signed-off-by: Emilio G. Cota <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: e75449a346bf558296966a44277bfd93412c6da6
      
https://github.com/qemu/qemu/commit/e75449a346bf558296966a44277bfd93412c6da6
  Author: Emilio G. Cota <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/arm/translate-a64.c

  Log Message:
  -----------
  target/aarch64: optimize indirect branches

Measurements:

[Baseline performance is that before applying this and the previous commit]

- NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz

[ASCII bar chart not reproduced. Series: cross, cross+jr. Y axis:
speedup over baseline, 0.9x to 1.7x. X axis, as far as the overlapping
labels are legible: ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION,
HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT, hmean.]
  png: http://imgur.com/qO9ubtk
NB. cross here represents the previous commit.

- SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

[ASCII bar chart not reproduced. Series: jr. Y axis: speedup over
baseline, 0.9x to 1.5x. X axis: astar, bzip2, gcc, gobmk, h264ref,
hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean.]
  png: http://imgur.com/3Dp4vvq

- SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

[ASCII bar chart not reproduced. Series: jr. Y axis: speedup over
baseline, 1x to 1.7x. X axis: astar, bzip2, gcc, gobmk, h264ref,
hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean.]
  png: http://imgur.com/vRrdc9j

Signed-off-by: Emilio G. Cota <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: d9a9acde64b862107933f9e9a01435e51bf8f91b
      
https://github.com/qemu/qemu/commit/d9a9acde64b862107933f9e9a01435e51bf8f91b
  Author: Aurelien Jarno <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/mips/translate.c

  Log Message:
  -----------
  target/mips: optimize cross-page direct jumps in softmmu

Cc: Yongbok Kim <address@hidden>
Signed-off-by: Aurelien Jarno <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: e350d8ca3ac7e31c6af71a4ab74d2442dfefc697
      
https://github.com/qemu/qemu/commit/e350d8ca3ac7e31c6af71a4ab74d2442dfefc697
  Author: Aurelien Jarno <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/mips/translate.c

  Log Message:
  -----------
  target/mips: optimize indirect branches

Cc: Yongbok Kim <address@hidden>
Signed-off-by: Aurelien Jarno <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: bec5e2b97572d23360fb08ad9cb9c93b449a25f6
      
https://github.com/qemu/qemu/commit/bec5e2b97572d23360fb08ad9cb9c93b449a25f6
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/alpha/translate.c

  Log Message:
  -----------
  target/alpha: Implement WTINT inline

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 2d826cdc8a43ba1817a44f481f8dc8f08668b0a6
      
https://github.com/qemu/qemu/commit/2d826cdc8a43ba1817a44f481f8dc8f08668b0a6
  Author: Richard Henderson <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M target/alpha/translate.c

  Log Message:
  -----------
  target/alpha: Use goto_tb for fallthru between TBs

Signed-off-by: Richard Henderson <address@hidden>


  Commit: a0d4aac7467dd02e5657b79e867f067330266a24
      
https://github.com/qemu/qemu/commit/a0d4aac7467dd02e5657b79e867f067330266a24
  Author: Peter Maydell <address@hidden>
  Date:   2017-06-05 (Mon, 05 Jun 2017)

  Changed paths:
    M configure
    M cpu-exec.c
    M include/exec/exec-all.h
    M include/exec/tb-hash.h
    M include/qemu/atomic.h
    M target/alpha/translate.c
    M target/arm/translate-a64.c
    M target/arm/translate.c
    M target/arm/translate.h
    M target/hppa/translate.c
    M target/i386/translate.c
    M target/mips/translate.c
    M target/nios2/translate.c
    M target/s390x/translate.c
    M tcg-runtime.c
    M tcg/README
    M tcg/aarch64/tcg-target.h
    M tcg/aarch64/tcg-target.inc.c
    M tcg/arm/tcg-target.h
    M tcg/arm/tcg-target.inc.c
    M tcg/i386/tcg-target.h
    M tcg/i386/tcg-target.inc.c
    M tcg/ia64/tcg-target.h
    M tcg/mips/tcg-target.h
    M tcg/mips/tcg-target.inc.c
    M tcg/ppc/tcg-target.h
    M tcg/ppc/tcg-target.inc.c
    M tcg/s390/tcg-target.h
    M tcg/s390/tcg-target.inc.c
    M tcg/sparc/tcg-target.h
    M tcg/sparc/tcg-target.inc.c
    M tcg/tcg-op.c
    M tcg/tcg-op.h
    M tcg/tcg-opc.h
    M tcg/tcg-runtime.h
    M tcg/tcg.c
    M tcg/tcg.h
    M tcg/tci/tcg-target.h

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170605' into staging

Queued TCG patches

# gpg: Signature made Mon 05 Jun 2017 17:48:42 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <address@hidden>"
# gpg:                 aka "Richard Henderson <address@hidden>"
# gpg:                 aka "Richard Henderson <address@hidden>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170605: (26 commits)
  target/alpha: Use goto_tb for fallthru between TBs
  target/alpha: Implement WTINT inline
  target/mips: optimize indirect branches
  target/mips: optimize cross-page direct jumps in softmmu
  target/aarch64: optimize indirect branches
  target/aarch64: optimize cross-page direct jumps in softmmu
  target/hppa: Use tcg_gen_lookup_and_goto_ptr
  target/s390: Use tcg_gen_lookup_and_goto_ptr
  tcg/mips: implement goto_ptr
  tcg/arm: Implement goto_ptr
  tcg/arm: Clarify tcg_out_bx for arm4 host
  tcg/s390: Implement goto_ptr
  tcg/sparc: Implement goto_ptr
  tcg/aarch64: Implement goto_ptr
  tcg/ppc: Implement goto_ptr
  tb-hash: improve tb_jmp_cache hash function in user mode
  target/i386: optimize indirect branches
  target/i386: optimize cross-page direct jumps in softmmu
  target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr
  target/arm: optimize indirect branches
  ...

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/199e19ee538e...a0d4aac7467d
