qemu-devel

From: Richard Henderson
Subject: Re: [PATCH 10/15] Hexagon (target/hexagon) instructions with multiple definitions
Date: Thu, 25 Mar 2021 10:24:39 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1

On 3/24/21 8:50 PM, Taylor Simpson wrote:
Instructions with multiple definitions require special handling
because the generator wants to create a helper, but helpers can
only return a single result.  Therefore, we must override the
generated code.

The following instructions are added
     A4_addp_c        Rdd32 = add(Rss32, Rtt32, Px4):carry
                          Add with carry
     A4_subp_c        Rdd32 = sub(Rss32, Rtt32, Px4):carry
                          Sub with carry
     A5_ACS           Rxx32,Pe4 = vacsh(Rss32, Rtt32)
                          Add compare and select elements of two vectors
     A6_vminub_RdP    Rdd32,Pe4 = vminub(Rtt32, Rss32)
                          Vector min of bytes
     F2_invsqrta      Rd32,Pe4 = sfinvsqrta(Rs32)
                          Inverse square root approx
     F2_sfrecipa      Rd32,Pe4 = sfrecipa(Rs32, Rt32)
                          Reciprocal approx

One thing at a time. This is no longer port bring-up where large patches are unavoidable.


+int arch_recip_lookup(int index)
+{
+    index &= 0x7f;
+    unsigned const int roundrom[128] = {

static const uint16_t?  or is it in fact all 8-bit data?
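To illustrate the declaration style being asked about, here is a minimal sketch. The table contents, the 4-entry size, and the names demo_rom and demo_lookup are hypothetical, since the real roundrom data is not quoted here. The element type should be the narrowest one that holds the actual data.

```c
#include <stdint.h>

/*
 * Hypothetical 4-entry table, for illustration only.
 * "static const" gives the array a single read-only copy,
 * rather than an automatic array that is set up on each call.
 */
static const uint8_t demo_rom[4] = { 0x10, 0x20, 0x30, 0x40 };

static int demo_lookup(int index)
{
    /* Mask to the table size, as the patch does with 0x7f for 128 entries. */
    index &= 0x3;
    return demo_rom[index];
}
```

If any entry does not fit in 8 bits, uint16_t would be the element type instead.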

+int arch_invsqrt_lookup(int index)
+{
+    index &= 0x7f;
+    unsigned const int roundrom[128] = {

Likewise.

+/*
+ * Add or subtract with carry.
+ * Predicate register is used as an extra input and output.
+ * r5:4 = add(r1:0, r3:2, p1):carry
+ */
+#define fGEN_TCG_A4_addp_c(SHORTCODE) \
+    do { \
+        TCGv LSB = tcg_temp_new(); \
+        TCGv_i64 LSB_i64 = tcg_temp_new_i64(); \
+        TCGv_i64 tmp_i64 = tcg_temp_new_i64(); \
+        TCGv tmp = tcg_temp_new(); \
+        tcg_gen_add_i64(RddV, RssV, RttV); \
+        fLSBOLD(PxV); \
+        tcg_gen_extu_i32_i64(LSB_i64, LSB); \
+        tcg_gen_add_i64(RddV, RddV, LSB_i64); \
+        gen_carry_from_add64(tmp_i64, RssV, RttV, LSB_i64); \
+        tcg_gen_extrl_i64_i32(tmp, tmp_i64); \
+        f8BITSOF(PxV, tmp); \
+        tcg_temp_free(LSB); \
+        tcg_temp_free_i64(LSB_i64); \
+        tcg_temp_free_i64(tmp_i64); \
+        tcg_temp_free(tmp); \
+    } while (0)

You might as well implement this properly with tcg_gen_add2_i64.
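As a rough illustration of the idea, here is a plain-C model, not TCG code. tcg_gen_add2_i64(rl, rh, al, ah, bl, bh) performs a double-word add; add2_u64 and add_with_carry below are hypothetical stand-ins written only for this sketch. Chaining two such adds with zero high halves leaves the carry-out in the high word, so no separate carry computation is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a double-word add: (rl, rh) = (al, ah) + (bl, bh). */
static void add2_u64(uint64_t *rl, uint64_t *rh,
                     uint64_t al, uint64_t ah,
                     uint64_t bl, uint64_t bh)
{
    uint64_t lo = al + bl;
    uint64_t carry = lo < al;   /* unsigned wrap-around means a carry */

    *rl = lo;
    *rh = ah + bh + carry;
}

/* Returns a + b + cin (mod 2^64) and stores the carry-out in *cout. */
static uint64_t add_with_carry(uint64_t a, uint64_t b,
                               unsigned cin, unsigned *cout)
{
    uint64_t lo, hi;

    /* Step 1: (lo, hi) = (a, 0) + (cin, 0) */
    add2_u64(&lo, &hi, a, 0, (uint64_t)(cin & 1), 0);
    /* Step 2: (lo, hi) = (lo, hi) + (b, 0) */
    add2_u64(&lo, &hi, lo, hi, b, 0);

    assert(hi <= 1);            /* the high word is only ever the carry bit */
    *cout = (unsigned)hi;
    return lo;
}
```

In the TCG version the two steps would presumably be two tcg_gen_add2_i64 calls with a zero constant for the high inputs, with the resulting high word written back to the predicate.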

+
+/* r5:4 = sub(r1:0, r3:2, p1):carry */
+#define fGEN_TCG_A4_subp_c(SHORTCODE) \
+    do { \
+        TCGv LSB = tcg_temp_new(); \
+        TCGv_i64 LSB_i64 = tcg_temp_new_i64(); \
+        TCGv_i64 tmp_i64 = tcg_temp_new_i64(); \
+        TCGv tmp = tcg_temp_new(); \
+        tcg_gen_not_i64(tmp_i64, RttV); \
+        tcg_gen_add_i64(RddV, RssV, tmp_i64); \
+        fLSBOLD(PxV); \
+        tcg_gen_extu_i32_i64(LSB_i64, LSB); \
+        tcg_gen_add_i64(RddV, RddV, LSB_i64); \
+        gen_carry_from_add64(tmp_i64, RssV, tmp_i64, LSB_i64); \

Likewise.
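The subtract case follows the same pattern, since the quoted macro computes Rss + ~Rtt + carry-in. A plain-C model of that arithmetic is sketched below; sub_with_carry is a hypothetical name used only for illustration and this is not TCG code.

```c
#include <stdint.h>

/* Returns a + ~b + cin (mod 2^64) and stores the carry-out in *cout. */
static uint64_t sub_with_carry(uint64_t a, uint64_t b,
                               unsigned cin, unsigned *cout)
{
    uint64_t nb = ~b;
    uint64_t t = a + (cin & 1);
    unsigned c1 = t < a;        /* carry from adding the carry-in */
    uint64_t r = t + nb;
    unsigned c2 = r < t;        /* carry from adding ~b */

    /* At most one of the two partial carries can be set. */
    *cout = c1 | c2;
    return r;
}
```

With a carry-in of 1 this yields a - b, and the carry-out is 1 exactly when no borrow occurred.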

Ignoring the rest.  Too large.

r~


