From: Richard Henderson
Subject: Re: [RFC PATCH v2 5/6] target/riscv: rvv: Optimize v[l|s]e8.v with limitations
Date: Sun, 2 Jun 2024 12:45:59 -0500
User-agent: Mozilla Thunderbird

On 5/31/24 12:44, Max Chou wrote:
> The vector unit-stride load/store instructions (e.g. vle8.v/vse8.v)
> perform continuous load/store. We can replace the corresponding helper
> functions by TCG ops to copy more data at a time with following
> assumptions:
>
> * Perform virtual address resolution once for entire vector at beginning
> * Without mask
> * Without tail agnostic
> * Both host and target are little endian
>
> Signed-off-by: Max Chou <max.chou@sifive.com>

Why are you generating all of this inline? This expansion is very large. I would expect you to get better performance with a helper function.
AGAIN, please see the Arm implementation.

r~
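
For illustration only, the out-of-line alternative suggested above might look roughly like the sketch below. It is not code from the patch or from the Arm implementation. `sketch_vle8_helper` and `VLEN_BYTES` are hypothetical names, and the sketch assumes the conditions listed in the quoted commit message: the guest address has already been resolved to a host pointer once for the whole access, no mask is applied, tail elements are left undisturbed, and host and target are both little endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vector register size in bytes; not a QEMU constant. */
#define VLEN_BYTES 16

/*
 * Hypothetical out-of-line helper for an unmasked unit-stride byte load.
 *
 * vd       - destination vector register storage (VLEN_BYTES bytes)
 * host_src - host pointer for the first element, assumed to have been
 *            translated once by the caller for the entire access
 * vl       - number of active byte elements
 *
 * Returns the number of bytes actually copied.
 */
static size_t sketch_vle8_helper(uint8_t *vd, const uint8_t *host_src,
                                 size_t vl)
{
    assert(vd != NULL && host_src != NULL);

    /* Never write past the end of the destination register. */
    if (vl > VLEN_BYTES) {
        vl = VLEN_BYTES;
    }

    /*
     * With matching little-endian host and target and 8-bit elements,
     * the active elements are one contiguous byte range, so a single
     * bulk copy covers them. Bytes at index vl and above are not
     * touched, which leaves the tail undisturbed.
     */
    memcpy(vd, host_src, vl);
    return vl;
}
```

The point of the sketch is that the translated code would only need to emit one call, while the copy loop lives in a single compiled function instead of being expanded inline at every instruction.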