Re: [PATCH 2/3] target/hppa: mask offset bits in gva
From: Sven Schnelle
Subject: Re: [PATCH 2/3] target/hppa: mask offset bits in gva
Date: Sun, 24 Mar 2024 19:41:28 +0100
Hi Richard,
Richard Henderson <richard.henderson@linaro.org> writes:
> In particular Figure 2-14 for "data translation disabled" may be
> instructive. Suppose the cpu does not implement all of the physical
> address lines (true for all extant pa-risc cpus; qemu implements 40
> bits to match pa-8500 iirc). Suppose when reporting a trap with
> translation disabled, it is a truncated physical address that is used
> as input to Figure 2-14.
>
> If that is so, then the fix might be in hppa_set_ior_and_isr. Perhaps
>
> - env->cr[CR_ISR] &= 0x3fffffff;
> + env->cr[CR_ISR] &= 0x301fffff;
>
> Though my argument would suggest the mask should be 0xff for the
> 40-bit physical address, which is not what you see at all, so perhaps
> the thing is moot. I am at a loss to explain why or how HP-UX gets a
> 7-bit hole in the ISR result.
>
> On the other hand, there are some not-well-documented shenanigans (aka
> implementation defined behaviour) between Figure H-8 and Figure H-11,
> where the 62-bit absolute address is expanded to a 64-bit logical
> physical address and then compacted to a 40-bit implementation
> physical address.
>
> We've already got hacks in place for this in hppa_abs_to_phys_pa2_w1,
> which just truncates everything down to 40 bits. But that's probably
> not what the processor is really doing.
>
> Anyhow, will you please try the hppa_set_ior_and_isr change and see if
> that fixes your HP-UX problems?
The problem occurs with data address translation enabled. Without it
everything works, which is not surprising because no exception can
happen there. But as soon as the kernel enables address translation, it
hits a data TLB miss exception because it can't find 0xfffffffffffb0500
in the page tables. Trying to truncate the ISR in hppa_set_ior_and_isr()
for the data-translation-enabled case leads to this loop:
hppa_tlb_fill_excp env=0x55bf06e976e0 addr=0x3ffffffffffb0500 size=4 type=0 mmu_idx=9
hppa_tlb_find_entry env=0x55bf06e976e0 ent=0x55bf06e97b30 valid=1 va_b=0x200000 va_e=0x2fffff pa=0x200000
hppa_tlb_get_physical_address env=0x55bf06e976e0 ret=-1 prot=5 addr=0x26170c phys=0x26170c
hppa_tlb_flush_ent env=0x55bf06e976e0 ent=0x55bf06e97bf0 va_b=0x301ffffffffb0000 va_e=0x301ffffffffb0fff pa=0xfffffffffffb0000
hppa_tlb_itlba env=0x55bf06e976e0 ent=0x55bf06e97bf0 va_b=0x301ffffffffb0000 va_e=0x301ffffffffb0fff pa=0xfffffffffffb0000
hppa_tlb_itlbp env=0x55bf06e976e0 ent=0x55bf06e97bf0 access_id=0 u=1 pl2=0 pl1=0 type=1 b=0 d=0 t=0
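So QEMU is looking up 0x3ffffffffffb0500 in the TLB, can't find it, and
raises an exception. HP-UX says: "ah nice, I have a translation for
you", but the inserted entry (va_b=0x301ffffffffb0000) doesn't match the
lookup address, because we're only stripping the bits in the ISR and not
in the address used for the TLB lookup.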
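The mismatch in the trace can be reproduced with a few lines of C. This is only a sketch: `struct va_range`, `range_contains()` and `mask_gva()` are hypothetical names, not QEMU's, and the assumption is that the bits dropped by the 0x301fffff ISR mask land in the upper 32 bits of the lookup address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the virtual range of a TLB entry (not QEMU's struct). */
struct va_range {
    uint64_t va_b;  /* first address covered by the entry */
    uint64_t va_e;  /* last address covered by the entry */
};

/*
 * Bits present in the 0x3fffffff mask but absent from 0x301fffff,
 * shifted into the upper half of the 64-bit address (assumption).
 */
#define ISR_HOLE ((uint64_t)(0x3fffffffu & ~0x301fffffu) << 32)

static bool range_contains(struct va_range r, uint64_t addr)
{
    return addr >= r.va_b && addr <= r.va_e;
}

/* Clear the hole bits in a lookup address as well, not only in the ISR. */
static uint64_t mask_gva(uint64_t addr)
{
    return addr & ~ISR_HOLE;
}
```

With the entry from the trace (va_b=0x301ffffffffb0000, va_e=0x301ffffffffb0fff), `range_contains()` is false for 0x3ffffffffffb0500 and true for `mask_gva(0x3ffffffffffb0500)`, which is 0x301ffffffffb0500. `ISR_HOLE` evaluates to 0x0fe0000000000000, i.e. seven bits.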
As I was a bit puzzled at first about what was going on, I wrote a
small program to dump the page tables:
680000: val=000f47ff301fffff r2=110e0f0000000001 r1=01ffffffffe8ffe0
phys=fffffffff47ff000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
680020: val=000f47fe301fffff r2=110e0f0000000001 r1=01ffffffffe8ffc0
phys=fffffffff47fe000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
680060: val=000f47fc301fffff r2=110e0f0000000001 r1=01ffffffffe8ff80
phys=fffffffff47fc000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5860: val=000fed3c301fffff r2=010e000000000001 r1=01fffffffffda780
phys=fffffffffed3c000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d58e0: val=000fed38301fffff r2=010e000000000001 r1=01fffffffffda700
phys=fffffffffed38000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d59a0: val=000fed32301fffff r2=010e000000000001 r1=01fffffffffda640
phys=fffffffffed32000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d59e0: val=000fed30301fffff r2=110e0f0000000001 r1=01fffffffffda600
phys=fffffffffed30000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a00: val=000fed2f301fffff r2=010e000000000001 r1=01fffffffffda5e0
phys=fffffffffed2f000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a20: val=000fed2e301fffff r2=010e000000000001 r1=01fffffffffda5c0
phys=fffffffffed2e000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a40: val=000fed2d301fffff r2=010e000000000001 r1=01fffffffffda5a0
phys=fffffffffed2d000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a60: val=000fed2c301fffff r2=010e000000000001 r1=01fffffffffda580
phys=fffffffffed2c000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5a80: val=000fed2b301fffff r2=010e000000000001 r1=01fffffffffda560
phys=fffffffffed2b000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5aa0: val=000fed2a301fffff r2=010e000000000001 r1=01fffffffffda540
phys=fffffffffed2a000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ac0: val=000fed29301fffff r2=010e000000000001 r1=01fffffffffda520
phys=fffffffffed29000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ae0: val=000fed28301fffff r2=010e000000000001 r1=01fffffffffda500
phys=fffffffffed28000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b00: val=000fed27301fffff r2=010e000000000001 r1=01fffffffffda4e0
phys=fffffffffed27000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b20: val=000fed26301fffff r2=010e000000000001 r1=01fffffffffda4c0
phys=fffffffffed26000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b40: val=000fed25301fffff r2=010e000000000001 r1=01fffffffffda4a0
phys=fffffffffed25000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b60: val=000fed24301fffff r2=010e000000000001 r1=01fffffffffda480
phys=fffffffffed24000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5b80: val=000fed23301fffff r2=010e000000000001 r1=01fffffffffda460
phys=fffffffffed23000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5ba0: val=000fed22301fffff r2=110e0f0000000001 r1=01fffffffffda440
phys=fffffffffed22000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5bc0: val=000fed21301fffff r2=010e000000000001 r1=01fffffffffda420
phys=fffffffffed21000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5be0: val=000fed20301fffff r2=010e000000000001 r1=01fffffffffda400
phys=fffffffffed20000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5de0: val=000fed10301fffff r2=010e000000000001 r1=01fffffffffda200
phys=fffffffffed10000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7d5fe0: val=000fed00301fffff r2=110e0f0000000001 r1=01fffffffffda000
phys=fffffffffed00000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7f07e0: val=000fffc0301fffff r2=010e000000000001 r1=01fffffffffff800
phys=fffffffffffc0000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
7f09e0: val=000fffb0301fffff r2=110e0f0000000001 r1=01fffffffffff600
phys=fffffffffffb0000 4K aid=1 pl1=0, pl2=0 type=1 (DATA RW)
'val' is the value constructed from IOR/ISR, r1/r2 are the args for the
idtlbt instructions, while the GVA just stays in IOR/ISR. If you look
at the val value, you'll recognize the 301fffff part. At first I
assumed some bug when creating the page tables, but dumping the page
tables on my C3750/J6750 showed the same values.
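For reading the dump, the following sketch splits a 'val' into the pieces that are visible in the listed entries. The field boundaries and the names (`struct val_fields`, `split_val`) are assumptions inferred from the dumped values, not taken from HP-UX or QEMU sources.

```c
#include <stdint.h>

/* Hypothetical view of a dumped 'val'; layout inferred from the dump only. */
struct val_fields {
    uint32_t upper;    /* bits 52..63 (LSB numbering), zero in all entries shown */
    uint32_t vpn20;    /* bits 32..51, matches bits 12..31 of 'phys' in the dump */
    uint32_t isr_low;  /* bits 0..31, the recurring 0x301fffff pattern */
};

static struct val_fields split_val(uint64_t val)
{
    struct val_fields f;

    f.upper = (uint32_t)(val >> 52);
    f.vpn20 = (uint32_t)((val >> 32) & 0xfffff);
    f.isr_low = (uint32_t)val;
    return f;
}
```

For the last entry, `split_val(0x000fffb0301fffff)` gives vpn20 = 0xfffb0 and isr_low = 0x301fffff, which lines up with phys=fffffffffffb0000.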
The fastpath of the fault handler is:
$i_dtlb_miss_2_0
$TLB$:0002a1e0 02 a0 08 a9 mfctl IOR,r9
$TLB$:0002a1e4 d9 21 0a 6c extrd,u,* r9,51,20,r1
$TLB$:0002a1e8 02 80 08 a8 mfctl ISR,r8
$TLB$:0002a1ec 35 18 00 00 copy r8,r24
$TLB$:0002a1f0 f0 28 06 96 depd,* r8,43,10,r1
$TLB$:0002a1f4 d9 11 1a aa extrd,u,* r8,53,54,r17
dtlb_bl_patch_2_0
$TLB$:0002a1f8 e8 00 18 80 b dtlbmss_PCXU
$TLB$:0002a1fc 0a 21 02 91 xor r1,r17,r17
dtlbmss_PCXU
$TLB$:0002ae40 d9 19 03 e0 extrd,u,* r8,31,32,r25
$TLB$:0002ae44 0b 21 02 99 xor r1,r25,r25
$TLB$:0002ae48 f3 19 0c 0c depd,* r25,31,20,r24
pdir_base_patch_017
$TLB$:0002ae4c 20 20 00 0a ldil 0x500000,r1
pdir_shift_patch_017
$TLB$:0002ae50 f0 21 00 00 depd,z,* r1,0x3f,0x20,r1
pdir_mask_patch_017
$TLB$:0002ae54 f0 31 04 a8 depd,* r17,58,24,r1
$TLB$:0002ae58 0c 20 10 d1 ldd 0x0(r1),r17
$TLB$:0002ae5c bf 11 20 5a cmpb,*<>,n r17,r24,d_target_miss_PCXU
$TLB$:0002ae60 50 29 00 20 ldd 0x10(r1),r9
$TLB$:0002ae64 0c 30 10 c8 ldd 0x8(r1),r8
$TLB$:0002ae68 d9 10 02 de extrd,u,* r8,0x16,0x2,r16
$TLB$:0002ae6c 8e 06 20 12 cmpib,<>,n 0x3,r16,make_nop_if_split_TLB_2_0_7
$TLB$:0002ae70 05 09 18 00 idtlbt r9,r8
$TLB$:0002ae74 00 00 0c a0 rfi,r
$TLB$:0002ae78 08 00 02 40 nop
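To make the handler easier to follow, here is a rough C model of how the tag compared against the page-table entry (r24 at the cmpb) seems to be built. It reflects my reading of the disassembly above; `extrd_u()`, `depd()` and `dtlb_tag()` are hypothetical helpers, and the PDIR hash and the patched base/shift/mask instructions are left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * extrd,u r,pos,len,t: extract an unsigned field of 'len' bits whose
 * rightmost bit is 'pos' (PA-RISC numbering: bit 0 = MSB, bit 63 = LSB).
 */
static uint64_t extrd_u(uint64_t r, unsigned pos, unsigned len)
{
    assert(pos <= 63 && len >= 1 && len <= pos + 1);
    uint64_t v = r >> (63 - pos);
    return len == 64 ? v : v & ((UINT64_C(1) << len) - 1);
}

/*
 * depd r,pos,len,t: deposit the low 'len' bits of 'src' into 'target'
 * so that the field's rightmost bit is 'pos'; other bits are preserved.
 */
static uint64_t depd(uint64_t src, unsigned pos, unsigned len, uint64_t target)
{
    assert(pos <= 63 && len >= 1 && len <= pos + 1);
    uint64_t field = len == 64 ? ~UINT64_C(0) : (UINT64_C(1) << len) - 1;
    uint64_t mask = field << (63 - pos);
    return (target & ~mask) | ((src << (63 - pos)) & mask);
}

/* Tag built by the fast path from IOR (r9) and ISR (r8). */
static uint64_t dtlb_tag(uint64_t ior, uint64_t isr)
{
    uint64_t r1 = extrd_u(ior, 51, 20);   /* extrd,u,* r9,51,20,r1  */
    uint64_t r24 = isr;                   /* copy r8,r24            */
    r1 = depd(isr, 43, 10, r1);           /* depd,* r8,43,10,r1     */
    uint64_t r25 = extrd_u(isr, 31, 32);  /* extrd,u,* r8,31,32,r25 */
    r25 ^= r1;                            /* xor r1,r25,r25         */
    return depd(r25, 31, 20, r24);        /* depd,* r25,31,20,r24   */
}
```

Assuming ISR = 0x301fffff and an IOR whose low 32 bits are 0xfffb0500, `dtlb_tag()` returns 0x000fffb0301fffff, the 'val' of the last entry in the dump.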
So the patch above was the only thing I could come up with. If you have
any better idea, let me know.
I also patched Linux to execute exactly the same instruction with the
same address (space is 0), and I've seen different ISR/IOR values
compared to the values presented when HP-UX is running. I think the
only explanation is that HP-UX or the firmware switches the behaviour
at runtime.