Re: [bug-mes] mescc and Load/Store architectures


From: Jeremiah
Subject: Re: [bug-mes] mescc and Load/Store architectures
Date: Wed, 13 Feb 2019 22:46:25 +0000

> Yes, but does that imply that there is a user for ARM displacements
> already? (the one using the %label and !label)
Yes, that would be me, in my previous attempt at adding ARM support to
mescc-tools before I got pulled off to finish cc_x86.s.
It is fine.

> I don't want to break it, but I can't think of any place where ARM would
> have 8 bit offsets in branches.  What are displacements used for?
> Branches and what?
Breaking it is fine, it isn't developed enough to be self-hosting yet.

> DEFINE AND_eax_ebx
>   Where's the target?
>     If the target is eax, e0000001.
>     If the target is ebx, e0011000.

The default target for x86 instructions is EAX (or RAX for AMD64)
because at its heart it is an accumulator architecture.
I can also change the M1-macro instruction to be much more explicit if
you desire.

> DEFINE CALL_IMMEDIATE
>  That's difficult.  Need to use a scratch register.
That is good because none of the values in any of the registers matter
before a call, so you can use any of them. (EAX must not be clobbered
after a return, however, as it contains the returned value.)

>  Note: ARM immediates are 8 bit value and 4 bit rotation.  The above
>  hard-codes the rotation to 0.
We could do a load of a nearby address and a jump over the 4 bytes of
immediate value if ARM supports PC-relative loads.
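
To make the quoted "8 bit value and 4 bit rotation" rule concrete, here
is a minimal Python sketch. It assumes the standard ARM data-processing
immediate form: an 8-bit value rotated right by twice the 4-bit rotation
field. The function names are hypothetical and not part of mescc-tools.
Values for which it returns None are the ones that would need the
PC-relative load fallback described above.

```python
MASK32 = 0xFFFFFFFF


def rotate_left32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def encode_arm_immediate(value):
    """Return (rotation, imm8) such that imm8 rotated right by
    2 * rotation equals value, or None if no such encoding exists."""
    value &= MASK32
    for rotation in range(16):
        # Undo the rotate-right by rotating left by the same amount.
        imm8 = rotate_left32(value, 2 * rotation)
        if imm8 <= 0xFF:
            return rotation, imm8
    return None
```

For example, 0xFF000000 fits (rotation 4, imm8 0xFF), while a value such
as 0x12345678 has no such encoding and would have to be loaded from
memory.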

> DEFINE CMP
>   Compare what with what?
The default compare is EAX and EBX (R0 and R1).
But you are right, I probably should take the time to make the M1-macro
generated by M2-Planet more explicit.

> DEFINE COPY_edi_to_ebp
>   I have no idea what ARM register would correspond to di.
EDI in that context is a scratch register that needs to not be altered
while values are pushed onto the stack for the function call.

> DEFINE DIVIDE_eax_by_ebx_into_eax
>   ARM division instruction does not exist.  It's possible to use UMULL
>   for a lot of special cases.
We also could add a divide function to the runtime and just call it for
the ARM port.
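
A runtime divide function could be as simple as shift-and-subtract long
division. The sketch below is in Python for illustration only. It
handles unsigned 32-bit values; the name udiv32 is hypothetical, and a
signed variant would additionally need sign handling around it.

```python
def udiv32(dividend, divisor):
    """Unsigned 32-bit shift-and-subtract division.

    Returns (quotient, remainder)."""
    dividend &= 0xFFFFFFFF
    divisor &= 0xFFFFFFFF
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    # Walk the dividend from its most significant bit downwards.
    for bit in range(31, -1, -1):
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder
```

The loop uses only shifts, compares and subtracts, all of which map
directly onto instructions ARM does have.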

> DEFINE JUMP e590f000 # LDR %pc, [%r0]
>   Is it immediate and direct?  That would be "B".
>   Is that an absolute jump? Then not "B" :)
That would be a relative jump (like jump over the next 3 instructions or
jump back the last 8 instructions)

> DEFINE JUMP_EQ8
>   What's that?
It is a short form of JUMP_EQ and can be replaced rather easily.

> DEFINE LOAD_BASE_ADDRESS_eax
>   What's that?
x86 has EBP which is usually the pointer to a stack frame and the
instruction loads the absolute address of that piece of memory into EAX

> DEFINE LOAD_EFFECTIVE_ADDRESS
>   CISC
Actually it would translate into ESP+constant into EAX; essentially it is
the only 3op arithmetic instruction in x86. (the other forms just simply
select alternate target registers)

> DEFINE LOAD_ESP_IMMEDIATE_into_eax
>   What does that do?
Actually that one is legacy and can be completely removed. (It only
exists as an M1-Macro definition and nowhere else.)

> DEFINE LOAD_INTEGER
>   What does that mean?
That would be: using the address in EAX, load the 4 bytes of data from
that address into EAX. (The other variants do the exact same thing with
their respective registers.)

> DEFINE MOVEZBL 0FB6C0
>   Zero-Extend which register?
>   UXTB would zero-extend one register into another register (can also
>     specify the same register)
That would be EAX; this is because x86 doesn't actually zero the rest of
the register when it loads a byte. If ARM properly zeros the rest of the
register when loading bytes, this instruction can be ignored entirely.

> DEFINE NOT_eax
>   Is the result supposed to be 2^32 - 1 - eax?
It would be the assembly version of the C code ~EAX (aka bit flip the
result)
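
For a 32-bit register the two descriptions agree: flipping every bit
gives the same result as 2^32 - 1 - eax. A small Python sketch of that
equivalence follows (not32 is a hypothetical helper name; the mask is
needed because Python integers are unbounded):

```python
MASK32 = 0xFFFFFFFF


def not32(value):
    """Bitwise NOT of a 32-bit register value."""
    return ~value & MASK32
```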

> DEFINE RETURN C3
>   That depends on whether you want to emulate x86 semantics or doing it
>     the ARM way.
>   For the ARM way, e1a0f00e mov %pc, %lr.
>   For the x86 semantics: e49df004 pop {%pc}, and I'm not sure that
>     the displacement is correct then.

Whichever way makes it easier for whoever ends up doing the hand
assembly version of M2-Planet in M1-Macro.

> DEFINE SAL_eax_Immediate8
>   No idea what an "arithmetic" LEFT shift does differently compared to a
>   logical left shift.
On x86 there is no difference: SAL and SHL are the same operation. Only
the arithmetic right shift (SAR) behaves differently, because it
duplicates the leftmost (sign) bit. So this does not matter for ARM and
can safely be ignored.

> DEFINE SETE
>   Which register?
>   ARM has conditions in each instruction, so this need never comes up.
>     You could do e3a00000 # mov %r0, #0
>     and then 03a00001 # moveq %r0, #1

It sets EAX to the value of 1 if the ZERO flag is set (aka if the
previous comparison resulted in the 2 values being equal).

And we probably could replace the sequence with a better one for ARM.

> DEFINE STORE_CHAR
>   Store where?
>   How big is a char?
>   Is it assumed that it's an immediate?
>   Is it storing only one byte into the memory cell or is it
>     padding it?
The address in EBX is where it is stored.
It is 8 bits in size.
It is assumed that the value is in EAX.
It is storing only one byte into the memory cell.

> DEFINE STORE_INTEGER
>   Store where?
>   How big is an integer?
The address in EBX is where it is stored.
The integer is 32bits in size

> DEFINE XCHG_eax_ebx
>   That... is difficult without a scratch register.
>   I guess push {%r0}, push {%r1}, pop {%r0}, pop {%r1}
or XOR r0 r1 -> r0; XOR r0 r1 -> r1; XOR r0 r1 -> r0
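
The three-XOR sequence can be checked with a short Python sketch
(xor_swap is a hypothetical name; the two arguments stand in for r0 and
r1). Note that the trick relies on r0 and r1 being two distinct
registers; applied to one and the same register it would zero it.

```python
def xor_swap(r0, r1):
    """Swap two register values without a scratch register."""
    r0 = r0 ^ r1  # r0 now holds r0 XOR r1
    r1 = r0 ^ r1  # r1 now holds the original r0
    r0 = r0 ^ r1  # r0 now holds the original r1
    return r0, r1
```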


> base &= ~3;
M2-Planet doesn't yet support &= so I replaced it with base = base & ~3;
and your patch has been accepted.

> Oh, and no ARM instructions update flags.
> If one wants that, one has to set a bit indicating that in the instruction.
> I didn't set that bit anywhere in the previous e-mail.
Well we could skip setting flags entirely for ARM if you like


> If we want to be "x86 compatible", we could use "adds" instead of "add",
> "subs" instead of "sub" etc for ALU instructions.
> Where would we need that?
The ARM port doesn't have to be compatible any further than the absolute
minimum. The only thing that must be absolutely the same: when I pass
--architecture, the resulting output *MUST* be identical regardless of
the host architecture. (In other words, you can do anything you like
inside of M2-Planet, but with --architecture x86 or --architecture
AMD64, etc., the results have to be identical to the host-native output.)

> oops, the values are big numbers, so they can't be used directly in
> mescc-tools for ARM little-endian like this, we'd have to endian-swap
> them.
Big numbers are fine, that is why M1 and hex2 have --LittleEndian

> When I wrote e0811000 I expected the integer 0xe0811000 to be emitted to
> the binary output file, but it's not (for little endian).
Actually it was; the problem is that ARM encodes its instructions in
little-endian format (it threw me very hard). In fact, all of the
documentation makes the exact same implicit assumption.

One would expect left to right to be low to high bits, but that is
incorrect; rather, left to right is high to low bits, which is why
https://github.com/oriansj/mescc-tools/blob/master/test/test11/hello.M1
looks so weird.
In the hex text, the 4-bit condition field isn't on the left side but on
the right: not at the very end, but a nybble away from the end.

Blood-elf and objdump -d saved me lots of trouble in finding the real
encodings of instructions.

> hex2 rather takes every 2 consecutive chars as the hex code of one byte
> and emits that.
Yes, that is what the spec explicitly requires, such that it is trivial
for humans inspecting hex2 to hand-verify the results. I would much
rather make the computer's job harder than that of the humans who have
to check for malicious code.
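
The "every 2 consecutive chars are one byte" rule can be sketched in a
few lines of Python. This is a simplified illustration, not the hex2
implementation: it ignores labels and comments, and the function name
hex_pairs_to_bytes is hypothetical. Whitespace is skipped here only so
that both spellings from the quoted text can be tried.

```python
def hex_pairs_to_bytes(text):
    """Turn hex text into bytes, two hex digits per byte, in order."""
    digits = "".join(text.split())
    if len(digits) % 2 != 0:
        raise ValueError("odd number of hex digits")
    return bytes(int(digits[i:i + 2], 16) for i in range(0, len(digits), 2))
```

Written as e0811000, the bytes come out as e0 81 10 00, which a
little-endian machine reads back as 0x001081e0. Written as 00 10 81 e0,
the same machine reads 0xe0811000.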

> So I would like to instead write: 00 10 81 e0.
> But in M1 DEFINEs it seems that I'm not allowed to enter spaces.
Well, we could change the spec in that regard, but we would then have to
forbid all line comments on the same line as macro definitions. If
janneke is willing to also accept that limitation, I will make that change.

> That's rather unfortunate.  Without having read the hex2 source code I'd
> interpret e0811000 as the number 0xe0811000, which hex2 doesn't.  Spaces
> would make it unambiguous.
It is the number 0xe0811000 in big-endian format, but ARM is little
endian, so that result is expected. We could add some logic to
M1-Macro such that when --LittleEndian is passed, the DEFINEs are
flipped as well.
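
Flipping a DEFINE would amount to reversing its byte pairs. Here is a
minimal Python sketch of that idea; flip_define is a hypothetical name
and this is not existing M1-Macro behaviour.

```python
def flip_define(hex_text):
    """Reverse the byte order of a hex string, keeping each pair intact."""
    if len(hex_text) % 2 != 0:
        raise ValueError("odd number of hex digits")
    pairs = [hex_text[i:i + 2] for i in range(0, len(hex_text), 2)]
    return "".join(reversed(pairs))
```

With this, e0811000 becomes 001081e0, and the condition nybble "e" ends
up one nybble away from the end of the text rather than at the start.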


> currently, mescc-tools uses "Architectural"_displacement for all of
> the following prefixes:  ! @ ~ %
Yes to express relative displacement required for loads, stores and jumps

> and it uses an absolute target for all of the following prefixes: $ &
Yes for when architectures require absolute addresses or when one wishes
to hardcode structs with pointers

> Furthermore, blood-elf emits debug information, containing
> %...>... in order to determine sizes.
Yes, as janneke suggested, as a solution to his immediate problem of
putting the calculated size of the binary into the ELF header.

> Wouldn't that cause debug metadata to be architecture-dependent? (right
> now in my local ARM branch I get a failure because the debug info is not
> word-aligned.  Well, it isn't.  I started
> aligning it and then I stopped myself, because what was I doing?  It
> makes no sense to do that!).
Absolutely correct, and blood-elf may have to be revised to perform
architecture-specific padding (and grow an --architecture flag).

> On RISC architectures, the Fetch stage fetching the instructions is
> (mostly) independent of any other stage.  This means that the Fetch
> stage happily marches on (increases program counter)
> no matter what you do, including branches (the program counter will
> eventually be updated for the branch, but only *eventually*).
> Therefore, it would be natural for an
> "Architectural"_displacement to take this into account.  Also, on ARM
> the branch immediate is the number of INSTRUCTIONS, not the number of
> Bytes, to skip.
That however makes for one heck of a problem in mescc-tools
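
To show how far this is from a plain byte displacement, here is a Python
sketch of the ARM (A32) branch immediate. It assumes the usual rule that
the program counter reads as the branch instruction's address plus 8 and
that the 24-bit signed field counts 4-byte instructions. The function
name is hypothetical.

```python
def arm_branch_immediate(instruction_address, target_address):
    """Return the 24-bit immediate field for an ARM B/BL instruction."""
    # The fetch stage is two instructions ahead of the branch itself.
    offset = target_address - (instruction_address + 8)
    if offset % 4 != 0:
        raise ValueError("branch target is not word aligned")
    words = offset // 4
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("branch target out of range")
    # Two's complement, truncated to the 24-bit field.
    return words & 0xFFFFFF
```

A branch to itself thus encodes as 0xFFFFFE (minus two instructions),
not as a displacement of zero.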

> However, if the data structure size determination is done like blood-elf
> does it, and we took the above into account, then that would break the
> debug info size calculation.  Same for the ELF headers.
Ok, given only the information in a hex2 file, how do we do it?

> Maybe I misunderstood the purpose of Architectural_displacement, but as
> it is now, I don't see how it can specialize per architecture because if
> it was any different than (target - base)
> then it would break size determination by users.
It is a holdover from when we supported an architecture that had its
displacement calculated from the start of the instruction rather than
the end of the instruction, aka (target - base - 4).

> As a better way, I would suggest to clearly separate which are
> architecture-dependent displacements (to be put into instructions) and
> which are architecture-independent "displacements"
> (sizes and absolute addresses, really) in the user interface.
> I suggest to change storePointer to read:
Umm, but displacement already is target - base, where the > syntax
allows an alternative base address, thus allowing the measurement in
bytes between any 2 labels.
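
The target - base rule with an optional alternative base can be sketched
in Python as follows. The label table, the label names and the function
name are all hypothetical; the default base used when no > is given is
passed in as `position` and is an assumption of this sketch, not a
statement about what hex2 uses.

```python
def displacement(labels, target, base_label=None, position=0):
    """Distance in bytes from the base to the target label.

    With base_label set (the part after ">"), measure between two
    labels; otherwise measure from the caller-supplied position."""
    base = labels[base_label] if base_label is not None else position
    return labels[target] - base


# Hypothetical label addresses, for illustration only.
labels = {"ELF_text": 0x54, "ELF_end": 0x154}
size_of_text = displacement(labels, "ELF_end", "ELF_text")
```

Here size_of_text is the byte distance between the two labels, which is
the kind of size measurement the ELF header and debug info rely on.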

> And change size-determining users to do "&foo>bar" instead of "%foo>bar"
> if they mean size determination.
I don't quite see that working. I'd rather change the meaning entirely
of address@hidden& for ARM but honestly I haven't made up my mind yet on
this. Give me more time to think about it.


> on ARM (and many other architectures) it is customary to also pad data
> structures in order to align members.  I don't see any mechanism in hex2
> (and/or M1 - probably not) in order to be
> able to force alignment at the current position.  It would be easy to
> add such a thing, it would just be some pseudo-instruction that would do
> the following (example for 4 Byte alignment):
Well, the alignment would have to happen in hex2, and we could make
something like < force hex2 to generate as many zeros as required to
align to some architecture-defined rule.
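
The zero padding itself is a small calculation. Below is a Python sketch
that assumes a power-of-two alignment; the function names are
hypothetical and nothing like this exists in hex2 yet.

```python
def padding_needed(position, alignment=4):
    """Number of zero bytes needed to reach the next aligned position."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return -position & (alignment - 1)


def align_output(output, alignment=4):
    """Append zero bytes so the output length is a multiple of alignment."""
    return output + bytes(padding_needed(len(output), alignment))
```

With 5 bytes already emitted and 4-byte alignment, 3 zero bytes are
added; an already aligned position gets none.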


> On ARM, "alignment" is meant in a very low-level way, "least significant
> bits of the address == 0".
> Could we add one?
Absolutely

> This patchset clearly separates architecture-dependent displacements
> from architecture-independent positions and sizes.
> Previously, users were supposed to use "%foo>bar" in order to determine
> sizes.  For unaligned data structures that is not possible.
> Furthermore, after "%" is specialized to emit
> architecture-dependent immediates for branch instructions, this would
> not return the size in bytes anymore (see PATCH 2/2).
> Now, it is supported to use "&foo>bar" in order to determine sizes in
> bytes.
> On i686 and on x86_64 the patchset has no observable effect--however, on
> ARM, it will make it possible to use both branches and ELF headers (and
> debug information) in the same output

I am not sure that is the correct course of action and have not merged
those patches yet.

We literally have G-Z, g-z and the rest of the ASCII symbols
available as prefixes to express that issue.

Or we could extend hex2 to support !! and @@ and %% for ARM

Or we could finally add ^ and have it treat what follows it as having to
be architecture specific.

Let us play with possible alternate solutions for a couple days to see
what works and will not make the task of converting the C code into
assembly by hand difficult.

-Jeremiah


