From: Peter Maydell
Subject: Re: [Qemu-arm] [PATCH v14 2/9] ACPI: Add APEI GHES table generation and CPER record support
Date: Tue, 9 Jan 2018 16:51:31 +0000
On 3 January 2018 at 02:21, gengdongjiu <address@hidden> wrote:
> On 2017/12/28 22:18, Igor Mammedov wrote:
>> On Thu, 28 Dec 2017 13:54:11 +0800
>> Dongjiu Geng <address@hidden> wrote:
>>> In order to simulation, we hard code the error
>>> type to Multi-bit ECC.
>> Not sure what this is about, care to elaborate?
>
> please see Memory Error Record in [1], in which the "Memory Error Type"
> field is used to describe the error type, such as Multi-bit ECC or Parity
> Error etc. Because KVM or host does not pass the memory error type to Qemu,
> so Qemu does not know what is the error type for the memory section. Hence
> we let QEMU simulate the error type to Multi-bit ECC.
>
> [1]:
> UEFI Spec 2.6 Errata A:
>
> "N.2.5 Memory Error Section"
> ------------------+-------------+-------------+---------------------------------------------
> Mnemonic          | Byte Offset | Byte Length | Description
> ------------------+-------------+-------------+---------------------------------------------
> ........          | ........... | ........... | ...........
> ------------------+-------------+-------------+---------------------------------------------
> Memory Error Type | 72          | 1           | Identifies the type of error that occurred:
>                   |             |             |   0 - Unknown
>                   |             |             |   1 - No error
>                   |             |             |   2 - Single-bit ECC
>                   |             |             |   3 - Multi-bit ECC
>                   |             |             |   4 - Single-symbol ChipKill ECC
>                   |             |             |   5 - Multi-symbol ChipKill ECC
>                   |             |             |   6 - Master abort
>                   |             |             |   7 - Target abort
>                   |             |             |   8 - Parity Error
>                   |             |             |   9 - Watchdog timeout
>                   |             |             |  10 - Invalid address
>                   |             |             |  11 - Mirror Broken
>                   |             |             |  12 - Memory Sparing
>                   |             |             |  13 - Scrub corrected error
>                   |             |             |  14 - Scrub uncorrected error
>                   |             |             |  15 - Physical Memory Map-out event
>                   |             |             | All other values reserved.
> ------------------+-------------+-------------+---------------------------------------------
> ........          | ........... | ........... | ...........
> ------------------+-------------+-------------+---------------------------------------------
There's a value specified for "we don't know what the error type is",
which is "0 - Unknown". I think we should use that rather than claiming
that we have a particular type of error when we don't actually know that.
I agree with James that we don't want to report a particular type of
error to the guest anyway -- the VM is a virtual environment, and
the exact reason why the hardware/firmware/host kernel have decided
that a bit of RAM isn't usable any more doesn't matter to the guest.
We just want to report "this RAM has gone away, sorry" to it.
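
To make the suggestion concrete, here is a rough sketch of filling in
the "Memory Error Type" byte with 0 (Unknown) instead of 3 (Multi-bit
ECC). This is not the code from the patch: the enum and function names
are hypothetical, and the 80-byte section length is an assumption. Only
the byte offset (72), the field length (1) and the type values come
from the table quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A few "Memory Error Type" values from the quoted N.2.5 table.
 * The enum names are hypothetical, chosen for this sketch only. */
enum mem_error_type {
    MEM_ERR_UNKNOWN        = 0,
    MEM_ERR_NO_ERROR       = 1,
    MEM_ERR_SINGLE_BIT_ECC = 2,
    MEM_ERR_MULTI_BIT_ECC  = 3,
};

/* Byte offset of the 1-byte "Memory Error Type" field, per the table. */
#define MEM_ERROR_TYPE_OFFSET 72

/* Assumption for this sketch: total length of the section buffer. */
#define MEM_SECTION_LEN 80

/* Zero the section and record the error type as "Unknown", since
 * nothing tells us what kind of error the host actually saw. */
static void mem_section_init_unknown(uint8_t *section, size_t len)
{
    assert(section != NULL);
    assert(len >= MEM_SECTION_LEN);

    memset(section, 0, len);
    section[MEM_ERROR_TYPE_OFFSET] = MEM_ERR_UNKNOWN;
}

/* Read the error type byte back out of a section buffer. */
static uint8_t mem_section_error_type(const uint8_t *section)
{
    assert(section != NULL);
    return section[MEM_ERROR_TYPE_OFFSET];
}
```

A caller would pass a buffer of at least MEM_SECTION_LEN bytes to
mem_section_init_unknown() and then fill in whatever other fields it
really does know, such as the physical address.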
thanks
-- PMM