[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify
From: |
Laszlo Ersek |
Subject: |
Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM |
Date: |
Wed, 26 Aug 2020 15:32:41 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 Thunderbird/52.9.1 |
On 08/26/20 15:32, Laszlo Ersek wrote:
> On 08/26/20 11:24, Laszlo Ersek wrote:
>> Hi Igor,
>>
>> On 08/25/20 19:25, Laszlo Ersek wrote:
>>
>>> So I would suggest fetching the CNEW array element back into "uid"
>>> first, then using "uid" for both the NOTIFY call, and the (currently
>>> missing) restoration of CSEL. Then we can write 1 to CINS.
>>>
>>> Expressed as a patch on top of yours:
>>>
>>>> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
>>>> index 4864c3b39694..2bea6144fd5e 100644
>>>> --- a/hw/acpi/cpu.c
>>>> +++ b/hw/acpi/cpu.c
>>>> @@ -564,8 +564,11 @@ void build_cpus_aml(Aml *table, MachineState
>>>> *machine, CPUHotplugFeatures opts,
>>>> aml_append(method, aml_store(zero, cpu_idx));
>>>> while_ctx = aml_while(aml_lless(cpu_idx, num_added_cpus));
>>>> {
>>>> - aml_append(while_ctx, aml_call2(CPU_NOTIFY_METHOD,
>>>> - aml_derefof(aml_index(new_cpus, cpu_idx)), dev_chk));
>>>> + aml_append(while_ctx,
>>>> + aml_store(aml_derefof(aml_index(new_cpus, cpu_idx)),
>>>> uid));
>>>> + aml_append(while_ctx,
>>>> + aml_call2(CPU_NOTIFY_METHOD, uid, dev_chk));
>>>> + aml_append(while_ctx, aml_store(uid, cpu_selector));
>>>> aml_append(while_ctx, aml_store(one, ins_evt));
>>>> aml_append(while_ctx, aml_increment(cpu_idx));
>>>> }
>>>
>>> This effects the following change, in the decompiled method:
>>>
>>>> @@ -37,15 +37,17 @@
>>>> If ((Local_NumAddedCpus != Zero))
>>>> {
>>>> \_SB.PCI0.SMI0.SMIC = 0x04
>>>> }
>>>>
>>>> Local_CpuIdx = Zero
>>>> While ((Local_CpuIdx < Local_NumAddedCpus))
>>>> {
>>>> - CTFY (DerefOf (CNEW [Local_CpuIdx]), One)
>>>> + Local_Uid = DerefOf (CNEW [Local_CpuIdx])
>>>> + CTFY (Local_Uid, One)
>>>> + \_SB.PCI0.PRES.CSEL = Local_Uid
>>>> \_SB.PCI0.PRES.CINS = One
>>>> Local_CpuIdx++
>>>> }
>>>>
>>>> Release (\_SB.PCI0.PRES.CPLK)
>>>> }
>>>
>>> With this change, the
>>>
>>> virsh setvcpus DOMAIN 8 --live
>>>
>>> command works for me. The topology in my test domain has CPU#0 and
>>> CPU#2 cold-plugged, so the command adds 6 VCPUs. Viewed from the
>>> firmware side, the 6 "device_add" commands, issued in close succession
>>> by libvirtd, coalesce into 4 "batches". (And of course the firmware
>>> sees the 4 batches back-to-back.)
>>
>> unfortunately, with more testing, I have run into two more races:
>>
>> (1) When a "device_add" occurs after the ACPI loop collects the CPUS
>> from the register block, but before the SMI.
>>
>> Here, the "stray CPU" is processed fine by the firmware. However,
>> the CTFY loop in ACPI does not know about the CPU, so it doesn't
>> clear the pending insert event for it. And when the firmware is
>> entered with an SMI for the *next* time, the firmware sees the same
>> CPU *again* as pending, and tries to relocate it again. Bad things
>> happen.
>>
>> (2) When a "device_add" occurs after the SMI, but before the firmware
>> collects the pending CPUs from the register block.
>>
>> Here, the firmware collects the "stray CPU". However, the "broadcast
>> SMI", with which we entered the firmware, did *not* cover the stray
>> CPU -- the CPU_FOREACH() loop in ich9_apm_ctrl_changed() could not
>> make the SMI pending for the new CPU, because at that time, the CPU
>> had not been added yet. As a result, when the firmware sends an
>> INIT-SIPI-SIPI to the new CPU, expecting it to boot right into SMM,
>> the new CPU instead boots straight into the post-RSM (normal mode)
>> "pen", skipping its initial SMI handler. Meaning that the CPU halts
>> nicely, but its SMBASE is never relocated, and the SMRAM message
>> exchange with the BSP falls apart.
>>
>> Possible mitigations I can think of:
>>
>> For problem (1):
>>
>> (1a) Change the firmware so it notices that it has relocated the
>> "stray" CPU before -- such CPUs should be simply skipped in the
>> firmware. The next time the CTFY loop runs in ACPI, it will clear
>> the pending event.
>>
>> (1b) Alternatively, stop consuming the hotplug register block in the
>> firmware altogether, and work out general message passing, from
>> ACPI to firmware. See the outline here:
>>
>>
>> http://mid.mail-archive.com/cf887d74-f65d-602a-9629-3d25cef93a69@redhat.com
>>
>> For problem (2):
>>
>> (2a) Change the firmware so that it sends a directed SMI as well to
>> each CPU, just before sending an INIT-SIPI-SIPI. This should be
>> idempotent -- if the broadcast SMI *has* covered the the CPU,
>> then sending a directed SMI should make no difference.
>>
>> (2b) Alternatively, change the "device_add" command in QEMU so that,
>> if "CPU hotplug with SMI" has been negotiated, the new CPU is
>> added with the SMI made pending for it at once. (That is, no
>> hot-plugged CPU would exist with the directed SMI *not* pending
>> for it.)
>>
>> (2c) Alternatively, approach (1b) would fix problem (2) as well -- the
>> firmware would only relocate such CPUs that ACPI collected before
>> injecting the SMI. So all those CPUs would have the SMI pending.
>>
>>
>> I can experiment with (1a) and (2a),
>
> My patches for (1a) and (1b) seem to work -- my workstation has 10
aargh, I meant my patches for (1a) and (2a).
sorry
Laszlo
> PCPUs, and I'm using a guest with 20 possible VCPUs and 2 cold-plugged
> VCPUs on it, for testing. The patches survive the hot-plugging of 18
> VCPUs in one go, or two batches like 9+9. I can see the fixes being
> exercised.
>
> Unless you strongly disagree (or I find issues in further testing), I
> propose that I post these fixes to edk2-devel (they should still be in
> scope for the upcoming release), and that we stick with your current
> patch series for QEMU (v3 -- upcoming, or maybe already posted).
>
> Thanks!
> Laszlo
>
- [PATCH v2 4/7] acpi: add aml_land() and aml_break() primitives, (continued)
- [PATCH v2 4/7] acpi: add aml_land() and aml_break() primitives, Igor Mammedov, 2020/08/18
- [PATCH v2 5/7] tests: acpi: mark to be changed tables in bios-tables-test-allowed-diff, Igor Mammedov, 2020/08/18
- [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM, Igor Mammedov, 2020/08/18
- Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM, Laszlo Ersek, 2020/08/25
- Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM, Igor Mammedov, 2020/08/26
- Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM, Laszlo Ersek, 2020/08/26
- Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM, Igor Mammedov, 2020/08/26
- Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM, Laszlo Ersek, 2020/08/26
- Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM,
Laszlo Ersek <=
- Re: [PATCH v2 6/7] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM, Igor Mammedov, 2020/08/26
[PATCH v2 7/7] tests: acpi: update acpi blobs with new AML, Igor Mammedov, 2020/08/18
Re: [PATCH v2 0/7] x86: fix cpu hotplug with secure boot, no-reply, 2020/08/18
Re: [PATCH v2 0/7] x86: fix cpu hotplug with secure boot, no-reply, 2020/08/18