grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [grub PATCH] efinet: disable MNP background polling


From: Laszlo Ersek
Subject: Re: [grub PATCH] efinet: disable MNP background polling
Date: Wed, 14 Oct 2015 00:21:29 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 10/13/15 23:49, Daniel Kiper wrote:
> Hi Laszlo,
> 
> First of all, thanks a lot for very nice explanation!
> 
> On Thu, Oct 01, 2015 at 01:50:31PM +0200, Laszlo Ersek wrote:
>> CC'ing Mark Salter, and edk2-devel, also updating the subject slightly
>> for better context.
>>
>> On 10/01/15 11:26, HATAYAMA Daisuke wrote:
>>> Currently, as of the commit f348aee7b33dd85e7da62b497a96a7319a0bf9dd,
>>> SNP is exclusively reopened to avoid slowdown or complete failure to
>>> load files due to races between MNP background polling and grub's
>>> receiving and transmitting packets.
>>>
>>> This exclusive SNP reopen affects some EFI applications/drivers that
>>> use SNP and causes PXE boot on such systems to fail. Hardware filter
>>> issue fixed by the commit 49426e9fd2e562c73a4f1206f32eff9e424a1a73 is
>>> one example.
>>
>> I think the above two commit references are in inverse order. That is,
>> commit 49426e9f is the one responsible for the (needed) exclusive open,
>> and commit f348aee7 fixes up the former by reconfiguring the receive
>> filters.
>>
>> In other words, the first two paragraphs here seem accurate, just please
>> reorder the commit hashes.
>>
>>> The difficulty here is that effects of the exclusive SNP reopen
>>> differs from system to system depending their UEFI firmware
>>> implementations. One system works well with the commit
>>> f348aee7b33dd85e7da62b497a96a7319a0bf9dd only, another system still
>>> needs the commit 49426e9fd2e562c73a4f1206f32eff9e424a1a73, and there
>>> could be other systems where PXE boot still fails.
>>
>> Here too I think the commit hashes should be switched around.
>>
>>>
>>> Actually, PXE boot fails on Fujitsu PRIMEQUEST 2000 series now.
>>>
>>> Thus, it seems to me that the exclusive SNP reopen makes grub
>>> maintenance difficult by affecting a variety of systems differently.
>>
>> (Admittedly, this is going to be a bit of speculation:) I think the
>> difference in behavior is not due to SNP implementations, but due to
>> differences in higher level protocol implementations -- i.e., the
>> behavior depends on what those drivers do to the underlying SNP that are
>> *closed*.
>>
>> According to the spec of OpenProtocol(), in case of a successful
>> exclusive open, the other BY_DRIVER reference is kicked off with
>> DisconnectController(), which in turn calls the other driver's
>> EFI_DRIVER_BINDING_PROTOCOL.Stop() function.
>>
>> So, the question is what that *other* (higher level) driver does in its
>> Stop() function, when it unbinds the handle with the underlying SNP
>> interface. Does it mess with SNP's receive filters? And so on.
>>
>>>
>>> Instead, the idea of this patch is to disable MNP background polling
>>> temporarily while grub uses SNP. Then, the race issue must be removed.
>>
>> This is not the right approach in my opinion.
>>
>> The original problem is that GRUB's direct access to SNP races with
>> *several* MNP instances to the same SNP. The MNP instances are correctly
>> arbitrated between each other (see more on this later), but the SNP
>> accesses from GRUB are not.
>>
>> SNP is meant as an exclusive-access interface to the NIC, so those
>> parallel clients won't work.
>>
>> One way to solve that is to kick out those other SNP accessors, by way
>> of exclusive open. This is correct from a UEFI driver model perspective,
>> but -- unless GRUB does full reconfiguration on the SNP level -- brittle
>> in practice, as you've experienced; apparently different implementations
>> of higher level protocols leave the SNP in different states when they go
>> away. (It's perfectly possible that some of those driver binding Stop()
>> functions have bugs.)
>>
>> However, the other way (because there is another way, yes) is different
>> from interfering with existing MNP instances (note: plural). MNP (and a
>> bunch of other networking related protocols) don't work like your usual
>> UEFI device drivers; they are child protocols (= protocol interfaces on
>> child handles) that are created *as needed* with the help of Service
>> Binding protocol instances.
>>
>> Please read the following sections of the UEFI spec (v2.5) carefully:
>>
>> - 2.5.8 EFI Services Binding
>> - 10.6 EFI Service Binding Protocol
>> - 24.1 EFI Managed Network Protocol
>>   EFI_MANAGED_NETWORK_SERVICE_BINDING_PROTOCOL
> 
> Taking into account above and sentences in UEFI spec (v2.5) like "Once the
> remote image is successfully loaded, it may utilize the 
> EFI_PXE_BASE_CODE_PROTOCOL
> interfaces, or even the EFI_SIMPLE_NETWORK_PROTOCOL interfaces, to continue
> the remote process." I have a feeling that UEFI spec is very imprecise how
> to use SNP. I have not seen any single word saying that there are any 
> constraints
> on SNP usage (am I missing something?). I heard about that in our internal
> discussions but I was not convinced that it is true until I have seen your
> email with all details. So, now I think that UEFI spec should have special
> paragraph saying how to play with SNP. What do you think about that?

I think that a request for clarification should be filed with the USWG,
or at least raised on edk2-devel :)

> By the way, I saw that other boot loaders like PXELINUX or iPXE use SNP.
> Do you suggest that all of them should be rewritten to avoid issues with
> this protocol?

I don't know these projects overly well :), but I wouldn't suggest such
an indiscriminate rewrite for each. The main incentive for thinking
about MNP at all is that the exclusive open of SNP, which *should* work,
hangs various proprietary UEFI implementations.

When Mark Salter pinged me on IRC quite a few months back about the
original grub problem, the first thing that I recommended (although
clearly not immediately -- I had to research the spec) was the exclusive
open. Mark went ahead and implemented that, and grub started working on
the UEFI platform we had. (Then he contributed the patch(es) to upstream.)

The exclusive open is a valid (and simple) thing to do, according to the
UEFI spec. So the alternative to the elaborate MNP(SB) patch is to debug
and fix the UEFI implementations on which the current (otherwise
spec-conformant) grub code -- the exclusive open -- exposes a hang /
crash etc.

The people who are likely most interested in either fixing those
problematic UEFI implementations, *or* in converting the various open
source projects to MNP(SB), should be those that want to run the latter
on the former. :) Hatayama-san is certainly at liberty to solve the
issue either way.

Thanks!
Laszlo

> 
> Daniel
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]