grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [grub PATCH] efinet: disable MNP background polling


From: Seth Goldberg
Subject: Re: [grub PATCH] efinet: disable MNP background polling
Date: Tue, 13 Oct 2015 17:43:36 -0700


> On Oct 13, 2015, at 3:21 PM, Laszlo Ersek <address@hidden> wrote:
> 
>> On 10/13/15 23:49, Daniel Kiper wrote:
>> Hi Laszlo,
>> 
>> First of all, thanks a lot for very nice explanation!
>> 
>>> On Thu, Oct 01, 2015 at 01:50:31PM +0200, Laszlo Ersek wrote:
>>> CC'ing Mark Salter, and edk2-devel, also updating the subject slightly
>>> for better context.
>>> 
>>>> On 10/01/15 11:26, HATAYAMA Daisuke wrote:
>>>> Currently, as of the commit f348aee7b33dd85e7da62b497a96a7319a0bf9dd,
>>>> SNP is exclusively reopened to avoid slowdown or complete failure to
>>>> load files due to races between MNP background polling and grub's
>>>> receiving and transmitting packets.
>>>> 
>>>> This exclusive SNP reopen affects some EFI applications/drivers that
>>>> use SNP and causes PXE boot on such systems to fail. Hardware filter
>>>> issue fixed by the commit 49426e9fd2e562c73a4f1206f32eff9e424a1a73 is
>>>> one example.
>>> 
>>> I think the above two commit references are in inverse order. That is,
>>> commit 49426e9f is the one responsible for the (needed) exclusive open,
>>> and commit f348aee7 fixes up the former by reconfiguring the receive
>>> filters.
>>> 
>>> In other words, the first two paragraphs here seem accurate, just please
>>> reorder the commit hashes.
>>> 
>>>> The difficulty here is that effects of the exclusive SNP reopen
>>>> differs from system to system depending their UEFI firmware
>>>> implementations. One system works well with the commit
>>>> f348aee7b33dd85e7da62b497a96a7319a0bf9dd only, another system still
>>>> needs the commit 49426e9fd2e562c73a4f1206f32eff9e424a1a73, and there
>>>> could be other systems where PXE boot still fails.
>>> 
>>> Here too I think the commit hashes should be switched around.
>>> 
>>>> 
>>>> Actually, PXE boot fails on Fujitsu PRIMEQUEST 2000 series now.
>>>> 
>>>> Thus, it seems to me that the exclusive SNP reopen makes grub
>>>> maintenance difficult by affecting a variety of systems differently.
>>> 
>>> (Admittedly, this is going to be a bit of speculation:) I think the
>>> difference in behavior is not due to SNP implementations, but due to
>>> differences in higher level protocol implementations -- i.e., the
>>> behavior depends on what those drivers do to the underlying SNP that are
>>> *closed*.
>>> 
>>> According to the spec of OpenProtocol(), in case of a successful
>>> exclusive open, the other BY_DRIVER reference is kicked off with
>>> DisconnectController(), which in turn calls the other driver's
>>> EFI_DRIVER_BINDING_PROTOCOL.Stop() function.
>>> 
>>> So, the question is what that *other* (higher level) driver does in its
>>> Stop() function, when it unbinds the handle with the underlying SNP
>>> interface. Does it mess with SNP's receive filters? And so on.
>>> 
>>>> 
>>>> Instead, the idea of this patch is to disable MNP background polling
>>>> temporarily while grub uses SNP. Then, the race issue must be removed.
>>> 
>>> This is not the right approach in my opinion.
>>> 
>>> The original problem is that GRUB's direct access to SNP races with
>>> *several* MNP instances to the same SNP. The MNP instances are correctly
>>> arbitrated between each other (see more on this later), but the SNP
>>> accesses from GRUB are not.
>>> 
>>> SNP is meant as an exclusive-access interface to the NIC, so those
>>> parallel clients won't work.
>>> 
>>> One way to solve that is to kick out those other SNP accessors, by way
>>> of exclusive open. This is correct from a UEFI driver model perspective,
>>> but -- unless GRUB does full reconfiguration on the SNP level -- brittle
>>> in practice, as you've experienced; apparently different implementations
>>> of higher level protocols leave the SNP in different states when they go
>>> away. (It's perfectly possible that some of those driver binding Stop()
>>> functions have bugs.)
>>> 
>>> However, the other way (because there is another way, yes) is different
>>> from interfering with existing MNP instances (note: plural). MNP (and a
>>> bunch of other networking related protocols) don't work like your usual
>>> UEFI device drivers; they are child protocols (= protocol interfaces on
>>> child handles) that are created *as needed* with the help of Service
>>> Binding protocol instances.
>>> 
>>> Please read the following sections of the UEFI spec (v2.5) carefully:
>>> 
>>> - 2.5.8 EFI Services Binding
>>> - 10.6 EFI Service Binding Protocol
>>> - 24.1 EFI Managed Network Protocol
>>>  EFI_MANAGED_NETWORK_SERVICE_BINDING_PROTOCOL
>> 
>> Taking into account above and sentences in UEFI spec (v2.5) like "Once the
>> remote image is successfully loaded, it may utilize the 
>> EFI_PXE_BASE_CODE_PROTOCOL
>> interfaces, or even the EFI_SIMPLE_NETWORK_PROTOCOL interfaces, to continue
>> the remote process." I have a feeling that UEFI spec is very imprecise how
>> to use SNP. I have not seen any single word saying that there are any 
>> constraints
>> on SNP usage (am I missing something?). I heard about that in our internal
>> discussions but I was not convinced that it is true until I have seen your
>> email with all details. So, now I think that UEFI spec should have special
>> paragraph saying how to play with SNP. What do you think about that?
> 
> I think that a request for clarification should be filed with the USWG,
> or at least raised on edk2-devel :)
> 
>> By the way, I saw that other boot loaders like PXELINUX or iPXE use SNP.
>> Do you suggest that all of them should be rewritten to avoid issues with
>> this protocol?
> 
> I don't know these projects overly well :), but I wouldn't suggest such
> an indiscriminate rewrite for each. The main incentive for thinking
> about MNP at all is that the exclusive open of SNP, which *should* work,
> hangs various proprietary UEFI implementations.
> 
> When Mark Salter pinged me on IRC quite a few months back about the
> original grub problem, the first thing that I recommended (although
> clearly not immediately -- I had to research the spec) was the exclusive
> open. Mark went ahead and implemented that, and grub started working on
> the UEFI platform we had. (Then he contributed the patch(es) to upstream.)
> 
> The exclusive open is a valid (and simple) thing to do, according to the
> UEFI spec. So the alternative to the elaborate MNP(SB) patch is to debug
> and fix the UEFI implementations on which the current (otherwise
> spec-conformant) grub code -- the exclusive open -- exposes a hang /
> crash etc.
> 
> The people who are likely most interested in either fixing those
> problematic UEFI implementations, *or* in converting the various open
> source projects to MNP(SB), should be those that want to run the latter
> on the former. :) Hatayama-san is certainly at liberty to solve the
> issue either way.

  The folks I've talked to at Intel say that the bug is that the snp exclusive 
open should fail if any other consumer has already opened snp, and that is the 
case with existing implementations (that is, mnp has already opened snp by the 
time grub is loaded and running).  If that bug were fixed, the grub snp driver 
would always fail which would make it pretty obvious that mnp is the way to go. 

  --S



> 
> Thanks!
> Laszlo
> 
>> 
>> Daniel
> 
> 
> _______________________________________________
> Grub-devel mailing list
> address@hidden
> https://lists.gnu.org/mailman/listinfo/grub-devel



reply via email to

[Prev in Thread] Current Thread [Next in Thread]