grub-devel
Re: About firmware facilities


From: Vladimir 'phcoder' Serbinenko
Subject: Re: About firmware facilities
Date: Sun, 20 Sep 2009 12:38:13 +0200
User-agent: Mozilla-Thunderbird 2.0.0.22 (X11/20090701)

Brendan Trotter wrote:
> Hi,
>
> On Sat, Sep 19, 2009 at 11:36 PM, Vladimir 'phcoder' Serbinenko
> <address@hidden> wrote:
>   
>> Brendan Trotter wrote:
>>     
>>>> No. Usable means only that firmware isn't destroyed. Any device may
>>>> be in a different state
>>>>         
>>> Any device (that the firmware assumes is in a certain state) may be
>>> left in a different state (that the firmware no longer knows about)?
>>>
>>> For a very simple example, imagine if the BIOS leaves the floppy motor
>>> on, and GRUB's own floppy driver uses the floppy and then turns the
>>> motor off. Then the OS uses the firmware to read from floppy, but the
>>> firmware thinks the floppy motor is still on and attempts to read from
>>> the floppy without turning the floppy motor on.
>>>
>>> If GRUB has its own device drivers, and GRUB doesn't restore devices
>>> to the state that the firmware expects the devices to be in, then the
>>> firmware is unusable.
>>>
>>>       
>> Most OSes should use their own drivers to access devices.
>>     
>
> Most device manufacturers should provide full documentation so that
> programmers can write drivers to access the devices; and manufacturers
> should provide hardware samples (and documentation) to these
> programmers so that the device driver is ready before the device is
> made available to the general public. Unfortunately the real world
> just doesn't work the same as "should".
>
> The multi-boot specification says the firmware is left in a usable
> state. If GRUB doesn't leave the firmware in a usable state, then
> either GRUB is wrong or the multi-boot specification is wrong. You
> can't have it both ways.
>
> Of course I'm forgetting that GRUB also supports chainloading (e.g.
> the chainloaded OS tries to use the firmware to load more of its
> data, and the firmware fails because GRUB left a device in an
> unexpected state) - non-compliance with the multi-boot specification
> isn't the only issue.
>
>   
Well, we do have some sanitization code, such as grub_stop_floppy. If you
see somewhere it's not enough, submit a patch.
>>> If an OS can't use the firmware, then the OS must rely on GRUB for
>>> everything instead, including strange "OS specific" things that nobody
>>> has seen any other OS do before.
>>>
>>>       
>> If nobody uses a particular feature in firmware then you shouldn't use
>> it either. Unused firmware features are often buggy. Moreover firmware
>> on x86 is useful only for bootstrap and once bootstrap is completed you
>> should forget it exists except some firmware-specific tasks as setting
>> boot device.
>>     
>
> So, can I rely on GRUB to (for e.g.) setup video in a way that is
> suitable for my code, or do I need to use the firmware myself (and
> hope that GRUB hasn't left a device in an unexpected state)?
>
>   
You can rely on GRUB once I implement it.
>>>>>>> Due to limitations in the original multi-boot specification my code
>>>>>>> switches back to real mode and uses the BIOS to do memory detection,
>>>>>>> do video mode detection, switch video modes and gather other
>>>>>>> information.
>>>>>>>
>>>>>>>               
>>>>>> Have you actually read the multiboot specification? Booter passes info
>>>>>> about memory and video mode in mbi (video for multiboot isn't
>>>>>> implemented yet). If you need firmware for basic bootup you're clearly
>>>>>> doing something wrong and are firmware-dependent. Of course it's your
>>>>>> freedom to make suboptimal software.
>>>>>>
>>>>>>             
>>>>> I've read the multi-boot specification. I've also read the code in
>>>>> GRUB-legacy that does memory detection, and I'm unwilling to allow my
>>>>> code to rely on it for "quality control" reasons. Without going into
>>>>> details, GRUB-legacy tends to do a "minimal" job and then expects the
>>>>> user to fix the problem if/when it goes wrong (but even then it only
>>>>> offers an "uppermem" command without providing a way for the user to
>>>>> specify a complete system memory map).
>>>>>
>>>>>
>>>>>           
>>>> What is "minimal job" and "quality control"? We use standard
>>>> E820+(optionally)badram command. I've seen no OS do any more than
>>>> this.
>>>>
>>>>         
>>> My code tries "int 0x15, eax=0xE820" expecting 24 bytes per area (ACPI
>>> 3.0); then it tries "int 0x15, eax=0xE820" expecting 20 bytes per
>>> area. If "int 0x15, eax=0xE820" isn't supported by the BIOS then you
>>> can assume it's an old computer (and old computers are painful).
>>>
>>> It tries "int 0x15, ax=0xE801", then "int 0x15, ah=0xC7", then "int
>>> 0x15, ah=0x8A", then "int 0x15, ax=0xDA88", then "int 0x15, ah=0x88",
>>> then CMOS locations 0x70 and 0x71.
>>>       
>> Read the code. GRUB falls back to older methods if newer ones aren't
>> available.
>>     
>
> I read the code (for both GRUB 1.96 and GRUB 0.97) and wrote down
> exactly which BIOS functions GRUB does use in my last post. You didn't
> read the code (and didn't read what I wrote either), and now you're
> telling me to read the code?
>
>   
GRUB2 already uses a fallback chain. I'm ok with having more fallbacks,
but only as long as they are sane.
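
A fallback chain of this kind can be modelled as an ordered list of
probes where the first one that returns a result wins. The sketch below
is in Python for brevity and is not GRUB's code: the probe functions are
hypothetical stand-ins for the BIOS calls named in this thread, and the
values they return are made up for illustration.

```python
from typing import Callable, List, Optional, Tuple

# A region is (base, length, type); type 1 means usable RAM.
Region = Tuple[int, int, int]
Probe = Callable[[], Optional[List[Region]]]


def probe_e820() -> Optional[List[Region]]:
    """Hypothetical stand-in for int 0x15, eax=0xE820; None = unsupported."""
    return None  # pretend this is an old BIOS without E820


def probe_e801() -> Optional[List[Region]]:
    """Hypothetical stand-in for int 0x15, ax=0xE801."""
    # E801 reports KiB between 1 MiB and 16 MiB, and 64 KiB blocks
    # above 16 MiB. The two numbers below are invented sample values.
    kib_low, blocks_high = 15 * 1024, 256
    return [
        (0x00100000, kib_low * 1024, 1),
        (0x01000000, blocks_high * 64 * 1024, 1),
    ]


def probe_88() -> Optional[List[Region]]:
    """Hypothetical stand-in for int 0x15, ah=0x88 (extended memory, KiB)."""
    return [(0x00100000, 15 * 1024 * 1024, 1)]


def detect_memory(probes: List[Tuple[str, Probe]]) -> Tuple[str, List[Region]]:
    """Try each probe in order and return the first non-empty result."""
    for name, probe in probes:
        regions = probe()
        if regions:
            return name, regions
    raise RuntimeError("no memory detection method succeeded")


CHAIN = [("E820", probe_e820), ("E801", probe_e801), ("88", probe_88)]
method, regions = detect_memory(CHAIN)
```

Adding another fallback then means appending one more entry to the
chain, which keeps the question of whether a given method is "sane"
separate from the dispatch logic.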
> Yes. For old computers there's plenty of unused area in the physical
> address space and PCI devices are almost always assigned areas
> starting from higher addresses and working down (which leaves a
> massive "unused" area between the end of RAM and the first memory
> mapped PCI device). For older systems (with ISA) the only usable space
> is the "ISA hole" just below 0x01000000 (I already explained that my
> code does probe this area).
>
>   
You can't rely on the BIOS mapping memory the same way on all old
computers.


> and function correctly on almost all old BIOSs (where GRUB
> currently fails) and function correctly
Maintaining support for very old computers is a pain. I'm ok with adding
the code as a fallback (except probably manual probing), but not with
claiming we support old computers.
>  on almost all buggy BIOSs
> (where GRUB currently fails).
>
>   
If you mean sorting and overlapping regions, GRUB resolves them in
mmap.mod, but the multiboot spec doesn't require the mmap to be sorted.
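
For readers following along, sorting and resolving overlaps can be
sketched as a sweep over region boundaries. This is an illustration in
Python, not what mmap.mod actually does; in particular the rule that a
non-usable type wins an overlap (and that the highest type value is
kept) is an assumption made for the example.

```python
from typing import List, Tuple

Region = Tuple[int, int, int]  # (base, length, type)

USABLE = 1  # multiboot: type 1 is available RAM; anything else is not


def sanitize(regions: List[Region]) -> List[Region]:
    """Return a sorted, non-overlapping map; non-usable types win overlaps."""
    # Every region start or end is a boundary; between two adjacent
    # boundaries the set of covering regions is constant.
    points = sorted({p for base, length, _ in regions if length > 0
                     for p in (base, base + length)})
    out: List[Region] = []
    for start, end in zip(points, points[1:]):
        covering = [t for base, length, t in regions
                    if base <= start and end <= base + length]
        if not covering:
            continue  # a hole in the map: leave it undescribed
        # Assumed policy: if any covering region is not usable, the
        # slice is not usable, and the highest type value is kept.
        kind = max(covering) if any(t != USABLE for t in covering) else USABLE
        if out and out[-1][2] == kind and out[-1][0] + out[-1][1] == start:
            # Merge with the previous slice when contiguous and same type.
            out[-1] = (out[-1][0], out[-1][1] + (end - start), kind)
        else:
            out.append((start, end - start, kind))
    return out


raw = [
    (0x100000, 0x300000, 1),   # usable, listed out of order
    (0x000000, 0x09FC00, 1),   # usable low memory
    (0x200000, 0x100000, 2),   # reserved, overlapping the first entry
]
clean = sanitize(raw)
```

Here the reserved range splits the usable range around it, so the
result has four entries in ascending order with no overlaps.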
>>> For all BIOS functions used my code avoids all known BIOS bugs (and
>>> there's plenty of them). This includes "sanitizing" the data returned
>>> from "int 0x15, eax=0xE820" - sorting the list and handling any
>>> overlapping areas.
>>>
>>>       
>> Have you ever looked at the mmap folder?
>>     
>
> There is no mmap folder in the source code for GRUB 1.96 or GRUB 0.97.
>
>   
SVN?
>>> I've never needed to provide a way for the end-user to override my
>>> memory detection.
>>>
>>>       
>> Neither did we. But test your manual probing on a 4 GiB system - it's
>> likely to detect all MMIO addresses as RAM.
>>     
>
> It's extremely unlikely that a computer with 4 GiB of RAM will fail on
> all of the previous BIOS functions. If all of the previous BIOS
> functions do fail, then you're probably running on an 80486 or older
> computer which is unlikely to have more than 128 MiB of RAM.
>
>   
It's unlikely to have more than 64 MiB either. The problem is that
motherboards may also have a bug similar to Gate A20 that causes the same
memory to be detected twice, at 2 or more different addresses. That's
another reason against manual probing. I'm ok with accepting more BIOS
function fallbacks, but manual probing is bad. GRUB2 should be able to
work even without all memory detected, and if you want to add manual
probing you can supply an external module which uses mmap.mod.
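
The aliasing problem can be shown with a toy model. The class below is
hypothetical and only simulates an address decoder that ignores the
upper address bits; it is not a description of any real chipset. A naive
write/read-back probe then counts the same RAM more than once.

```python
MiB = 1 << 20


class AliasedRam:
    """Toy model: 'size' bytes of RAM that repeat through the address space."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.cells = {}

    def write(self, addr: int, value: int) -> None:
        # Upper address bits are ignored, so distinct addresses collide.
        self.cells[addr % self.size] = value

    def read(self, addr: int) -> int:
        return self.cells.get(addr % self.size, 0)


def naive_probe(ram: AliasedRam, step: int, limit: int) -> int:
    """Count bytes that pass a write/read-back test at every 'step'."""
    found = 0
    for addr in range(0, limit, step):
        ram.write(addr, 0x55AA)
        if ram.read(addr) == 0x55AA:
            found += step
    return found


detected = naive_probe(AliasedRam(64 * MiB), MiB, 128 * MiB)
```

With 64 MiB of real RAM and a 128 MiB probe window, `detected` comes out
as 128 MiB, twice the true amount.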
> If you think GRUB's memory detection never needs to be overridden then
> you're obviously not testing it on anything that predates "int 0x15,
> eax = 0xE820".
>
>   
>>>>> For memory detection, ACPI 3.0 allows the BIOS (" INT 15H, E820H") to
>>>>> return extended attributes - mostly only a volatile/non-volatile flag.
>>>>> This isn't in GRUB's information. ACPI 3.0 also allows the BIOS to
>>>>> return areas of the type "AddressRangeUnusable" (e.g. faulty RAM).
>>>>>
>>>>>           
>>>> This is mostly unnecessary. Basically you need only to know if you can
>>>> use a memory range or not. The only useful additional code would be
>>>> ReclaimMemory
>>>>         
>>> To handle standby states correctly the OS may need to know which areas
>>> are volatile and which areas aren't (which can include knowing the
>>> difference between volatile system areas and non-volatile system
>>> areas). Some OSs also want to know if there's any faulty RAM present
>>> in the system or not (and additional information about any area
>>> reserved for "hot-plug" RAM, and NUMA ranges, but that information
>>> comes from ACPI tables not BIOS functions so the OS can get this
>>> information without GRUB).
>>>
>>>       
>> I'm ok with defining additional types in multiboot1. But an OS that
>> treats the multiboot type as the BIOS type is buggy.
>>     
>
> I agree - most OSs that use multi-boot are buggy because they don't
> comply with the specification (except mine, because I ignore GRUB's
> memory map and get the information directly from the BIOS). The
> question is which new types would be needed to ensure that
> non-compliance isn't "deemed necessary" by OS developers in the
> future, and how GRUB will know if the kernel image will understand the
> new types correctly or if the kernel is an older (buggy) kernel that
> (incorrectly) assumes ACPI types.
>
>   
We don't support buggy kernels. But I'm ok with either not using values
2-5 at all or using them only the same way the BIOS does. Please go ahead
and write a patch for the multiboot texinfo and post it in a separate
thread.
>>>> GRUB can't do this right now because it doesn't receive badram info
>>>> soon enough. And even if it does most kernels expect first MiB to be
>>>> usable.
>>>>
>>>>         
>>> You're right - all kernels that are designed to use "multi-boot
>>> specification version 1" expect to be loaded at 0x00100000 and that
>>> RAM below the EBDA is usable. I'm not sure what kernels designed for
>>> "multi-boot specification version 2" expect...
>>>
>>>
>>>       
>> Read what I said
>>     
>
> In which way do existing kernels (that were designed for
> GRUB-legacy) include future kernels (that might be designed to support
> features that have been/could be introduced with GRUB2)?
>
>   
multiboot1 is still supported.
>>>> Such list is a blatant encapsulation breach. If you want such test,
>>>> add it to bootloader, not OS.
>>>>         
>>> When the OS is running and detects a RAM fault, you want the OS to run
>>> a copy of GRUB (maybe inside an emulator or something) so the OS can
>>> tell GRUB about the RAM fault, and GRUB can tell the OS if the RAM
>>> fault might cause problems if the computer is rebooted (and so the OS
>>> can send an email to the network administrators or something *before*
>>> the computer is rebooted)?
>>>
>>> You can't assume that the OS that is running is the same OS that
>>> installed GRUB; or that the OS that is running has access to wherever
>>> GRUB is installed; or that GRUB will be able to detect any faulty RAM
>>> during boot.
>>>
>>>
>>>       
>> You're using circular logic. You assume that the booter is using faulty
>> RAM but supplying RAM it used correctly.
>>     
>
> No. All RAM is OK when the boot loader boots the OS, but then
> (possibly several months of running "24 hours per day" later) a RAM
> fault occurs and the OS detects it, and the OS tells the user (or
> administrator) that rebooting might cause problems due to the RAM
> fault (because the OS knows that the faulty RAM will be used by the
> boot loader).
>
>   
It doesn't. Which RAM is used depends on things like the version or even
the command order. I prefer to have full badram support in the booter.
>>>> This information is available with a simple loop over mbi. I would
>>>> rather avoid overcomplicating the standard because it increases the
>>>> chance of having "half-compliant" OSes and "half-compliant" booters.
>>>>
>>>>         
>>> I'd rather have "fully compliant" OSes that are easier to write than
>>> "fully compliant" OSes that are a pain in the neck to write because
>>> you have to parse everything in the multi-boot information structure
>>> before you can write to any RAM (except for your own ".bss").
>>>
>>> If I can't rely on the firmware (like I currently do) then I have to
>>> rely on GRUB, and have to copy everything from the multi-boot
>>> information structure into my ".bss". So, how much extra space do I
>>> need to allow in my ".bss"? What is the maximum number of drive
>>> structures? What's the maximum number of memory map entries? What's
>>> the maximum length of the command line? The multi-boot specification
>>> doesn't say.
>>>
>>>
>>>       
>> First do a small parse and count how much memory the structures you
>> need will take.
>>     
>
> For something like a "live" CD; during boot you want the OS to do a
> small parse and determine how much memory these structures will take,
> then write to a read-only boot CD to change the kernel's ".bss" size?
> And you want the OS to do this before GRUB has allocated memory for
> the kernel or executed any of the OS's code?
>
>   
I meant parsing the MBI near the entry point and taking it into account
for later memory allocations.
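
Such a first pass over the Multiboot 1 memory map can be sketched as
follows. A real kernel would do this in C or assembly on the buffer at
mmap_addr; Python is used here only to show the walk. Each entry starts
with a 32-bit size that does not count the size field itself, followed
by a 64-bit base, a 64-bit length and a 32-bit type. The `pack_entry`
helper and the sample values are hypothetical and exist only to build
test data.

```python
import struct
from typing import Iterator, Tuple


def iter_mmap(buf: bytes) -> Iterator[Tuple[int, int, int]]:
    """Yield (base, length, type) for each Multiboot 1 memory map entry."""
    offset = 0
    while offset + 4 <= len(buf):
        (size,) = struct.unpack_from("<I", buf, offset)
        if size < 20 or offset + 4 + size > len(buf):
            break  # malformed or truncated entry: stop rather than guess
        base, length, kind = struct.unpack_from("<QQI", buf, offset + 4)
        yield base, length, kind
        # 'size' does not count the size field itself, hence the + 4.
        offset += 4 + size


def mmap_entry_count(buf: bytes) -> int:
    """First pass: count entries so the kernel can size its own copy."""
    return sum(1 for _ in iter_mmap(buf))


def pack_entry(base: int, length: int, kind: int, size: int = 20) -> bytes:
    """Hypothetical helper: build one entry, zero-padded to 'size' bytes."""
    body = struct.pack("<QQI", base, length, kind)
    return struct.pack("<I", size) + body + bytes(size - len(body))


# Two sample entries; the second uses a larger entry size on purpose.
sample = (pack_entry(0x0, 0x9FC00, 1)
          + pack_entry(0x100000, 0x7F00000, 1, size=24))
count = mmap_entry_count(sample)
```

Stepping by the size field rather than by a fixed 24 bytes is what lets
the same loop cope with boot loaders that emit larger entries.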
>>>>> The "Boot device" field in the multi-boot information structure should
>>>>> be improved to handle a variety of conditions; including if the disk
>>>>> was an emulated disk (e.g. "El Torito" emulating a hard drive). The
>>>>> BIOS drive number isn't much use (especially if the firmware is
>>>>> coreboot, UEFI, OpenFirmware, etc), and should be replaced with
>>>>> something that identifies the corresponding drive structure (this
>>>>> includes USB).
>>>>>
>>>>>           
>>>> Boot device shouldn't be used at all. It was a mistake. Booter has no
>>>> good way to know how OS will see the device. You should pass this
>>>> parameter via commandline either as device name or UUID. You have
>>>> scripting to automate this
>>>>
>>>>         
>>> A user who's using GRUB to boot Ubuntu decides to install my code in
>>> another partition, then modify GRUB's configuration (in Ubuntu) so
>>> that GRUB will also boot my code. Now my code needs to rely on the
>>> user to not stuff up GRUB's configuration for my code?
>>>
>>>       
>> You can simply tell him to add "source" line
>>     
>
> How does my code know if the user has set this "source" line correctly?
>
> If someone is making a bootable CD that's meant to be used on 100
> different computers, how should they set the "source" line?
>
>   
source includes your script. You can make sure your script is correct.
>>>>> The OS image also needs a different magic number to indicate that the
>>>>> OS image is designed for future versions of the multi-boot
>>>>> specification (rather than the old/current version). If the OS image
>>>>> uses the new magic number, then the OS image must also include a
>>>>> "version of the multi-boot specification that this image complies
>>>>> with" field. If the OS image indicates that it's intended for a newer
>>>>> version of the multi-boot specification than the boot loader complies
>>>>> with, then the boot loader refuses to boot and displays a "this boot
>>>>> loader needs to be upgraded" error. If the OS image has the old magic
>>>>> number, and if the firmware is "PC BIOS" then the boot loader should
>>>>> boot the old OS image. If the OS image has the old magic number, and
>>>>> if the firmware is not "PC BIOS" then the boot loader refuses to boot
>>>>> and displays a "this OS requires a PC BIOS" error message.
>>>>>
>>>>>           
>>>> Already implemented through feature fields
>>>>         
>>> Is there a private version of the multi-boot specification that I'm
>>> not aware of yet; or does GRUB fail to comply with the current
>>> multi-boot specification?
>>>
>>>       
>> No. Read specification again
>>     
>
> The current multi-boot specification (version 0.6.95)? The one at:
> http://www.gnu.org/software/grub/manual/multiboot/multiboot.html?
>
> For the "flags" field in the kernel's Multiboot header, this version
> of the specification says "Naturally, all as-yet-undefined bits in the
> `flags' word must be set to zero in OS images." and there are no flags
> defined that allow a kernel to indicate that it supports other types
> of firmware (or any other feature/s introduced by GRUB2).
>   
Every payload is supposed to work on any firmware.
> Can I assume that one or more of these "as-yet-undefined" bits have
> been defined in some private version of the multi-boot specification
> that I'm not aware of?
>   
No
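
For reference, the header check a Multiboot 1 loader performs on an OS
image can be sketched like this. The magic value, the checksum rule, the
8192-byte search window and the "bits 0-15 are requirements" rule are
taken from the specification; the KNOWN_REQUIRED mask is an assumption
made for this example, and the code is a Python illustration rather than
GRUB's loader.

```python
import struct
from typing import Optional

MAGIC = 0x1BADB002
SEARCH_LIMIT = 8192          # header must lie within the first 8 KiB
KNOWN_REQUIRED = 0x0007      # assumption: bits 0-2 are the defined "must" bits


def find_header(image: bytes) -> Optional[int]:
    """Return the offset of a valid Multiboot 1 header, or None."""
    limit = min(len(image), SEARCH_LIMIT) - 12
    for off in range(0, limit + 1, 4):      # header is 4-byte aligned
        magic, flags, checksum = struct.unpack_from("<III", image, off)
        # magic + flags + checksum must be zero as a 32-bit unsigned sum.
        if magic == MAGIC and (magic + flags + checksum) & 0xFFFFFFFF == 0:
            return off
    return None


def loader_must_refuse(flags: int) -> bool:
    """True if a requirement bit (0-15) is set that the loader does not know."""
    return bool(flags & 0xFFFF & ~KNOWN_REQUIRED)


# Build a tiny fake image with a header at offset 8.
flags = 0x00000003
checksum = (-(MAGIC + flags)) & 0xFFFFFFFF
image = bytes(8) + struct.pack("<III", MAGIC, flags, checksum) + bytes(16)
offset = find_header(image)
```

Under this rule an image that sets an undefined bit in the low 16 bits
is rejected by a compliant loader, while undefined bits in the upper
half are treated as optional and ignored.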

You speak about many things in a single thread. In such cases many
things tend to be forgotten. Can you make precise patches for the things
I'm ok with (I'm not the maintainer, so I'm not the one who decides) and
put other ideas in separate threads (one idea per thread)?






