My UEFI Explorations
v10Mar2013_2302, HanishKVC
>> 1 << As an End user - Trying to understand the Samsung EFI Firmware Bug
>> 2 << Trying to use a signed loader with Secure boot
>> 3 << FOR LATER - 08Mar2013
>> 1 << As an End user - Trying to understand the Samsung EFI Firmware Bug
1.1 https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
1.1.1 hanishkvc (hanishkvc) wrote on 2013-02-26: #159
Hi Steve/Colin/CluedInPeople,
I have a query related to the fix. If one wants to use Secure Boot and UEFI to boot their system, shouldn't the right fix for this bug be the noefi boot param related fix added to the Linux kernel around 15th Feb, i.e.
commit 1de63d60cd5b0d33a812efa455d5933bf1564a51 upstream.
Because if I understand correctly, the issue has less to do with the samsung-laptop module per se and more to do with the buggy Samsung EFI logic, which craps itself out if one writes to the EFI storage space sometimes (or is it always - not 100% sure currently). And in turn if the Linux kernel crashes or catches an MCE for some reason or the other (like previously triggered by the samsung-laptop module) and is in EFI mode, then because it writes the crash log to EFI storage space, this serious bug in the Samsung EFI code gets triggered and potentially takes the full laptop down with it.
And if I get things correctly, then passing noefi to the Linux kernel as a boot param will disable the use of EFI runtime services by the kernel and its modules. And that is the 100% sure way of ensuring that under Linux one can't trigger this bug in the normal sense. (Whether it is still 100% safe from a security perspective I am not sure, i.e. whether the Samsung EFI logic has any loopholes which allow one to call EFI services even after one has already relinquished them - I am talking logically here, because I haven't looked into EFI in detail, so I am making some/many assumptions.)
So if one wants to dual boot a system with Win8 already installed in Secure Boot UEFI mode and Linux, THEN one should use a Linux distro which uses a kernel later than Feb 15 with the above mentioned noefi bug fix included, and in turn one should boot such a distro with the noefi boot param to ensure that the EFI bug in these Samsung laptops can't be triggered from Linux during that boot.
Is my above understanding correct, and in turn does the new LTS live image use this new kernel? If not, shouldn't this be the right solution to this EFI bug in Samsung if one doesn't want to disable secureboot/uefi?
NOTE: I am not sure the Linux kernel handles the transition from EFI to no-EFI runtime mode gracefully if noefi is passed as an argument and the system is already in UEFI boot mode. But I am assuming for now that the kernel handles this situation properly, and that wherever this is required to be handled in a specific manner, it does so. This is my assumption currently because I haven't looked into the EFI specs at any level yet.
1.1.2 hanishkvc (hanishkvc) wrote on 2013-02-26: #160
Hi All,
Just to be sure, the relevant bug fix I am talking about is this
commit 266c43c175a51002b04c18a453a39708d1775ced
Author: Satoru Takeuchi <email address hidden>
Date: Thu Feb 14 09:12:52 2013 +0900
efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter
1.1.3 hanishkvc (hanishkvc) wrote on 2013-02-27: #162
Hi Steve,
NOTE: After some initial glance across the different things involved here.
If I am not wrong after reading the noefi related bugfix changelog as well as looking at
a) samsung-laptop.c (The original patch for efi bug - return failure on EFI_BOOT being set)
b) efi.c (The changes done on Feb14 - clear EFI_RUNTIME_SERVICES instead of EFI_BOOT if noefi param is passed)
What I understand is that the noefi bugfix of Feb14 was to ensure that the samsung-laptop module behaves properly even if the noefi parameter is passed. That is, there was a bug in the way the noefi parameter was being handled, as it was wrongly clearing EFI_BOOT instead of EFI_RUNTIME_SERVICES, and this in turn would have messed with the samsung-laptop module workaround. So by fixing this bug on Feb14, the noefi boot parameter no longer conflicts with the samsung-laptop module.
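A toy model of the flag handling described above may make the conflict easier to see. This is only a sketch of my understanding, not kernel code: the bit positions, the function names and the `fixed` switch are made up for illustration.

```python
# Toy model of the EFI facility flags discussed above.
# Bit positions and function names are illustrative, not taken from the kernel.
EFI_BOOT = 1 << 0              # the kernel was booted through EFI
EFI_RUNTIME_SERVICES = 1 << 1  # the kernel may call EFI runtime services


def boot_flags(booted_via_efi, noefi, fixed=True):
    """Return the flags left set after early boot.

    fixed=False models the old noefi handling (clears EFI_BOOT),
    fixed=True models the handling after the Feb 14 fix.
    """
    flags = EFI_BOOT if booted_via_efi else 0
    if noefi and not fixed:
        flags &= ~EFI_BOOT  # old behaviour: pretend we never booted via EFI
    if (flags & EFI_BOOT) and not (noefi and fixed):
        flags |= EFI_RUNTIME_SERVICES  # runtime services get set up
    return flags


def samsung_laptop_may_load(flags):
    """The workaround refuses to load the module on an EFI booted system."""
    return not (flags & EFI_BOOT)


def efi_variables_usable(flags):
    """Variable access (and so SetVariable) needs the runtime services."""
    return bool(flags & EFI_RUNTIME_SERVICES)


for fixed in (False, True):
    flags = boot_flags(booted_via_efi=True, noefi=True, fixed=fixed)
    print("fixed" if fixed else "old",
          "module may load:", samsung_laptop_may_load(flags),
          "variables usable:", efi_variables_usable(flags))
```

In this model the old handling leaves no flag set, so the samsung-laptop check no longer sees an EFI boot and lets the module load, while the fixed handling keeps EFI_BOOT set and only leaves the runtime services off.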
So with the latest kernels after Feb14 which include this noefi bugfix, passing noefi boot param should ensure that
a) the samsung-laptop module doesn't poke into legacy smc memory space and create unwanted problems, as well as
b) the kernel doesn't use the EFI runtime services, and in turn the buggy SetVariable call in the Samsung firmware, whatever the case (including kernel crash).
However, if I get the UEFI spec on initial glance, the runtime services persist even after calling the ExitBootServices function (my earlier assumption otherwise was wrong, as I hadn't looked at the UEFI spec before). Also, setup.c in the kernel says that even boot services are potentially called after calling ExitBootServices, which technically shouldn't be the case per the UEFI spec. I have no idea why they have put that comment in there; however I do see the potential need with SetVirtualAddressMap etc. wrt EFI ... unless I am missing something here.
So as you rightfully doubted, if I read the EFI related code correctly, then passing noefi WILL NOT ensure that a root user can't get access to runtime services in future, because it doesn't actively try to disable the EFI tables etc.; rather it just doesn't set them up in the new/current Linux environment. So a root user could load a kernel module to replicate the functionality of the efi.c virtual mapping so that he can set things up as required, unless something else goes and overwrites the original EFI systab/???. If it is possible to do that (I don't see why not, unless UEFI runs in a special privileged mode or so, which I don't think it does, but I haven't gone sufficiently deep yet to say one way or the other), then maybe we should write logic to go and overwrite it with nulls or some such thing, enabled through a special boot parameter, let us say fullsystem_noefi, so that even a root user can't go and recover the EFI runtime in the current boot of the system.
NOTE: In parallel, Samsung should look at fixing the multiple bugs in their firmware. Do you have any input on when they may fix this issue?
1.1.4 hanishkvc (hanishkvc) wrote on 2013-02-27: #163
Hi Steve/CluedInDevs,
NOTE: I haven't really looked at ACPI and UEFI till date. Only today I have done some initial glance thro UEFI and done some code browsing. So may be I am off by 100 miles, correct me if that is the case.
One thing which I haven't fully cross checked/figured yet is how EFI variable storage is linked to crash dumps. Because I don't see a direct usage of efi variable storage in the kernel code wrt crash dumping.
However there is the ACPI ERST based pstore mechanism. Is it that ACPI ERST in turn uses EFI variable storage on a UEFI system? Logically it seems very much possible, and a proper solution from the firmware perspective. However in that case I am not 100% sure that passing noefi will ensure that the ACPI ERST mechanism won't trigger EFI variable storage or other runtime services. Is it that, because by passing noefi we are stopping the setting up of a proper virtual address mapping for the EFI systable related entries, even if the ACPI ERST mechanism were to trigger a variable storage write, it won't succeed? i.e. is it that we will be depending on indirect luck to stop the variable storage write from triggering during a kernel crash?
If it is only by indirect luck that we are stopping the variable storage write during a kernel crash dump when noefi is used, then ideally we should have additional logic added to stop the registering of the ACPI ERST pstore when noefi is passed (or when a new boot param like fullsystem_noefi is passed).
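The registration policy suggested here could look roughly like the sketch below. It is only a model of the proposal: the function name, the backend labels and the `fullsystem_noefi` switch are hypothetical (the parameter does not exist in the kernel), and whether an ERST write really reaches EFI variable storage on a given firmware is still an open question in these notes.

```python
def crash_log_backends(efi_runtime_enabled, erst_present, fullsystem_noefi=False):
    """Return the crash-log backends that would be registered in this model.

    efi_runtime_enabled: False when noefi was passed on the command line.
    erst_present:        the firmware exposes an ACPI ERST table.
    fullsystem_noefi:    hypothetical parameter that also blocks ERST.
    """
    backends = []
    if efi_runtime_enabled and not fullsystem_noefi:
        backends.append("efivars")  # writes crash logs through SetVariable
    if erst_present and not fullsystem_noefi:
        backends.append("erst")     # firmware may store the record anywhere
    return backends


# noefi alone removes only the direct EFI variable path in this model
print(crash_log_backends(efi_runtime_enabled=False, erst_present=True))
# the proposed parameter removes the indirect path as well
print(crash_log_backends(efi_runtime_enabled=False, erst_present=True,
                         fullsystem_noefi=True))
```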
1.1.5 hanishkvc (hanishkvc) wrote on 2013-02-27: #164
Hi Steve/CluedInDevs,
Based on a bit more grepping, I think I have part of the answer; at the same time another part still seems to be a potential issue !!! However please do correct my understanding, even if I am fully wrong.
a) I notice that efivars.c has a pstore logic and using noefi will definitely stop this direct efi variable storage access and possible corruption in Samsung UEFI firmware.
HOWEVER
b) On reading thro the ACPI 4.0a spec, as I had assumed in my last post, it allows the platform provider to use the uefi runtime variable service to store APEI ERST error logging on a UEFI based system.
Now the APEI ERST pstore is still registered even if noefi is passed. And as this can in turn trigger writing to EFI variable storage during a kernel crash or any APEI related writes (is it used anywhere else? I haven't checked yet), CAN this create any problem, either directly (i.e. it writing to EFI variable storage and triggering the Samsung EFI bug) or indirectly (i.e. as the EFI virtual address mapping is not set up when noefi is passed, can it lead to random code running in the system in such a circumstance)?
1.1.6 (Good Summary) hanishkvc (hanishkvc) wrote on 2013-03-02: #165
Hi Steve/All,
Summary of what I have understood after going thro related things over the last few days.
NOTE: This is my understanding; anyone using this info should cross check things on their own before experimenting on their Samsung laptop, unless others (who understand this fully) can also confirm my understanding to be correct.
NOTE: I am looking at the possibility of installing Ubuntu with UEFI enabled and ideally even Secure boot enabled. So my thoughts are based around that. And also to try and ensure that possibility of the laptop bricking is eliminated as much as possible.
There are potentially 3 known bugs in Samsung UEFI Firmware which can trip up Linux/Ubuntu installation, they are
BUG_A) (As noted by Matthew Garrett) High probability of Corruption of NVRAM / Firmware on using UEFI Runtime Service (RT) SetVariable functionality
In turn This will be triggered in Linux if any kernel crash occurs, as the pstore logic of efivars.c will try to write the crash dump to NVRAM.
POSSIBLE_SOLUTIONFOR_A) Now passing the noefi kernel boot param will avoid this direct corruption path in the latest Linux kernels (which have the noefi related bugfix, i.e. which clear the EFI_RUNTIME_SERVICES flag rather than wrongly clearing the EFI_BOOT flag).
However at this juncture I am not sure whether the pstore logic of the Linux kernel's ACPI ERST error logging functionality will trigger the same issue or not, as this logic is not disabled on passing the noefi boot param (however, as the UEFI runtime service SetVirtualAddressMap is not called when noefi is passed, maybe it won't create a problem, but I am not sure, as I haven't dug sufficiently deep into UEFI yet). So maybe the safest bet is to disable the pstore logic of ACPI ERST (APEI) in the Linux kernel for now. (This suggestion is partly due to my lack of full knowledge of this currently, as I haven't dug enough into the UEFI+ACPI interaction and their runtime environment [I don't mean the UEFI runtime services here] and its implications wrt the OS runtime.)
BUG_B) Existence of the Samsung BIOS's old SMM equivalent handshaking or its vestiges in the newer UEFI firmware, and it leading to an MCE.
Now the samsung-laptop module in the Linux kernel is dependent on this old mechanism for achieving its functionality, and this in turn would lead to the BUG_A mentioned above in the Samsung UEFI firmware, as it would trigger an MCE and a kernel crash dump.
POSSIBLE_SOLUTIONFOR_B) AGAIN the latest linux kernels have WORKED AROUND this by disabling the loading of the samsung-laptop module, if the uefi booting is used. NOTE that passing noefi doesn't interfere with proper handling of this work around, as the linux kernel still knows that it was booted using UEFI even if noefi is passed.
BUG_C) The UEFI RT GetNextVariableName service doesn't handle its input parameters properly.
As noted by Jakob Heinemann during his exploration of the Samsung firmware bug, passing a variable name size greater than 128 potentially returns an error from GetNextVariableName (when in reality it shouldn't). However the Linux efivars.c in the kernel is currently written to pass 1024 as the variable name size, and in turn there is no easy/direct way for an application using efivars to know about any possible error encountered by efivars (what if EFI_DEVICE_ERROR occurs or a buggy firmware craps out like in this Samsung case). So that would mean that on these buggy Samsung UEFI firmwares, a Linux system will not be able to read/get any variables. It will appear as if no variables are there. IN TURN efibootmgr would potentially wrongly overwrite the first boot entry used by the UEFI firmware boot logic, which may belong to a UEFI firmware module or some other OS or ...
NOTE: I haven't gone thro the efibootmgr code currently. If efibootmgr is not reading the kernel error logs to cross check for any warning messages related to get_next_variable, then ideally it should be updated to check for them. Also if efibootmgr (or programs which use it) is currently not treating an empty efivars listing (and in turn an empty boot entry listing) as a possible corner case to be handled specially by cross checking with the user, rather than blindly writing to the 1st boot entry, then it should be fixed to handle this corner case in a proper user controlled manner.
POSSIBLE_SOLUTIONFOR_C) If the noefi kernel parameter is passed, then as efivars won't be loaded, efibootmgr shouldn't trigger this corner case. However one will then have to handle the installation and suitable configuration of the UEFI boot entries for Linux booting on one's own. This will involve at a minimum copying the Grub2 EFI bootloader to the EFI partition, and in case Secure Boot is required, then the signed shim by Matthew Garrett or the new signed loader from the Linux Foundation will also need to be copied to the EFI partition. Secure Boot will also require the equivalent/proper configuration of Grub2/elilo/Linux kernel (signed/hash) to let the secure boot chain continue.
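The guard asked for in the NOTE under BUG_C could be sketched as below. This is not efibootmgr's code (which I have not read); the function and exception names are made up, and the variable names are shown without the GUID suffix they carry in the efivars listing.

```python
import re

# Boot entries are named Boot0000, Boot0001, ... (four hex digits)
BOOT_VAR = re.compile(r"^Boot([0-9A-Fa-f]{4})$")


class SuspectEnumeration(Exception):
    """Raised when the variable listing cannot be trusted."""


def next_free_boot_entry(variable_names, enumeration_failed=False):
    """Pick an unused Boot#### name, refusing to guess on an empty listing."""
    if enumeration_failed or not variable_names:
        # A working UEFI system always has some variables, so an empty
        # listing more likely means GetNextVariableName failed.
        raise SuspectEnumeration("no EFI variables visible; ask the user first")
    used = set()
    for name in variable_names:
        match = BOOT_VAR.match(name)
        if match:
            used.add(int(match.group(1), 16))
    number = 0
    while number in used:
        number += 1
    return "Boot%04X" % number


print(next_free_boot_entry(["Boot0000", "Boot0001", "BootOrder", "Timeout"]))
try:
    next_free_boot_entry([])
except SuspectEnumeration as err:
    print("refused:", err)
```

The point is only that an empty listing becomes an explicit error for the caller to handle, instead of being read as "entry 0 is free".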
1.1.7 hanishkvc (hanishkvc) wrote on 2013-03-02: #166
And just for completeness of info in my last post, use of the rEFInd boot manager could be a useful thing for people manually installing/configuring the boot process in a UEFI setup. However I don't think that by default it allows the secure boot chain to be passed down to the kernel, unless one uses/installs one's own PK (I think called custom boot mode at the UEFI firmware level) and suitable signing, OR hash configuration in the case of the Linux Foundation's signed loader.
.
>> 2 << Trying to use a signed loader with Secure boot
I decided on trying the Linux Foundation's PreLoader rather than Matthew Garrett's Shim loader, because the PreLoader is supposed to register/hook its image validation logic into UEFI and allow subsequent modules in the secure boot chain to use its validation service without requiring custom patches (Shim, as it stands now in Feb 2013, requires that modules chain loaded by it use a custom protocol implemented by it and not the standard UEFI protocol).
Also the LF PreLoader depends on a simple hash based mechanism to decide whether a new image should be chain loaded during secure boot or not. And the bundled HashTool.efi is automatically called to allow the hash of loader.efi (which should be the actual boot manager - gummiboot in my case) and of subsequent modules (like, say, an EFI Shell, Linux kernel, custom EFI app, etc.) to be calculated and registered with the PreLoader validation mechanism. Also one can manually call HashTool.efi to register additional EFI apps as required.
Also the bundled KeyTool.efi allows one to make a backup of a given system's PK, KEK, db and dbx databases.
Also I decided to go with gummiboot as the boot manager, because it is open source and has simple minimal logic, so it is easier to customize or debug. And also because it provides the option of showing a boot menu with multiple entries, along with default execution after a timeout, as well as the option to break the boot and force the boot menu display if a key is pressed.
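The idea of a hash based allow list can be sketched as follows. This is a simplification and not the PreLoader implementation: the class and method names are made up, the list lives only in memory, and the hash is taken over the whole file, whereas real EFI image hashing skips some fields of the PE header.

```python
import hashlib


class HashAllowList:
    """In-memory stand-in for the list of enrolled image hashes."""

    def __init__(self):
        self._enrolled = set()

    def enroll(self, image_bytes):
        """What an enrolment tool does: record the hash of a trusted image."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        self._enrolled.add(digest)
        return digest

    def is_allowed(self, image_bytes):
        """What the loader does before chain loading an image."""
        return hashlib.sha256(image_bytes).hexdigest() in self._enrolled


allow_list = HashAllowList()
loader = b"pretend this is loader.efi"
allow_list.enroll(loader)
print(allow_list.is_allowed(loader))            # the enrolled image passes
print(allow_list.is_allowed(loader + b"\x00"))  # any modification fails
```

One consequence of this design is that every rebuild of the boot manager or kernel changes its hash and has to be enrolled again.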
2.1 http://blog.hansenpartnership.com/linux-foundation-secure-boot-system-released/
The git source for the Linux Foundation PreLoader is part of efitools and is
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/efitools.git
2.1.1 HanishKVC - 05 March 2013 at 20:54
Hi jejb,
Tried running the usb image given above on a Samsung 300e5c with Windows 8 and Secure boot enabled in two ways
a) Running the USB image from USB (by selecting USB – HDD from win8 advanced restart)
b) By updating the ESP on the internal hard disk, where I created an EFI\LinFnd directory which contains Preloader.efi, HashTool.efi and loader.efi (Gummiboot), and in turn changed the boot sequence using a combination of bcfg from the EFI shell and bcdedit (as Win8 was resetting the EFI boot order independent of the BIOS/EFI firmware boot sequence).
However in both cases, I found that hashtool.efi doesn’t seem to be successfully registering the hashes, or the low level security hooking is not working, because I keep getting the failed image verification error for loader.efi and shellx64.efi (but both do run after reporting the failed verification). However the Windows 8 bootmgfw.efi, which I have added as an entry into gummiboot, doesn’t give this failed image verification.
NOTE: Gummiboot seems to be behaving a bit strangely wrt how it tries to pick the default boot entry (by ignoring its own loader.conf file wrt timeout as well as default entry, and ignoring keypresses to allow menu selection – I have to create a lot of dummy boot entries for gummiboot so that it fails to find the file to run and then in turn shows its boot menu). Either way I have downloaded the source for gummiboot and will have to debug this separately in the next few days, as I find time. I can experiment on this without requiring to get it signed by Microsoft, thanks to your wonderful preloader.efi.
If you want me to do some experiments wrt preloader or give you more info on any aspects, do let me know either thro the forum here or thro my email id.
2.2 Getting Gummiboot to work properly - 07 Mar 2013
2.2.1 Starting out
The Gummiboot bundled with the Linux Foundation's signed PreLoader USB image wasn’t working properly on my system wrt picking of the default entry to boot, as well as wrt timeout or keypress to bring up the gummiboot boot menu.
So I downloaded the latest gnu-efi and gummiboot sources from following locations to experiment on my own.
git source repos for
gnu-efi = git://git.code.sf.net/p/gnu-efi/code
gummiboot = git://anongit.freedesktop.org/gummiboot
s1) I got gnu-efi to compile and install without any problem.
s2) But on trying to get gummiboot to compile (running configure) I found that
it was expecting gnu-efi in /usr directory rather than /usr/local directory - So I recompiled and installed gnu-efi to /usr directory by updating the Make.default file of gnu-efi. I did this So that I don’t have to go and modify configure.in, Makefile.am, etc files for projects depending on gnu-efi.
s3) Next on compilation of gummiboot, I found that it was expecting docbook-xsl related packages to be installed for the generation of its man page. (NOTE: 09Mar2013: Latest Gummiboot code takes care of this, based on the feedback provided)
s4) Next it compiled but failed during linking. Well, initially it appeared to be partly failing but succeeding on rerunning make. However on further debugging I found that it was failing a check after the linking stage, which was trying to cross check whether there were any undefined symbols. This check is needed because here the EFI application is prepared by using objcopy over a shared library, so implicit checking for unresolved symbols won’t run (and also at run time there is no full fledged loader with cross checking logic). (NOTE: 09Mar2013: Latest Gummiboot code takes care of this, based on the feedback provided).
The two symbols which it couldn’t resolve were related to stack protection and uefi_call_wrapper
s4.1) stack protection - As the latest gcc compilers seem to enable stack protection logic by default, gnu-efi had got compiled with stack frame protection in libefi.a. However this requires additional code to validate the stack frame protection, and gummiboot doesn’t have any code to cross check the same. Also gnu-efi is used for making EFI apps, which don’t have full fledged compile, load and runtime environments supporting it. So I decided to recompile gnu-efi with -fno-stack-protector by updating the cflag entry in Make.default of gnu-efi.
s4.2) uefi_call_wrapper - This turned out to be a bigger experiment path than what I had initially hoped for.
On looking into gnu-efi source, found that how this gets materialised will depend on whether GNU_EFI_USE_MS_ABI is defined or not. And as the sample apps in gnu-efi were using this and in turn working properly in both qemu as well as my Samsung UEFI based laptop, I decided to enable this define in gummiboot also by updating its Makefile.am.
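The post-link check from s4) boils down to listing the symbols still marked undefined in the shared object before objcopy turns it into an EFI app. A minimal sketch of that idea, working on nm-style text, is given below. The function name and the sample listing are made up for illustration, and __stack_chk_fail is my assumption for the stack protection symbol; this is not the check from the gummiboot build.

```python
def undefined_symbols(nm_output):
    """Return the names marked U (undefined) in nm-style output."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries have no address column, only "U name"
        if len(fields) == 2 and fields[0] == "U":
            names.append(fields[1])
    return names


# Sample listing, written by hand to resemble nm output
sample = """\
0000000000001000 T efi_main
                 U __stack_chk_fail
                 U uefi_call_wrapper
"""

missing = undefined_symbols(sample)
if missing:
    print("unresolved symbols:", ", ".join(missing))
```

A build would fail the link step whenever this list is non-empty, since nothing will resolve those symbols once the firmware runs the image.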
Now the above two changes allowed me to compile gummiboot successfully, but the resultant gummiboot EFI app failed to run (i.e. it hung the machine) in both qemu (with TianoCore OVMF firmware) and on the Samsung laptop.
Also even though there were some prints in the gummiboot source, I couldn’t see any on screen, and I wasn’t sure how the console out of UEFI was getting routed/impacted on the target along with its Text/graphics/text_graphics based boot menu. So I decided to build a simplified cut down version of gummiboot to try and figure out where it might be failing.
Started with just a simple Hello print efi app at first followed by identification of the Device path of the currently loaded/running efi image, and in turn the file path for the loader/gummiboot image. These versions of the minimal gummiboot went properly. So on adding the conf parsing related logic into the application followed by a simple dumping of the config list in gummiboot, found that conf files related to loader\entries folder weren’t getting displayed.
So decided to add a print to the logic which goes thro the files in the loader\entries directory, and found that it was finding the conf files.
So I decided to add a print to the logic which actually interprets the conf file and adds the corresponding data to the conf list. And kaboom, the resultant EFI app hung. If I remove the 2 prints I had added, the EFI app runs; if I add the prints back, the app hangs.
So I realised that the issue seems to be either some memory leak in gummiboot or gnu-efi, or some subtle interplay related to how the runtime environment is expected by the EFI firmware and how it is set up by gcc+binutils and/or the ELF to EFI conversion and resultant patch work.
So I tried looking at the options passed to compiler and linker by gnu-efi test apps and tried to replicate the same for gummiboot to be on safe side. But still no luck.
2.2.2 Next Phase
Then I thought of cross checking with the gummiboot developers to see if they had faced similar problems or had any feedback. Kay Sievers came back telling me that currently there are issues with GNU_EFI_USE_MS_ABI and gnu-efi.
s5) So just to cross check that this was the only problem, I decided to take an initial look into this USE_MS_ABI thing.
Noticed that it was basically enabling the ms_abi attribute for the EFIAPI function marker/attribute define. And in turn if ms_abi is defined, then the EFI related service function is called directly, otherwise the call goes thro uefi_call_wrapper. The idea being that the compiler can cross check that the EFI function is being called with proper arguments in one case (i.e. when ms_abi is defined and uefi_call_wrapper is not used), while in the other case, with uefi_call_wrapper actually defined, this cross check doesn't occur and an indirection is added.
So I disabled GNU_EFI_USE_MS_ABI in Make.default of gnu-efi and compiled it without the ms_abi support. In turn I compiled the original gummiboot code and found that the resulting EFI app runs without any hang on my Samsung laptop.
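Why a wrapper or an ABI attribute is needed at all can be seen from where each calling convention puts integer and pointer arguments on x86_64. The small table below only illustrates that difference; the function name is made up, and the stack indices ignore details such as the 32 byte shadow space that the Microsoft convention reserves.

```python
# Registers used for the first integer/pointer arguments on x86_64
SYSV_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]  # GNU/Linux default
MS_REGS = ["rcx", "rdx", "r8", "r9"]                  # what UEFI firmware expects


def argument_locations(count, abi):
    """Return where each of `count` integer arguments is passed."""
    if abi == "sysv":
        regs = SYSV_REGS
    elif abi == "ms":
        regs = MS_REGS
    else:
        raise ValueError("abi must be 'sysv' or 'ms'")
    locations = []
    for index in range(count):
        if index < len(regs):
            locations.append(regs[index])
        else:
            # Remaining arguments go on the stack, in order
            locations.append("stack[%d]" % (index - len(regs)))
    return locations


for abi in ("sysv", "ms"):
    print(abi, argument_locations(5, abi))
```

A function compiled with the default GNU/Linux convention that calls into the firmware therefore needs either a shim that moves the arguments around, or a compiler that is told to use the Microsoft convention for that particular call.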
2.2.3 Some additional updates to gummiboot
As the base gummiboot was working, I thought of fixing up some of the issues I had encountered with it till now and/or which interest me.
As I want gummiboot to also run on my Samsung laptop, which has the buggy EFI firmware that can brick when writing to the EFI variable store in some cases, I have made the efi_var_set_raw of gummiboot into a nop which returns EFI_SUCCESS without modifying the EFI var store. This should be fine in general, except that one can’t override the loader.conf setting at run time. Also one can’t reboot into Firmware settings from within gummiboot. (NOTE:09Mar2013: I have asked Kay about adding this option, but not sure if he will be interested in folding this into gummiboot for a new simple_gummiboot version along with the full fledged gummiboot).
I updated the configure.ac and Makefile.am of gummiboot to check for gnu-efi in both /usr and /usr/local and use it from the location where it is found rather than hardcoding to /usr (So that one doesn’t have to force gnu-efi to /usr or update Makefile and configure in gummiboot) (NOTE:09Mar2013: I have informed Kay about this simplification for end user, most probably he will be updating the logic in gummiboot related to this).
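The search order is simple enough to show in a few lines. The actual change lives in configure.ac and Makefile.am; the sketch below only mirrors the logic, and both the function name and the use of include/efi/efi.h as the marker file are assumptions made for illustration.

```python
import os


def find_gnu_efi(prefixes=("/usr", "/usr/local"),
                 marker=os.path.join("include", "efi", "efi.h")):
    """Return the first prefix that contains the gnu-efi header, else None."""
    for prefix in prefixes:
        if os.path.isfile(os.path.join(prefix, marker)):
            return prefix
    return None


prefix = find_gnu_efi()
if prefix is None:
    print("gnu-efi not found in /usr or /usr/local")
else:
    print("using gnu-efi from", prefix)
```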
Also I have created a simpler menu_run logic, which uses a simple Print based menu, where one directly presses a key from ‘a’ to ‘z’ as required to load a given EFI app/OS. Also this doesn’t clear the screen, so one can see some debug messages if required. (NOTE:10Mar2013: I will cross check with Kay on this, but not sure he will want this in gummiboot).
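The key to entry mapping of such a menu is roughly as below. This is a sketch of the idea only, not the menu_run code; the function names and the sample entry titles are made up.

```python
import string


def menu_lines(titles):
    """Return one printable line per entry, labelled a), b), c), ..."""
    if len(titles) > len(string.ascii_lowercase):
        raise ValueError("only 26 entries fit on keys 'a' to 'z'")
    return ["%s) %s" % (key, title)
            for key, title in zip(string.ascii_lowercase, titles)]


def select_entry(titles, key):
    """Map a pressed key to its entry, or None if the key is not in the menu."""
    if len(key) != 1:
        return None
    index = ord(key) - ord("a")
    if 0 <= index < len(titles):
        return titles[index]
    return None


entries = ["Windows 8", "EFI Shell", "Linux"]
for line in menu_lines(entries):
    print(line)  # printed as plain text, so earlier messages stay on screen
print("picked:", select_entry(entries, "b"))
```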
>> 3 << FOR LATER - 08Mar2013
3.1) I have to see what difference occurs between ms_abi being used in gcc and not used, wrt the minimal gummiboot code which I had hacked up, where I am able to switch between a running and a hung EFI app by just enabling or disabling two Print calls.
3.2) I have to see why LF PreLoader keeps giving the failed image verification error (it does continue with execution of the image after the message notification and my selecting OK, which is the only option currently in the message by the way, and which is ok for me for now). The strange thing is that the first time a new loader is tried, it gives the image verification failed message and then brings up the HashTool, but on subsequent tries the image verification failed notification comes but the HashTool doesn’t come up. So maybe it is finding that the hash is there in the database but it is not matching (which would seem very strange), or there is some other subtle bug there. I have to try with the latest source (but I am not sure how it will work wrt the secure boot chain, as I can’t have it as the default PreLoader loaded by the UEFI firmware, since I can’t get my compiled preloader.efi signed by MS now, and in turn it depends on how the currently signed PreLoader handles stuff when passing control to an EFI app that failed image validation).
3.3) Misc
3.3.1) Note from GCC - http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
ms_abi/sysv_abi
On 32-bit and 64-bit (i?86|x86_64)-*-* targets, you can use an ABI attribute to indicate which calling convention should be used for a function. The ms_abi attribute tells the compiler to use the Microsoft ABI, while the sysv_abi attribute tells the compiler to use the ABI used on GNU/Linux and other systems. The default is to use the Microsoft ABI when targeting Windows. On all other systems, the default is the x86/AMD ABI.
Note, the ms_abi attribute for Microsoft Windows 64-bit targets currently requires the -maccumulate-outgoing-args option.