Hello Marc-Andre,
thanks for getting back to me so quickly.
On Wed, Dec 16, 2020 at 07:44:40PM +0400, Marc-André Lureau wrote:
> > IIRC I needed to activate swtpm in TIS and version 1.2 mode by trial and
> > error because that was the only combo that worked for Bitlocker. The
> Afaik, TPM 2.0 + CRB should be working. Which exact version of Windows is
> it? Did you make any upgrade?
No, the only change is in the qemu version used. I tried to add --tpm2
to the swtpm call and switch from tpm-tis to tpm-crb. Both send 5.1.0
into the Bitlocker recovery screen as well.
Oh I meant when you set up the VM for the first time. You can't migrate from vTPM 1.2 to 2.0 indeed.
Even if Bitlocker were to work with those, it's likely that the TPM can't just be switched over
but would need migration to retain its state so that Bitlocker would
continue to work. Also, I would like to keep the impact on the
production VM in question to a minimum because it's quite a fickle
thing. And it *is* working with TIS and 1.2 on qemu 5.1.0 right now.
I see the problem with Windows 10 1909 (production VM) and 20H2
(reproducer VM).
> > The following exact same commands have the machine booting when using
> > qemu 5.1.0 and end up in the Bitlocker recovery screen when using 5.2.0
> > or git HEAD:
> >
> > /usr/bin/swtpm socket
> > --ctrl type=unixio,path=11-win10-bitlocker-swtpm.sock,mode=0600
> > --tpmstate dir=bf566263-35e3-4dba-af8c-8ca85dba6a85/tpm1.2,mode=0600
> >
> > qemu-system-x86_64 -machine pc-q35-5.1 -m 4096
> > -uuid bf566263-35e3-4dba-af8c-8ca85dba6a85 -no-user-config
> > -blockdev
> > '{"driver":"file","filename":"win10-bitlocker.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}'
> > -blockdev
> > '{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null}'
> > -device ide-hd,bus=ide.0,drive=libvirt-2-format,id=sata0-0-0,bootindex=1
> > -tpmdev emulator,id=tpm-tpm0,chardev=chrtpm
> > -chardev socket,id=chrtpm,path=11-win10-bitlocker-swtpm.sock
> > -device tpm-tis,tpmdev=tpm-tpm0,id=tpm0
The above command is missing a -vga qxl which seems to be required to
make Bitlocker reliably not go to recovery. It seems it sometimes
accepts the video subsystem change to std until some other change
happens. Yay.
Yes, measured boot may be done at different stages, with different depth/policies I suppose.
Here's an even shorter reproducer command that now seems
to reliably work for 5.1.0 but not for 5.2.0:
qemu-system-x86_64 -machine pc-q35-5.1 -m 4096
-uuid bf566263-35e3-4dba-af8c-8ca85dba6a85 -drive file=win10-bitlocker.qcow2
-tpmdev emulator,id=tpm-tpm0,chardev=chrtpm
-chardev socket,id=chrtpm,path=11-win10-bitlocker-swtpm.sock
-device tpm-tis,tpmdev=tpm-tpm0,id=tpm0 -vga qxl
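To make sure both versions are started with exactly the same options, the short reproducer can be wrapped in a small helper that only varies the qemu binary. This is a minimal sketch: the function name and the two install paths are made up for illustration, every option is copied from the command above, and swtpm is assumed to be running already as in the earlier invocation. It only prints the commands, it does not start anything.

```shell
#!/bin/sh
# Print the short reproducer command line for a given qemu binary.
# Only the binary path varies; all options are taken from the command above.
reproducer_cmd() {
    qemu_bin=$1
    echo "$qemu_bin -machine pc-q35-5.1 -m 4096" \
        "-uuid bf566263-35e3-4dba-af8c-8ca85dba6a85 -drive file=win10-bitlocker.qcow2" \
        "-tpmdev emulator,id=tpm-tpm0,chardev=chrtpm" \
        "-chardev socket,id=chrtpm,path=11-win10-bitlocker-swtpm.sock" \
        "-device tpm-tis,tpmdev=tpm-tpm0,id=tpm0 -vga qxl"
}

# Hypothetical install prefixes; adjust to wherever the two builds live.
for bin in /opt/qemu-5.1.0/bin/qemu-system-x86_64 \
           /opt/qemu-5.2.0/bin/qemu-system-x86_64; do
    reproducer_cmd "$bin"
done
```

Comparing the two printed lines shows that nothing but the binary differs between the runs.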
> You have made great work in reporting the issue, would you be kind enough
> to do a git bisect ? That would be of great help!
I had tried already (and just now retried) but ran into problems
getting qemu to build after the third or fourth step. I gave up on git
bisect skip after the fifth or sixth try. It seems the switch to meson
left a number of intermediate commits not building. Would that be a fair
assessment or am I doing something wrong?
That would be quite unfortunate; we put a lot of effort into avoiding intermediate breakage (from a clean tree). But it's quite possible.
cd qemu
git bisect start v5.2.0 v5.1.0
./configure --target-list=x86_64-softmmu
make # (parallel build is broken as well)
build/qemu-system-x86_64 -machine pc-q35-5.1 -m 4096 ...
git bisect bad
git submodule update
You shouldn't have to do the submodule update; it's part of the qemu build system.
git clean -id
./configure --target-list=x86_64-softmmu
make
build/qemu-system-x86_64 -machine pc-q35-5.1 -m 4096 ...
... and so on ...
Looks good to me
(I tried an explicit out-of-source build first but that broke down some
steps into the bisect as well.)
If the change is only related to the build system, git bisect skip is your friend. Please hold on :)
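The build-then-skip routine discussed above could be scripted roughly as follows. This is only a sketch under assumptions: the script name and both function names are made up, the configure and make calls are the ones from the steps above, and the Bitlocker boot check itself stays manual because it needs someone to look at the guest.

```shell
#!/bin/sh
# bisect-build.sh (hypothetical name): build the current bisect step and
# decide whether it can be tested at all.

# Map the build's exit status to the action for this bisect step.
bisect_action() {
    if [ "$1" -eq 0 ]; then
        echo "test"   # build succeeded: boot the VM, then mark good or bad
    else
        echo "skip"   # build failed: this commit cannot be judged
    fi
}

# Same build commands as in the manual steps above.
build_step() {
    ./configure --target-list=x86_64-softmmu && make
}

# Only act when called as "bisect-build.sh run" from the top of the qemu tree.
if [ "${1:-}" = "run" ]; then
    status=0
    build_step || status=$?
    if [ "$(bisect_action "$status")" = "skip" ]; then
        git bisect skip
    else
        echo "Build OK: boot the reproducer, then run 'git bisect good' or 'git bisect bad'."
    fi
fi
```

Run it as `sh bisect-build.sh run` after each bisect step. Commits that fail to build get skipped automatically, so only the buildable ones need a manual boot test.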
> (not much happened in hw/tpm tree between 5.1 and 5.2 that can easily
> explain this regression)
I have no idea how Bitlocker works but I have a feeling the TPM might
not be the (only) culprit here. The qxl vs. std hardware change
heuristic above seems to suggest that some amount of hardware change is
acceptable to Bitlocker when relying on just the TPM for boot. For more
extensive changes it seems to want additional authentication using
recovery keys entered by the user even with a working TPM before
reinstating the TPM-only boot (e.g. the uuid changing).
That's what prompted my initial question if there's a way to determine
guest-visible virtual hardware changes to see what is triggering
Bitlocker into recovery.
Could it be as easy as booting a recovery linux distro and comparing the
outputs of dmidecode or somesuch?
I am afraid it's not so easy. We should probably consider new tests to ensure versioned machine types get the same PCR measurements from BIOS/UEFI. That's what OSes usually rely on, but they may probe more hardware-related details themselves too.
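For what it's worth, the dmidecode comparison asked about above could be sketched like this. It is a rough sketch only: the function names and file names are made up, it assumes a rescue Linux guest with dmidecode and lspci installed and run as root, and as noted it may well miss whatever Bitlocker or the firmware actually measures.

```shell
#!/bin/sh
# Collect guest-visible hardware descriptions into one file.
# Run inside the rescue Linux guest, once per qemu version.
dump_guest_hw() {
    out=$1
    {
        echo "== dmidecode =="
        dmidecode
        echo "== lspci -nn =="
        lspci -nn
    } > "$out" 2>&1
}

# Show the differences between two dumps.
# Prints nothing and returns 0 if they are identical, returns 1 if they differ.
compare_dumps() {
    diff -u "$1" "$2"
}
```

Usage would be `dump_guest_hw hw-5.1.0.txt` in the guest started by qemu 5.1.0, the same with `hw-5.2.0.txt` under 5.2.0, and then `compare_dumps hw-5.1.0.txt hw-5.2.0.txt` once both files are in one place.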
For giggles I did enter the recovery key of the testing VM when
prompted. It did boot up and showed Bitlocker enabled. After rebooting
it prompted for the recovery key again. I entered it again, it booted
again and I turned off Bitlocker (decrypted the disk). After re-enabling
Bitlocker (re-encrypting the disk) and rebooting again it now does not
prompt for the recovery key again.
This certainly seems to suggest that the TPM as such is working. TPM
management in Windows 10 said as much.
As said, I'd like to avoid this with the production VM and try to figure
out what's going on here to avoid it in the future.
I think our best chance is to bisect qemu. If you can't do it, I should be able to give it a try.