
Converting Windows VM from VirtualBox to virt-manager


This is going to be my typical long, rambling post. I might have a TL;DR section way at the bottom with bullet points that eventually worked for me, but who knows. If you don’t care about deep techie stuff, definitely just skip this post (trust me…).

I recently blogged about VirtualBox and virt-manager and Clonezilla. I’ve mentioned VMs and Clonezilla in another post or two as well. In many of those, I’ve also mentioned my current 9-year-old laptop (which still runs Arch Linux like a champ). I’ve been running Linux full-time on my laptop and all of my servers (currently eight between two locations that I control and two cloud-based servers).

So, why am I writing about Windows? Because roughly a dozen times a year, I need to fire up a Windows desktop, almost exclusively to run TurboTax (and on occasion, an ancient version of QuickBooks). Those dozen or so occasions fall between February and April each year, and then Windows often remains dormant until the following February…

I switched to Linux from Windows. At the time, I was running a retail copy of Windows 7. I can’t remember if I virtualized that working version of Windows 7, or, more likely, I deactivated it, and installed Windows 7 in a VM from the ISO and used the same retail product key to activate it. In either case, I had an activated Windows 7 VM running in VirtualBox on my Linux-based laptop. I’ve had a glitch or two, here and there, over the 11 years that I’ve been running it that way, but for the most part, it has been a pleasant and stable experience.

At some point, a number of years ago, Microsoft offered a free upgrade to Windows 10 from Windows 7. There was a theoretical window of opportunity to take advantage of the upgrade (though it worked for quite a while longer for many people). I didn’t really need any feature change in Windows 10, but the upgrade would buy me a longer window of maintenance and support, so I upgraded the VM, and it went very smoothly (or so I recall).

So, for years now, I’ve been running Windows 10 in the VM. At some point, I tried to convert the VDI file (native VirtualBox file format) to qcow2 (native QEMU file format). I mentioned in a previous post that I was led to believe that QEMU would run more efficiently and that I also wanted to control my VMs remotely. I made two attempts at conversion:

  1. using qemu-img to convert directly from vdi to qcow2. I don’t think the resulting Windows 10 VM even booted (at least I remember giving up on it very quickly).
  2. using qemu-img to convert from vdi to raw, and then from raw to qcow2. This was the recommended way in a number of blogs that I read (both attempts are sketched below).
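
For the record, a hedged sketch of both attempts (the file names are illustrative, and the VirtualBox VM should be shut down before converting):

  # Attempt 1: direct conversion from VDI to qcow2
  qemu-img convert -f vdi -O qcow2 Win10.vdi Win10.qcow2

  # Attempt 2: go through a raw image first (the route recommended by the blogs I read)
  qemu-img convert -f vdi -O raw Win10.vdi Win10.raw
  qemu-img convert -f raw -O qcow2 Win10.raw Win10.qcow2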

I’m also a little fuzzy on what happened in scenario 2, but I believe that the VM booted, though it was slower or clunkier than the VirtualBox version. Since this wasn’t a necessary conversion, I was fine abandoning the effort and continuing to run the VM under VirtualBox. Well, I was fine practically, but my ego was definitely bruised. At the time, I had bigger fish to fry.

Coming back to the previous posts that I linked above, I realized that there was a third way to try and migrate from VirtualBox to QEMU. Instead of converting the disk, I could use the technique that I used for the Acer laptop that my goddaughter donated (through me).

I attached the Clonezilla ISO to the Windows 10 VirtualBox VM and booted off the ISO instead of the Windows disk. I attached a file folder on my laptop to the VM as well. Then I ran a Clonezilla disk to image backup, dumping the entire Windows 10 disk (all partitions, including boot) to the image file saved on my laptop.

I copied the img directory from my laptop to my Topton N5105-based server described in detail in this post. I created a new VM in virt-manager. I gave it a bigger disk and attached the Clonezilla ISO as the boot drive and the folder containing the img directory as another read-only disk. I booted up Clonezilla, restored from image to disk and shut down the VM. I detached the Clonezilla ISO and the shared folder and booted from my new Windows 10 VM.
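
For anyone who prefers the command line, here is a rough, hedged equivalent of what I did in the virt-manager UI, using virt-install (the name, sizes, and paths are illustrative; the folder holding the Clonezilla img directory still has to be attached to the VM separately, as described above):

  # create the target VM with a larger disk, booting the Clonezilla ISO first
  virt-install \
    --name win10 \
    --memory 8192 \
    --vcpus 2 \
    --disk size=256,format=qcow2 \
    --cdrom /isos/clonezilla-live.iso \
    --os-variant win10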

It booted up perfectly. Given the hassle I had with the conversion attempts from vdi to qcow2, this felt magical. It seemed that Clonezilla was a more reliable way to migrate a VM from one format to another.

The original vdi version was roughly a 70GB Windows disk. That was plenty (recall that I only use it for filing my taxes) and I still had 20GB free. But, since the Topton server has a 2TB NVMe drive, I created the new VM with a 256GB disk, because why not? Of course, Windows didn’t know about the extra space, since I simply restored the old disk image into the new disk.

This is (typically) a simple fix in Windows. Fire up Disk Management and extend the C partition to gobble up the unused space to the right of it. Unfortunately, that wasn’t possible for two reasons:

  1. There was 1MB of unused space (unallocated) immediately to the right of the C drive.
  2. There was an 851MB Recovery Partition to the right of that, followed by the new unallocated space.

That wasn’t going to be straightforward to deal with in Windows, especially in a running VM. That said, this is where VMs (in general) shine. You can make many changes and reboot as many times as you need, while still having your normal full work environment available to you.

I shut down Windows and attached a GParted ISO (a Linux partitioning tool). I booted off of that ISO, with the Windows 10 disk still attached to the VM. I moved the Recovery Partition to the very end of the disk, which put all of the unallocated space between the C drive and the hidden Recovery partition. Then I shut down the VM and detached the GParted ISO.

I rebooted into Windows and fired up the Disk Management tool again. This time there was plenty of unallocated space immediately adjacent to the C drive. I right-clicked on the C drive and selected extend volume and voila, the C drive was now something like 254GB. Victory.

That concludes the smooth part of my journey, and if all you wanted/needed to do was a one-to-one conversion (including even resizing the disk), you’re done, and can skip the rest of the heartache that I am about to describe…

A long-ish digression is necessary. In the previously linked post about the N5105 chip, I mentioned that the tiny RINGREAT box came with Windows 11 Pro pre-installed. I never intended to run Windows on that box, but I essentially paid for a Windows 11 Pro OEM license, whether I liked it or not. In order to preserve the possibility (and right!) to turn that box back into a legitimate Windows 11 Pro machine, I backed up the entire disk using Clonezilla, before I installed Arch Linux on it.

With my newfound success reconstituting Windows 10 via Clonezilla, I figured it couldn’t hurt to try doing the same with my backed-up copy of Windows 11 Pro. I had seen my first real Windows 11 box (not Pro) on the Acer. While I didn’t think that it came with any significant updated features, I did like the updated look somewhat (even if it was mostly eye candy). I couldn’t (and wouldn’t!) reuse that backup, since I gave the Acer to someone to use as a Windows laptop. But, this Windows 11 Pro was not going to be used by anyone but me.

I created a new VM (with Clonezilla attached as well) and restored the Windows 11 Pro disk. The restore went smoothly (of course!), but Windows wouldn’t boot. The reason was simple. The backed-up disk was set up to boot with UEFI, and I created the VM with BIOS legacy boot. Simple enough to fix. I deleted the VM and created a new one with UEFI boot. I didn’t need to do a Clonezilla restore again, just attach the disk where I had already restored Windows 11 Pro. This time Windows 11 Pro booted successfully! Needless to say, I was pretty pleased with myself.

I did all of the updates (always a pain on Windows, since they are sequential, so you never know when you will finally get the up-to-date message). I didn’t install any new software on it (nor move over any of my existing installation of TurboTax, etc.).

What I was most interested in was to see whether controlling the VM remotely, from my laptop, would be as responsive as running it on my laptop (under VirtualBox). Windows 11 Pro (and Windows 10 Pro, for that matter) has a built-in RDP server (Home versions of Windows only have the RDP client installed).

I turned on RDP and installed Remmina (a multi-protocol client) on my laptop, and connected to the remote (but still over a local LAN) Windows 11 Pro VM. The experience was awesome. Every bit as pleasant as running it on my laptop. I could easily pop into full screen mode (and get the full 1920×1080 experience!), or set the window to any size I wanted. Mission accomplished.

At this point, I assumed that after this tax season (April), I’d migrate my files over from my new Windows 10 VM to this Windows 11 Pro VM. I’d get the benefit of Windows 11, and the RDP server in Pro. Wrong. A few days later, when I fired up the Windows 11 Pro VM just to muck around, I got the dreaded Windows needs to be activated watermark. While I had a legitimate license (this wasn’t pirated, nor running on any other machine), moving it to new hardware wasn’t supported (or even allowed) by Microsoft, since this was an OEM license, not a Retail one.

It’s possible that if I had thought to deactivate the license before I wiped it off of the RINGREAT box, I would have been able to activate it under the I changed my underlying hardware exception. I’ll never know, since I didn’t deactivate first (which means I can restore it to that box in the future), and it certainly did not activate with the changed hardware exception.

End of digression

Having enjoyed the updated look of Windows 11, I made the mistake of attempting to upgrade my legitimate Windows 10 (new VM) to Windows 11 (a free upgrade). I knew I wouldn’t have Pro features available (of which only RDP made me a bit sad), but there are many alternatives to RDP. The single biggest reason to upgrade was the same reason I upgraded from Windows 7 to Windows 10 (even though I was perfectly happy with Windows 7!). Namely: Future Maintenance and Support.

Even though Windows 11 is a free upgrade from Windows 10, in order to upgrade, you need to meet a number of criteria. You can download a free app from Microsoft called PC Health Check which will analyze your system and tell you whether you are good to go or need some changes/upgrades. In my case, there were four changes necessary to make my machine (virtual in this case) eligible for the upgrade:

  1. The machine needed to be booted by UEFI, not legacy BIOS. I noted above that the Windows 11 Pro required a VM that booted in UEFI mode.
  2. The machine needed to support Secure Boot (which is why UEFI is required to begin with).
  3. The machine has to support TPM (Trusted Platform Module) version 2.0.
  4. The machine has to have 4 (or more) CPU Cores. The trick is that for non-Pro versions of Windows, they have to be in a single socket!

My VM met none of those criteria, hence the failure to upgrade. A sane person would have just quit there, and continued to use Windows 10 (which had worked fine for years) until it was no longer supported, and then made some decision as to how to proceed. I’ve never claimed to be sane, especially when it comes to technology. And, this is the Year of Tech Projects, so might as well add upgrade to Windows 11 to the list…

Welcome to my multi-day hell. There is a happy ending, but I put in at least 12 hours (possibly as much as 20), suffering one setback after another, in pursuit of this non-essential goal. To be clear, the overwhelming majority of issues were caused by me: stupidity, carelessness, a previous decision that had consequences for this upgrade, or simply not taking my time to really think through an error message or a process.

That said, while the mea culpa is real (I seriously take the blame), I had to Google dozens of phrases and error messages across dozens of articles (both Microsoft’s and independent ones) to stitch together the final working solution. No article had even a remotely comprehensive way to avoid the hell that I went through (hence my willingness to document this for myself and posterity, not that anyone will find this when they need it).

When you search for “change bios to uefi on Windows 10” the answers are all consistent. It turns out there are two basic ways to do it (for free). The Microsoft way (which is what I would have preferred to use) is a program called mbr2gpt (GPT being the GUID Partition Table way of slicing up a hard disk). The safest way to run mbr2gpt is by booting off of a Windows installation ISO and going into the Repair section rather than the Installation itself.

Fine. As with my previous tweaks, attaching a Windows 10 ISO to the VM was trivial, as was booting off of it. Running mbr2gpt was trivial as well. Getting it to do the conversion, not so much… mbr2gpt complained about the disk layout, without telling me what was wrong. I tried every suggestion I could find to manually correct the error. All failed.
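
For reference, a minimal sketch of the mbr2gpt invocations from the recovery command prompt (disk 0 is an assumption; check which disk is which with diskpart first):

  rem check whether the disk layout can be converted at all
  mbr2gpt /validate /disk:0

  rem if validation passes, do the actual conversion
  mbr2gpt /convert /disk:0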

I finally decided to try the second basic way (the non-Microsoft way). The non-Microsoft way (the Linux way) appears way easier (note the word appears), so I was a little annoyed at myself for not trying it first.

The Linux way was to boot off of a Live ISO and type a single command: gdisk /dev/sda (substitute your disk for /dev/sda if it’s something else). gdisk reads the partition and boot tables on your disk and, if it finds an MBR, converts it to GPT in memory. If you press the w key, it will save that conversion back to the real disk, completing the conversion. Simple enough!
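
A minimal sketch of that session, assuming the disk shows up as /dev/sda inside the live environment:

  # gdisk builds the GPT conversion in memory; nothing is written until you say so
  gdisk /dev/sda
  # at the gdisk prompt:
  #   p   print the converted (in-memory) partition table and sanity-check it
  #   w   write the GPT to disk and exit (this is the point of no return)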

Well, gdisk failed as well. I don’t recall if it was specific about why it failed, but I do know that searching for why gdisk fails yields the correct reason instantly, which is not the case with the failed mbr2gpt (which failed for the same reason!).

It turns out that GPT requires 16KB (yes, a very tiny amount of space) at the end of the disk to write a duplicate partition table out (I assume this is for some emergency recovery if the main GPT gets corrupted). Microsoft’s documents mention this (so kudos to them on that front), but why the heck can’t mbr2gpt tell me that this is what failed? Also, why doesn’t the first search result with the exact error message from mbr2gpt suggest that this is the failure (like it does for gdisk)?

Recall that I said earlier that I caused a number of the problems? Well, when I moved the Recovery partition in order to grow the C drive, I moved it all the way to the end of the disk, and didn’t leave even 16KB for this new backup GPT. Oops. Once I knew what the issue was, the solution was simple. I booted the VM with GParted again, and moved the Recovery partition 1MB to the left, leaving 1MB free at the end of the disk.

Then I booted again with a Live ISO and ran gdisk, successfully. I now had a GPT-based disk, instead of an MBR one. Step 1 complete (or so I thought).

Stupidly, I simply rebooted without any ISO (virtual CDs) attached, and assumed that Windows 10 would now boot with UEFI and GPT. Bzzzt.

At least this time the boot error was reasonably helpful. The VM was still set to boot via legacy BIOS, so it didn’t know how to read the GPT formatted partition.

It turns out that changing a BIOS boot to UEFI boot in virt-manager is not allowed. It can be done, but not via the UI, and virt-manager is the UI. It can be done by editing the controlling XML file via a command line program called virsh. Even then, the correct syntax can be extremely tricky and very picky as well.
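
For anyone who does want to go the virsh edit route, the relevant piece of the domain XML looks something like this. The OVMF paths are the ones a current Arch edk2 package ships and the q35 machine type is an assumption, so adjust for your distro and VM:

  <os>
    <type arch='x86_64' machine='q35'>hvm</type>
    <!-- firmware path is distro-specific; this is where Arch currently puts it -->
    <loader readonly='yes' type='pflash'>/usr/share/edk2/x64/OVMF_CODE.4m.fd</loader>
    <!-- per-VM UEFI variable store; the path is illustrative -->
    <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
  </os>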

A much simpler solution (or so it would seem) would be to create a new VM (where you can easily pick a UEFI bootloader) and attach the existing Windows 10 disk to it (the one that was successfully converted from MBR to GPT). I did that, but Windows still wouldn’t boot. This time, the error message was less useful, but not useless; it’s just that I was not thinking clearly, this many hours into the pain of pursuing a non-essential conversion…

After a bunch of Googling, with little success, and a bunch of booting off of various ISOs (to check and recheck partition tables), I had to stop and think (duh). Staring at the partition table (for the umpteenth time), I noticed (finally) that the boot partition on the newly converted disk was formatted as NTFS (native Microsoft format). From my many recent installations of Arch Linux, I knew that the UEFI boot partition had to be formatted FAT32 and be marked as an EFI BOOT partition type.

Back to booting off of the GParted ISO. I formatted that boot partition as FAT32 and marked it as EFI BOOT. This time, I realized that I couldn’t just boot off of it, since it was now empty (that’s what the format command does!). I Googled how to install (or repair) the Windows boot partition. This time the result was easy to find. I booted into the Windows 10 ISO again, back to Repair, and typed a few commands to copy over the boot files from the Windows folder on the C drive to the boot partition.
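
Roughly, the repair looks like this from the recovery command prompt (the volume number and the S: drive letter are illustrative; pick out the small FAT32 EFI volume from diskpart’s listing):

  rem from the Windows 10 ISO: Repair your computer -> Troubleshoot -> Command Prompt
  rem first, give the EFI partition a drive letter inside diskpart
  diskpart
    list volume
    select volume 2
    assign letter=S
    exit

  rem copy the UEFI boot files from C:\Windows onto the EFI partition
  bcdboot C:\Windows /s S: /f ALL

  rem rebuild the boot configuration data
  bootrec /rebuildbcd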

I rebooted into Windows 10, and voila, I had a working UEFI Windows 10 instance. I can now check off #1 from the list above of things that failed to pass the upgrade test.

I’m going to skip #2 for a minute and jump to #3, TPM 2.0. That was trivial. virt-manager can emulate a TPM (both versions), but the Topton is a brand new machine with a hardware TPM 2.0 module in it, and virt-manager can pass that through to a single VM at a time. I only need it on this VM, so I passed it through, and #3 was checked off the list.
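
In domain-XML terms, the passthrough is just a few lines inside the devices section. The model and the /dev/tpm0 path are assumptions (they are the common defaults), and the emulated alternative is shown commented out:

  <devices>
    <!-- pass the host's hardware TPM 2.0 through to this one VM -->
    <tpm model='tpm-crb'>
      <backend type='passthrough'>
        <device path='/dev/tpm0'/>
      </backend>
    </tpm>
    <!-- or, emulated (requires swtpm on the host):
    <tpm model='tpm-crb'>
      <backend type='emulator' version='2.0'/>
    </tpm>
    -->
  </devices>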

Back to #2. Turning on secure boot turned out to be simple (in the end!), but it was a long and winding road (entirely of my making). This is another spot where Googling yielded so many different answers, most of which seemed to be reasonable. Most of those turned out either not to work at all (because they might have been outdated and referred to old KVM/QEMU syntax), or they (perhaps) applied to different configurations than mine.

That’s not what really hung me up though. My own first guess appeared to work. When you select UEFI boot in virt-manager, you are given the choice of a variety of different OVMF profiles to use. In my Arch installs, I used the one that was suggested, and it worked. However, when selecting from the list, I noticed that there were also secure versions of a number of the profiles available. My first stab was to edit the XML file via virsh, and add the word secure in the appropriate file path.

Like I said above, it appeared to work. In fact, Windows 10 booted, and running the PC Health Check app did accept that we successfully booted in secure mode. Huzzah! That only left the final step, needing 4 cores for the Windows 11 upgrade (Windows 10 was happy with the 2 cores that I gave it, virtually).

The Topton machine that I was running this on has 4 real cores (and no hyperthreading). It’s perfectly OK to give all 4 cores to a single VM, even when giving cores to other VMs, and knowing that the underlying OS will be using cores for itself! So, I upped the core count from 2 to 4. Running PC Health Check still showed that I failed the CPU Core test.

Googling (once again), I found a short YouTube video that explained the problem and the solution. I mentioned above that Home versions of Windows require all of the cores to be in a single socket. This video showed me how to manually override the socket settings and present them as 4 cores in a single socket. Excellent.
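
The override boils down to pinning the CPU topology in the domain XML, so libvirt presents one socket with four cores instead of four single-core sockets. A hedged sketch (the cpu mode is just what I would reach for; the topology line is the part that matters):

  <vcpu placement='static'>4</vcpu>
  <cpu mode='host-passthrough'>
    <!-- present the 4 vCPUs as 1 socket x 4 cores x 1 thread -->
    <topology sockets='1' cores='4' threads='1'/>
  </cpu>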

Would Windows 10 upgrade to Windows 11 now? Yes! Would that be the end of my hell? No, but we’re getting very close…

So, I booted back into Windows 10 and ran the PC Health Check and it passed all of the tests. I then tried to force a Windows 11 update, and it told me that it would eventually offer me an update (in other words, I couldn’t force it). The next morning, when I booted back into the VM and did a Check for Updates, it did indeed offer me the Windows 11 upgrade!

The upgrade succeeded and I was able to apply a number of updates to that as well. All was perfect (or so it seemed), until I hit update KB5012170 (a Windows security update that installs a revocation list into the secure bootloader).

It failed every time (thankfully, it only takes 10 seconds to try and fail). Tons of posts on the net report the same issue. This took me a few hours of failed attempts at following every potential solution I could find. It turns out that the issue is that older OVMF firmware images only have 2MB of flash space in them. I was using one of those (from my default Arch installs). This Microsoft update requires the newer 4MB version.

Once I realized that, it should have been a simple change. It wasn’t… I mentioned above that I got secure boot to work (and pass Microsoft’s test), but it was the wrong-sized secure boot. Changing it to the correct size was a very long series of trial-and-error edits in virsh. I finally found the right combination (which looked nothing like what I originally had working).
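
For posterity, a sketch of the kind of configuration that finally satisfies both secure boot and KB5012170. The firmware paths are from a current Arch edk2 package and are assumptions; the essential ingredients are the 4MB secboot loader, secure='yes', and SMM turned on:

  <os>
    <type arch='x86_64' machine='q35'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/x64/OVMF_CODE.secboot.4m.fd</loader>
    <nvram template='/usr/share/edk2/x64/OVMF_VARS.4m.fd'>/var/lib/libvirt/qemu/nvram/win11_VARS.fd</nvram>
  </os>
  <features>
    <!-- secure boot on q35 requires SMM -->
    <smm state='on'/>
  </features>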

I booted back into Windows and successfully updated KB5012170. Now I’m really done, right? Wrong.

UI performance was a little slow, and I couldn’t copy/paste from my laptop to the VM and vice versa. I installed the Spice Tools into the Windows VM. That solved both problems. Then I found a post that described how to share files with the host via Spice as well. I got that to work too. So, now I’m done, right? Wrong.

The next morning, I boot again to check things out. I get the dreaded Windows needs to be Activated warning. What? This is a retail license, that was upgraded all the way from Windows 7. I try the my hardware changed option (and it certainly did change!). It fails. But the error says something to the effect that Windows couldn’t reach the server and I should try again later…

I tried the next morning, and Windows activated successfully. Whew. Finally!

Now for a surprise bonus (for me, and potentially any other Linux users out there). Following the post that I linked to immediately above about sharing files, it turns out that the virt-viewer command works remotely (OK, so that part wasn’t a surprise). But following the same instructions given in that post also lets you share folders remotely, seamlessly, and that surprised the heck out of me.

If you recall, I always ran the Windows 10 VM on my laptop. So, sharing files between them was trivial. When I run the TurboTax software in the VM, I actually save the resulting tax files on my laptop (directly from the VM, transparently). I thought that with my new remote VM, I’d have to save the files on that host, and copy them over to my laptop. With virt-viewer, I was able to directly share the folder from my laptop to the remote VM, and the files get saved to the same spot on my laptop as they did when I ran the VM locally. Magic.
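
For reference, connecting from the laptop is a one-liner (the host and VM names here are illustrative); the folder sharing itself is set up in the viewer’s preferences, as described in the post linked above:

  # open a Spice console to the remote VM, tunneled over SSH
  virt-viewer --connect qemu+ssh://user@topton/system win11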

Also, using the Spice protocol, the UI of the remote VM is extremely smooth in virt-viewer as well. Remmina does support Spice (as one of its protocols), but virt-viewer connects and shares the files more naturally.

I mentioned the possibility of a TL;DR. Here it is.

  1. Make sure you have at least 16KB of unallocated space at the end of your Windows disk (I left 1MB for extra safety).
  2. Run gdisk to convert the MBR to GPT (it’s more reliable than mbr2gpt, but you can try that first and hope for the best).
  3. Format the boot partition as FAT32 and mark it as an EFI BOOT partition.
  4. Boot into the Windows ISO and select Repair. Then populate the boot partition with the correct boot files using the commands in the next two steps.
  5. You need to follow the setup in the previous step/article, but the main command is: bcdboot c:\Windows /s <EFI drive letter>: /f ALL
  6. Then one more command: bootrec /rebuildbcd
  7. Follow the video showing you how to change the number of cores in a single socket (4 minimum).
  8. Turn on TPM 2.0 either emulated, or if you have the right server, passthrough.
  9. Make sure you have a 4MB version of the OVMF secure boot firmware selected.
  10. Profit

What an odyssey. At least it’s over and successful…

