Simplify instructions, no need to mess around with any uuids anymore

Fix typo
Add better examples for VRAM sizes
5 changed files with 179 additions and 102 deletions
--- a/510.85.03.patch
+++ b/510.85.03.patch
--- a/525.60.12.patch
+++ b/525.60.12.patch
--- a/README.md
+++ b/README.md
@@ -1,14 +1,27 @@
-# NVIDIA vGPU with the GRID 14.2 driver
+# NVIDIA vGPU with the GRID 15.0 driver on Proxmox

-A few days ago, NVIDIA released their latest enterprise GRID driver. I created a patch that allows the use of most consumer GPUs for vGPU. One notable exception from that list is every officially unsupported Ampere GPU.
+In December 2022, NVIDIA released their latest enterprise GRID driver. I created a patch that allows the use of most consumer GPUs for vGPU. Notable exceptions are every officially unsupported Ampere GPU and all GPUs from the Ada Lovelace generation.
+
+> ## !!! YOUR RTX 30XX OR 40XX WILL NOT WORK AT THIS POINT IN TIME !!!

 This guide and all my tests were done on an RTX 2080 Ti, which is based on the Turing architecture.

-### This tutorial assumes you are using a clean install of Proxmox 7.2, or ymmv when using an existing installation. Make sure to always have backups!
+### This tutorial assumes you are using a clean install of Proxmox 7.3; your mileage may vary when using an existing installation. Make sure to always have backups!

-The patch included in this repository should work on other linux systems with kernel versions 5.13 to 5.16 but I have only tested it on the current proxmox version.
+This guide should work for other Linux systems with a recent kernel (5.15 to 5.19), but I have only tested it on the current Proxmox version.
 If you are not using Proxmox, you have to adapt some parts of this tutorial to work for your distribution.

+> # Are you upgrading from a previous version of this guide?
+>
+> If you are upgrading from a previous version of this guide, you should uninstall the old driver first:
+> ```
+> nvidia-uninstall
+> ```
+>
+> Then you also have to make sure that you are using the latest version of `vgpu_unlock-rs`, otherwise it won't work with the latest driver.
+>
+> Either delete the folder `/opt/vgpu_unlock-rs` and set it up again as described below, or enter the folder, run `git pull`, and then recompile the library using `cargo build --release`.
+
 ## Packages

 Make sure to add the community pve repo and get rid of the enterprise repo (you can skip this step if you have a valid enterprise subscription)
@@ -44,7 +57,7 @@ git clone https://github.com/mbilker/vgpu_unlock-rs.git

 After that, install the rust compiler
 ```bash
-curl https://sh.rustup.rs -sSf | sh -s -- -y
+curl https://sh.rustup.rs -sSf | sh -s -- -y --profile minimal
 ```

 Now make the rust binaries available in your $PATH (you only have to do it the first time after installing rust)
@@ -228,7 +241,7 @@ Depending on your mainboard and cpu, the output will be different, in my output

 ## NVIDIA Driver

-As of the time of this writing (August 2022), the latest available GRID driver is 14.2 with vGPU driver version 510.85.03. You can check for the latest version [here](https://docs.nvidia.com/grid/). I cannot guarantee that newer versions would work without additional patches, this tutorial only covers 14.2 (510.85.03).
+As of the time of this writing (December 2022), the latest available GRID driver is 15.0 with vGPU driver version 525.60.12. You can check for the latest version [here](https://docs.nvidia.com/grid/). I cannot guarantee that newer versions will work without additional patches; the patch in this guide works **ONLY** on 15.0 (525.60.12).

 ### Obtaining the driver

@@ -236,17 +249,19 @@ NVIDIA doesn't let you freely download vGPU drivers like they do with GeForce or

 NB: When applying for an eval license, do NOT use your personal email or other email at a free email provider like gmail.com. You will probably have to go through manual review if you use such emails. I have very good experience using a custom domain for my email address, that way the automatic verification usually lets me in after about five minutes.

-The file you are looking for is called `NVIDIA-GRID-Linux-KVM-510.85.03-510.85.02-513.46.zip`, you can get it from the download portal by downloading version 14.2 for `Linux KVM`.
+The file you are looking for is called `NVIDIA-GRID-Linux-KVM-525.60.12-525.60.13-527.41.zip`, you can get it from the download portal by downloading version 15.0 for `Linux KVM`.
+
+![Video Tutorial to find the right driver](downloading_driver.mp4)

 For those who want to find the file somewhere else, here are some checksums :)
 ```
-sha1: 468912059ca86aaa737588c9b92a1f8bfaa071bd
-md5: bb330fa7f26e11bebeadefdee9c71e84
+sha1: e4147e1dcebfc5459759ea013b56bca1d30f3578
+md5: 0e2be7de643b99a62a1cca6ca37fd1ee
 ```

-After downloading, extract that and copy the file `NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run` to your Proxmox host into the `/root/` folder
+After downloading, extract the archive and copy the file `NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm.run` into the `/root/` folder on your Proxmox host
 ```bash
-scp NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run root@pve:/root/
+scp NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm.run root@pve:/root/
 ```

 > ### Have a vgpu supported card? Read here!
@@ -255,8 +270,8 @@ scp NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run root@pve:/root/
 >
 > With a supported gpu, patching the driver is not needed, so you should skip the next section. You can simply install the driver package like this:
 > ```bash
-> chmod +x NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run
-> ./NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run --dkms
+> chmod +x NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm.run
+> ./NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm.run --dkms
 > ```
 >
 > To finish the installation, reboot the system
@@ -270,25 +285,25 @@ scp NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run root@pve:/root/

 Now, on the proxmox host, make the driver executable
 ```bash
-chmod +x NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run
+chmod +x NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm.run
 ```

 And then patch it
 ```bash
-./NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run --apply-patch ~/vgpu-proxmox/510.85.03.patch
+./NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm.run --apply-patch ~/vgpu-proxmox/525.60.12.patch
 ```
 That should output a lot of lines ending with
 ```
-Self-extractible archive "NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm-custom.run" successfully created.
+Self-extractible archive "NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm-custom.run" successfully created.
 ```

-You should now have a file called `NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm-custom.run`, that is your patched driver.
+You should now have a file called `NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm-custom.run`; that is your patched driver.

 ### Installing the driver

 Now that the required patch is applied, you can install the driver
 ```bash
-./NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm-custom.run --dkms
+./NVIDIA-Linux-x86_64-525.60.12-vgpu-kvm-custom.run --dkms
 ```

 The installer will ask you `Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later.`, answer with `Yes`.
@@ -297,7 +312,7 @@ Depending on your hardware, the installation could take a minute or two.

 If everything went right, you will be presented with this message.
 ```
-Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 510.85.03) is now complete.
+Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 525.60.12) is now complete.
 ```

 Click `Ok` to exit the installer.
@@ -316,9 +331,9 @@ nvidia-smi

 You should get an output similar to this one
 ```
-Sun Aug  7 21:26:58 2022
+Sun Dec  4 12:54:59 2022
 +-----------------------------------------------------------------------------+
-| NVIDIA-SMI 510.85.03    Driver Version: 510.85.03    CUDA Version: N/A      |
+| NVIDIA-SMI 525.60.12    Driver Version: 525.60.12    CUDA Version: N/A      |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
@@ -366,26 +381,16 @@ The output will be similar to this

 If this command doesn't return any output, vGPU unlock isn't working.

-### Bonus: working `nvidia-smi vgpu` command
-
-> ### Have a vgpu supported card? Read here!
->
-> If you have a card like the Tesla P4, or any other gpu from [this list](https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html), you should skip this section, as `nvidia-smi vgpu` is already working
->
-> You should continue reading at [vGPU overrides](#vgpu-overrides)
-
-I've included an adapted version of the `nvidia-smi` [wrapper script](https://github.com/erin-allison/nvidia-merged-arch/blob/d2ce752cd38461b53b7e017612410a3348aa86e5/nvidia-smi) to get useful output from `nvidia-smi vgpu`.
-
-Without that wrapper script, running `nvidia-smi vgpu` in your shell results in this output
-```
-No supported devices in vGPU mode
+Another command you can try to see if your card is recognized as being vgpu enabled is this one:
+```bash
+nvidia-smi vgpu
 ```

-With the wrapper script, the output looks similar to this
+If everything worked right with the unlock, the output should be similar to this:
 ```
-Sun Aug  7 21:27:04 2022
+Sun Dec  4 12:55:09 2022
 +-----------------------------------------------------------------------------+
-| NVIDIA-SMI 510.85.03              Driver Version: 510.85.03                 |
+| NVIDIA-SMI 525.60.12              Driver Version: 525.60.12                 |
 |---------------------------------+------------------------------+------------+
 | GPU  Name                       | Bus-Id                       | GPU-Util   |
 |      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
@@ -394,15 +399,9 @@ Sun Aug  7 21:27:04 2022
 +---------------------------------+------------------------------+------------+
 ```

-To install this script, copy the `nvidia-smi` file from this repo to `/usr/local/bin` and make it executable
-```bash
-cp ~/vgpu-proxmox/nvidia-smi /usr/local/bin/
-chmod +x /usr/local/bin/nvidia-smi
+However, if you get this output, then something went wrong:
 ```
-
-Run this in your shell (you might have to logout and back in first) to see if it worked
-```bash
-nvidia-smi vgpu
+No supported devices in vGPU mode
 ```

 ## vGPU overrides
@@ -411,6 +410,14 @@ Further up we have created the file `/etc/vgpu_unlock/profile_override.toml` and

 If we take a look at the output of `mdevctl types`, we see lots of different types that we can choose from. However, if we for example choose `GRID RTX6000-4Q`, which gives us 4GB of VRAM in a VM, we are locked to that type for all of our VMs. That means we can only have 4GB VMs; it's not possible to mix different types to have one 4GB VM and two 2GB VMs.

+> ### Important notes
+>
+> Q profiles *can* give you horrible performance in OpenGL applications/games. To fix that, switch to an equivalent A or B profile (for example `GRID RTX6000-4B`)
+>
+> C profiles (for example `GRID RTX6000-4C`) only work on Linux, don't try using those on Windows, it will not work - at all.
+>
+> A profiles (for example `GRID RTX6000-4A`) will NOT work on Linux, they only work on Windows.
+
 All of that changes with the override config file. Technically we are still locked to only using one profile, but now it's possible to change the VRAM of the profile on a per-VM basis, so even though we have three `GRID RTX6000-4Q` instances, one VM can have 4GB of VRAM while we override the VRAM size for the other two VMs to only 2GB.

 Let's take a look at this example config override file (it's in TOML format)
@@ -422,24 +429,21 @@ display_height = 1080     # Maximum display height in the VM
 max_pixels = 2073600      # This is the product of display_width and display_height so 1920 * 1080 = 2073600
 cuda_enabled = 1          # Enables CUDA support. Either 1 or 0 for enabled/disabled
 frl_enabled = 1           # This controls the frame rate limiter, if you enable it your fps in the VM get locked to 60fps. Either 1 or 0 for enabled/disabled
-framebuffer = 0x76000000  # VRAM size for the VM. In this case its 2GB
-                          # Other options:
-                          # 1GB: 0x3B000000
-                          # 2GB: 0x76000000
-                          # 3GB: 0xB1000000
-                          # 4GB: 0xEC000000
-                          # 8GB: 0x1D8000000
-                          # 16GB: 0x3B0000000
-                          # These numbers may not be accurate for you, but you can always calculate the right number like this:
-                          # The amount of VRAM in your VM = `framebuffer` + `framebuffer_reservation`
+framebuffer = 0x74000000
+framebuffer_reservation = 0xC000000   # In combination with the framebuffer size
+                                      # above, these two lines will give you a VM
+                                      # with 2GB of VRAM (framebuffer + framebuffer_reservation = VRAM size in bytes).
+                                      # See below for some other sizes

-[mdev.00000000-0000-0000-0000-000000000100]
+[vm.100]
 frl_enabled = 0
-# You can override all the options from above here too. If you want to add more overrides for a new VM, just copy this block and change the UUID
+# You can override all the options from above here too. If you want to add more overrides for a new VM, just copy this block and change the VM ID
 ```

-There are two blocks here, the first being `[profile.nvidia-259]` and the second `[mdev.00000000-0000-0000-0000-000000000100]`.
-The first one applies the overrides to all VM instances of the `nvidia-259` type (thats `GRID RTX6000-4Q`) and the second one applies its overrides only to one specific VM, that one with the uuid `00000000-0000-0000-0000-000000000100`.
+There are two blocks here, the first being `[profile.nvidia-259]` and the second `[vm.100]`.
+The first one applies the overrides to all VM instances of the `nvidia-259` type (that's `GRID RTX6000-4Q`) and the second one applies its overrides only to one specific VM, the one with the Proxmox VM ID `100`.
+
+The proxmox VM ID is the same number that you see in the proxmox webinterface, next to the VM name.

 You don't have to specify all parameters, only the ones you need/want. There are some more that I didn't mention here, you can find them by going through the source code of the `vgpu_unlock-rs` repo.

@@ -452,6 +456,88 @@ display_height = 1080
 max_pixels = 2073600
 ```

+### Common VRAM sizes
+
+Here are some common framebuffer sizes that you might want to use:
+
+- 512MB:
+  ```toml
+  framebuffer = 0x1A000000
+  framebuffer_reservation = 0x6000000
+  ```
+- 1GB:
+  ```toml
+  framebuffer = 0x38000000
+  framebuffer_reservation = 0x8000000
+  ```
+- 2GB:
+  ```toml
+  framebuffer = 0x74000000
+  framebuffer_reservation = 0xC000000
+  ```
+- 3GB:
+  ```toml
+  framebuffer = 0xB0000000
+  framebuffer_reservation = 0x10000000
+  ```
+- 4GB:
+  ```toml
+  framebuffer = 0xEC000000
+  framebuffer_reservation = 0x14000000
+  ```
+- 5GB:
+  ```toml
+  framebuffer = 0x128000000
+  framebuffer_reservation = 0x18000000
+  ```
+- 6GB:
+  ```toml
+  framebuffer = 0x164000000
+  framebuffer_reservation = 0x1C000000
+  ```
+- 8GB:
+  ```toml
+  framebuffer = 0x1DC000000
+  framebuffer_reservation = 0x24000000
+  ```
+- 10GB:
+  ```toml
+  framebuffer = 0x254000000
+  framebuffer_reservation = 0x2C000000
+  ```
+- 12GB:
+  ```toml
+  framebuffer = 0x2CC000000
+  framebuffer_reservation = 0x34000000
+  ```
+- 16GB:
+  ```toml
+  framebuffer = 0x3BC000000
+  framebuffer_reservation = 0x44000000
+  ```
+- 20GB:
+  ```toml
+  framebuffer = 0x4AC000000
+  framebuffer_reservation = 0x54000000
+  ```
+- 24GB:
+  ```toml
+  framebuffer = 0x59C000000
+  framebuffer_reservation = 0x64000000
+  ```
+- 32GB:
+  ```toml
+  framebuffer = 0x77C000000
+  framebuffer_reservation = 0x84000000
+  ```
+- 48GB:
+  ```toml
+  framebuffer = 0xB2D200000
+  framebuffer_reservation = 0xD2E00000
+  ```
+
+The sum of `framebuffer` and `framebuffer_reservation` always equals the VRAM size in bytes.
+
 ### Spoofing your vGPU instance

 #### Note: This only works on Windows guests, don't bother trying on Linux.
@@ -476,41 +562,21 @@ After doing that, click the same id, it should open a new page where it lists th

 ## Important note when spoofing

-When I originally wrote this guide, the latest quadro drivers were from the R510 branch, but nvidia has since released multiple drivers in the R515 and R520 branch, those will **NOT WORK** and maybe even make your VM crash.
+You have to pick a Quadro Driver from the same driver branch, so in this case R525. Using newer drivers will **NOT WORK** and maybe even make your VM crash.

 If you accidentally installed such a driver, it's best to either remove the driver completely using DDU or just install a fresh Windows VM.

-The quadro driver for R510 branch can be found [here (for 512.78)](https://www.nvidia.com/Download/driverResults.aspx/189361/en-us/) or [here (for 513.46)](https://www.nvidia.com/download/driverResults.aspx/191342/en-us/). I've had the best results with 512.78 but the other could work too. But anything newer than that, will **NOT WORK**.
+The Quadro driver for the R525 branch can be found [here (for 527.27)](https://www.nvidia.com/Download/driverResults.aspx/196728/en-us/).
+
+## Drawbacks to spoofing
+
+- You do not have **ANY** CUDA support
+- It only works for Windows VMs
+- FRL (Framerate limiter) does not work, so no matter what settings you use for `frl_config`, it doesn't apply

 ## Adding a vGPU to a Proxmox VM

-There is only one thing you have to do from the commandline: Open the VM config file and give the VM a uuid.
-
-For that you need your VM ID, in this example I'm using `1000`.
-
-```bash
-nano /etc/pve/qemu-server/<VM-ID>.conf
-```
-
-So with the VM ID 1000, I have to do this:
-
-```bash
-nano /etc/pve/qemu-server/1000.conf
-```
-
-In that file, you have to add a new line at the end:
-```
-args: -uuid 00000000-0000-0000-0000-00000000XXXX
-```
-
-You have to replace `XXXX` with your VM ID. With my 1000 ID I have to use this line:
-```
-args: -uuid 00000000-0000-0000-0000-000000001000
-```
-
-Save and exit from the editor. Thats all you have to do from the terminal.
-
-Now go to the proxmox webinterface, go to your VM, then to `Hardware`, then to `Add` and select `PCI Device`.
+Go to the proxmox webinterface, go to your VM, then to `Hardware`, then to `Add` and select `PCI Device`.
 You should be able to choose from a list of pci devices. Choose your GPU there, its entry should say `Yes` in the `Mediated Devices` column.

 Now you should be able to also select the `MDev Type`. Choose whatever profile you want, if you don't remember which one you want, you can see the list of all available types with `mdevctl types`.
@@ -519,7 +585,30 @@ Finish by clicking `Add`, start the VM and install the required drivers. After i

 Enjoy your new vGPU VM :)

-## Credits
+## Common problems
+
+Most problems can be solved by reading the instructions very carefully. For some very common problems, read here:
+
+- The nvidia driver won't install/load
+  - If you were using gpu passthrough before, revert **ALL** of the steps you did or start with a fresh proxmox installation. If you run `lspci -knnd 10de:` and see `vfio-pci` under `Kernel driver in use:` then you have to fix that
+  - Make sure that you are using a supported kernel version (check `uname -a`)
+- My OpenGL performance is absolute garbage, what can I do?
+  - Read [here](#important-notes)
+- `mdevctl types` doesn't output anything, how to fix it?
+  - Make sure that you don't have unlock disabled if you have a consumer gpu ([more information](#have-a-vgpu-supported-card-read-here))
+- vGPU doesn't work on my RTX 3080! What to do?
+  - [Learn to read](#your-rtx-30xx-or-40xx-will-not-work-at-this-point-in-time)
+
+## Support
+
+If something isn't working, please create an issue or join the [Discord server](https://discord.gg/5rQsSV3Byq) and ask for help in the `#proxmox-support` channel so that the community can help you.
+
+> ### DO NOT SEND ME A DM, I'M NOT YOUR PERSONAL SUPPORT
+
+When asking for help, please describe your problem in detail instead of just saying "vgpu doesn't work". Usually a rough overview of your system (gpu, mainboard, proxmox version, kernel version, ...) and the full output of `dmesg` and/or `journalctl --no-pager -b 0 -u nvidia-vgpu-mgr.service` (run this only after starting the VM that causes trouble) is helpful.
+Please also provide the output of `uname -a` and `cat /proc/cmdline`.
+
+## Further reading

 Thanks to all these people (in no particular order) for making this project possible
 - [DualCoder](https://github.com/DualCoder) for his original [vgpu_unlock](https://github.com/DualCoder/vgpu_unlock) repo with the kernel hooks
@@ -529,7 +618,7 @@ Thanks to all these people (in no particular order) for making this project poss
 - [rupansh](https://github.com/rupansh) for the original [twelve.patch](https://github.com/rupansh/vgpu_unlock_5.12/blob/master/twelve.patch) to patch the driver on kernels >= 5.12
 - mbuchel#1878 on the [GPU Unlocking discord](https://discord.gg/5rQsSV3Byq) for [fourteen.patch](https://gist.github.com/erin-allison/5f8acc33fa1ac2e4c0f77fdc5d0a3ed1) to patch the driver on kernels >= 5.14
 - [erin-allison](https://github.com/erin-allison) for the [nvidia-smi wrapper script](https://github.com/erin-allison/nvidia-merged-arch/blob/d2ce752cd38461b53b7e017612410a3348aa86e5/nvidia-smi)
-- LIL'pingu#9069 on the [GPU Unlocking discord](https://discord.gg/5rQsSV3Byq) for his patch to nop out code that NVIDIA added to prevent usage of drivers with a version >= 460 with consumer cards
+- LIL'pingu#9069 on the [GPU Unlocking discord](https://discord.gg/5rQsSV3Byq) for his patch to nop out code that NVIDIA added to prevent usage of drivers with a version 460 - 470 with consumer cards

 If I forgot to mention someone, please create an issue or let me know otherwise.

--- a/downloading_driver.mp4
+++ b/downloading_driver.mp4
--- a/12
+++ b/12
@@ -1,12 +0,0 @@
-#!/usr/bin/bash
-
-for a in $*
-do
-  case $a in
-    vgpu)
-      export LD_PRELOAD="/opt/vgpu_unlock-rs/target/release/libvgpu_unlock_rs.so"
-      ;;
-  esac
-done
-
-exec /usr/bin/nvidia-smi $@
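
A note on the VRAM table added to README.md in this change: the README states that `framebuffer` plus `framebuffer_reservation` equals the VRAM size in bytes. The Python sketch below works through that arithmetic. It is an illustration only: the helper name `framebuffer_for` is hypothetical and not part of `vgpu_unlock-rs`, and it assumes the README's "GB" means binary gigabytes (1024^3 bytes), which is what the listed values add up to.

```python
GIB = 1024 ** 3  # assumption: the README's "GB" are binary gigabytes


def framebuffer_for(total_bytes: int, reservation: int) -> int:
    """Return the framebuffer value so that framebuffer + reservation == total."""
    if not 0 <= reservation < total_bytes:
        raise ValueError("reservation must be smaller than the total VRAM size")
    return total_bytes - reservation


# (total size, framebuffer, framebuffer_reservation) copied from the README table
README_PAIRS = [
    (GIB // 2, 0x1A000000, 0x6000000),
    (2 * GIB, 0x74000000, 0xC000000),
    (4 * GIB, 0xEC000000, 0x14000000),
    (48 * GIB, 0xB2D200000, 0xD2E00000),
]

for total, framebuffer, reservation in README_PAIRS:
    # Every listed pair should add up to the advertised size.
    assert framebuffer + reservation == total
    assert framebuffer_for(total, reservation) == framebuffer

print(hex(framebuffer_for(2 * GIB, 0xC000000)))  # 0x74000000
```

For a size that is not in the table, the reservation still has to be chosen by hand; the sketch only derives `framebuffer` once a reservation is given.
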
Author	SHA1	Message	Date
PolloLoco	0e51ef508e	Simplify instructions, no need to mess around with any uuids anymore	2023-01-15 13:34:59 +01:00
PolloLoco	dcf58742b8	Fix typo	2023-01-04 22:52:39 +01:00
PolloLoco	5ec737e1a3	Add better examples for VRAM sizes	2023-01-04 21:57:20 +01:00
PolloLoco	d1009fd47a	Add small video tutorial on how to obtain the right driver from the nvidia portal	2022-12-28 12:17:40 +01:00
PolloLoco	8cef2c6082	Fix typo (wrong version number)	2022-12-28 12:06:13 +01:00
PolloLoco	ea99035a5b	Fix 'malformed patch' error when applying the patch	2022-12-04 15:38:44 +01:00
PolloLoco	ba4b4b4787	Update guide to 15.0 - 525.60.12	2022-12-04 13:09:21 +01:00
PolloLoco	22bd687e6d	Update guide to 14.3 - 510.108.03	2022-11-24 22:15:30 +01:00
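
The per-VM override introduced in the latest commit can be pictured as layering: the `[vm.100]` block is applied on top of `[profile.nvidia-259]`. The sketch below models this with plain Python dictionaries. `effective_settings` is a hypothetical helper, and the precedence (the VM block wins over the profile block) is an assumption based on the README text, not a description of how `vgpu_unlock-rs` is implemented.

```python
def effective_settings(config: dict, profile: str, vm_id: int) -> dict:
    """Overlay the [vm.<id>] block on top of the [profile.<name>] block."""
    merged = dict(config.get("profile", {}).get(profile, {}))
    merged.update(config.get("vm", {}).get(str(vm_id), {}))
    return merged


# Values follow the README example; the dict layout mirrors the TOML tables.
config = {
    "profile": {
        "nvidia-259": {
            "display_width": 1920,
            "display_height": 1080,
            "max_pixels": 1920 * 1080,
            "frl_enabled": 1,
            "framebuffer": 0x74000000,
            "framebuffer_reservation": 0xC000000,
        }
    },
    "vm": {"100": {"frl_enabled": 0}},
}

vm100 = effective_settings(config, "nvidia-259", 100)
vm101 = effective_settings(config, "nvidia-259", 101)
print(vm100["frl_enabled"], vm101["frl_enabled"])  # 0 1
```

VM 100 gets the frame rate limiter disabled, while any other VM on the same profile (here the made-up ID 101) keeps the profile-wide value.
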