cuda.core: consolidate driver version query functions#2040
cuda.core: consolidate driver version query functions#2040leofang merged 8 commits intoNVIDIA:mainfrom
Conversation
Merge the two separate driver version query functions into a single get_driver_version() that returns (umd_version, kmd_version) — a pair of version tuples. UMD is a 2-tuple (major, minor) and KMD is a 3-tuple (major, minor, patch). The function now requires NVML. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ok to test bfd9ab9 |
|
Windows driver version strings only have two components, so nvmlSystemGetDriverVersion returns X.Y on WSL instead of X.Y.Z. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ok to test d2d387e |
|
Nit: the file-level comment at |
rparolin
left a comment
There was a problem hiding this comment.
The changes are correct but some added context would make it obvious to reader why its correct. I've labeled them as nit to not block this PR.
|
Let me wait for @mdboom blessing before merging |
There was a problem hiding this comment.
I don't think we can make NVML a hard requirement, because it means we now require cuda_bindings >=12.9.6 or cuda_bindings >= 13.2.0. Don't we make broader guarantees in general for that?
To resolve that, maybe we just have two separate functions for kernel and cuda version, but remove the "_full" distinction (which was just there for backward compat).
…rsion Per review feedback, split the consolidated get_driver_version() into two separate functions so each returns a simple tuple[int, ...]: - get_user_mode_driver_version(): works with or without NVML (falls back to cuDriverGetVersion), returns (major, minor) - get_kernel_mode_driver_version(): requires NVML, returns (major, minor, patch) or (major, minor) on WSL Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ok to test f57003a |
rwgk
left a comment
There was a problem hiding this comment.
Ah, looks like I was too slow.
| "get_kernel_mode_driver_version", | ||
| "get_num_devices", | ||
| "get_process_name", | ||
| "get_user_mode_driver_version", |
There was a problem hiding this comment.
For the cuda-pathfinder guard rails that I'm adding in pending PR #1977, I need both functions, too:
Currently I have different naming. It might be good to keep the names aligned.
I was going with what nvidia-smi is showing, but that's changing. In the future it'll look like this:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI NNN.NN KMD Version: NNN.NN CUDA UMD Version: NN.N |
+-----------------------------------------+------------------------+----------------------+
So
get_driver_kmd_version
get_driver_cuda_umd_version
would be most aligned.
Summary
Consolidate
get_driver_version()andget_driver_version_full()into a singleget_driver_version()that returns(umd_version, kmd_version)— a pair of version tuples (UMD is a 2-tuple(MAJOR, MINOR), KMD is a 3-tuple(MAJOR, MINOR, PATCH)).The function now requires NVML and raises
RuntimeErrorif it is not available.kernel_modeparameter and the separateget_driver_version_fullfunction