CloudStack 4.22: prepareHostForMaintenance throws NPE when stale destroyed volume references removed storage pool #13124

@andrijapanicsb

problem

Description

While attempting to place a KVM host into Maintenance mode in CloudStack 4.22, the maintenance operation failed with a NullPointerException.

The issue appears to occur when a VM has a stale/destroyed volume entry in the volumes table that still references a removed storage pool.

Instead of gracefully ignoring the stale volume metadata or returning a user-facing validation error, CloudStack fails the maintenance preparation with an unhandled exception.

Environment

  • CloudStack version: 4.22
  • Hypervisor: KVM
  • Primary storage: NFS
  • Database: MySQL/MariaDB

Steps to reproduce

  1. Have a VM with:

    • one valid active ROOT volume
    • one stale/destroyed ROOT volume entry still present in the volumes table
  2. The stale volume references a storage pool which has already been removed.

  3. Attempt to place the host running the VM into Maintenance mode.

Example problematic DB state:

VM volumes

SELECT id, uuid, name, instance_id, pool_id, state, removed
FROM volumes
WHERE instance_id = 446
ORDER BY id;

Result:

+------+--------------------------------------+----------+-------------+---------+---------+---------+
| id   | uuid                                 | name     | instance_id | pool_id | state   | removed |
+------+--------------------------------------+----------+-------------+---------+---------+---------+
|  928 | 6dea3d6f-bd6d-4e8b-9524-6e99c029694c | ROOT-446 |         446 |       4 | Destroy | NULL    |
| 1554 | 80acf9ae-b047-41b8-bded-cdceb6de7051 | ROOT-446 |         446 |       2 | Ready   | NULL    |
+------+--------------------------------------+----------+-------------+---------+---------+---------+

Storage pool state

The stale volume references storage pool ID 4, which has already been removed:

storage_pool_name: Export-Domain
storage_pool_removed: 2026-04-23 13:44:08

Actual result

Host maintenance fails with:

java.lang.NullPointerException: Cannot invoke
"org.apache.cloudstack.storage.datastore.db.StoragePoolVO.isLocal()"
because "storagePool" is null

Relevant stack trace:

at com.cloud.vm.UserVmManagerImpl.isAnyVmVolumeUsingLocalStorage(UserVmManagerImpl.java:7558)
at com.cloud.vm.UserVmManagerImpl.isVMUsingLocalStorage(UserVmManagerImpl.java:7121)
at com.cloud.resource.ResourceManagerImpl.doMaintain(ResourceManagerImpl.java:1553)
at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1653)
at org.apache.cloudstack.api.command.admin.host.PrepareForHostMaintenanceCmd.execute(PrepareForHostMaintenanceCmd.java:99)

Expected result

CloudStack should not throw an unhandled NullPointerException.

Possible expected behavior:

  • ignore destroyed/removed stale volumes during maintenance evaluation
  • skip volumes attached to removed pools
  • or return a proper validation error identifying the problematic VM/volume
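
The options above could be sketched roughly as follows. This is an illustration under assumptions, not a proposed patch: the types and method names (`Pool`, `Volume`, `anyVolumeOnLocalStorage`, `validateVolumePools`) are hypothetical, the map stands in for whatever lookup CloudStack uses for non-removed pools, and only the `Destroy` state from this report is treated as stale.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; not CloudStack's real classes.
class SafeLocalStorageCheck {
    static final class Pool {
        final boolean local;
        Pool(boolean local) { this.local = local; }
        boolean isLocal() { return local; }
    }

    static final class Volume {
        final long id;
        final long poolId;
        final String state;
        Volume(long id, long poolId, String state) {
            this.id = id;
            this.poolId = poolId;
            this.state = state;
        }
    }

    // Lenient variant: skip destroyed volumes and volumes whose pool is gone.
    static boolean anyVolumeOnLocalStorage(List<Volume> volumes, Map<Long, Pool> activePools) {
        for (Volume volume : volumes) {
            if ("Destroy".equals(volume.state)) {
                continue; // stale row, irrelevant for maintenance evaluation
            }
            Pool pool = activePools.get(volume.poolId);
            if (pool == null) {
                continue; // pool removed; nothing to evaluate
            }
            if (pool.isLocal()) {
                return true;
            }
        }
        return false;
    }

    // Strict variant: fail with a message that names the VM, volume and pool.
    static void validateVolumePools(long vmId, List<Volume> volumes, Map<Long, Pool> activePools) {
        for (Volume volume : volumes) {
            if (!activePools.containsKey(volume.poolId)) {
                throw new IllegalStateException("VM " + vmId + ": volume " + volume.id
                        + " (state " + volume.state + ") references missing storage pool "
                        + volume.poolId);
            }
        }
    }

    public static void main(String[] args) {
        Map<Long, Pool> activePools = new HashMap<>();
        activePools.put(2L, new Pool(false)); // assumption: pool 2 is shared NFS
        List<Volume> volumes = List.of(
                new Volume(928, 4, "Destroy"),
                new Volume(1554, 2, "Ready"));
        System.out.println("uses local storage: " + anyVolumeOnLocalStorage(volumes, activePools));
        try {
            validateVolumePools(446, volumes, activePools);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The lenient variant lets maintenance proceed despite the stale row, while the strict variant surfaces the inconsistent metadata to the operator; which behavior is preferable is a design decision for the maintainers.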

Workaround

Marking the stale destroyed volume row as removed allowed maintenance to proceed:

UPDATE volumes
SET removed = NOW()
WHERE id = 928
  AND instance_id = 446
  AND state = 'Destroy'
  AND removed IS NULL
  AND pool_id = 4;

Additional notes

The issue appears to be triggered specifically by:

  • stale destroyed volume rows
  • still linked to an active/running VM
  • referencing removed storage pools
  • while evaluating VM local-storage usage during host maintenance

versions

ACS 4.22 on KVM (the hypervisor should not be relevant)
