Skip to content

Fix NodeExpandVolume#493

Open
ArielYehezkely wants to merge 3 commits into
Azure:mainfrom
ArielYehezkely:fixVolumeExpansion
Open

Fix NodeExpandVolume#493
ArielYehezkely wants to merge 3 commits into
Azure:mainfrom
ArielYehezkely:fixVolumeExpansion

Conversation

@ArielYehezkely

Copy link
Copy Markdown
  • Fix NodeExpandVolume to continue to actually call lvextend when the requested capacity is larger than the current PV capacity to enable exapnsion.
  • Advertise PluginCapability_VolumeExpansion_ONLINE so the external-resizer sidecar drives NodeExpandVolume per CSI spec.
  • After a successful NodeExpandVolume, patch pv.Spec.CSI.VolumeAttributes[CapacityParam] with the new size via a client.MergeFrom patch so failover re-provisioning recreates the LV at the expanded size.
  • Add unit tests and e2e tests for expanding PV.

@jmclong

jmclong commented May 27, 2026

Copy link
Copy Markdown
Contributor

Thanks for adding so many tests and thanks for thinking too of the case where we need to rehydrate the expanded PV.

Can you also change controllerExpansion to true in test/external/drivers/lvm.yaml? That will enable a few more test that will do volume expansion in external e2e.

controllerExpansion: false

Comment thread test/e2e/lvm_expansion_test.go Outdated
@ArielYehezkely ArielYehezkely changed the title Fix NodeExpandVolume and refresh PV capacity Fix NodeExpandVolume May 27, 2026
Comment thread internal/csi/node/node.go Fixed
@ArielYehezkely

ArielYehezkely commented May 27, 2026

Copy link
Copy Markdown
Author

@jmclong The E2E tests revealed a problem that might require a wider change - the capacityParam is an immutable field which cannot be patched, this mean that re-provisioning after expansion will create the PV with the original size from creation time, I don't know if this is a blocker for the current change.
You might consider using annotations instead which are mutable but I'm leaving this to another PR.

Comment thread internal/csi/core/lvm/node.go
Comment thread internal/csi/core/lvm/node.go Outdated
Comment thread internal/csi/core/lvm/node.go Outdated
Comment thread internal/csi/node/node.go
}
pv.Annotations[lvm.ExpandedCapacityParam] = fmt.Sprint(resp.GetCapacityBytes())
if err := ns.k8sClient.Patch(ctx, &pv, client.MergeFrom(original)); err != nil {
log.Error(err, "failed to patch PV with expanded capacity annotation")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we return an error here since we know the PV exists and we would otherwise rehydrate with the wrong capacity.

Comment thread internal/csi/node/node.go
@@ -587,7 +603,21 @@ func (ns *Server) NodeExpandVolume(ctx context.Context, req *csi.NodeExpandVolum
return nil, status.Error(codes.Internal, err.Error())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are swallowing the errors that we create in the volume code, resulting in always returning internal. We can return the correct error type from resp. Otherwise I think we can get in a loop where nodeexpandvolume will keep being retried in some situations

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants