Migrate to FileStorage (RWX) for Imaging Kubernetes deployments


Applicable when upgrading to ≥ 3.6.0-funcrel

Overview

This guide describes how to convert the pvc-shared-datadir PVC used by the analysis-node from DiskStorage (ReadWriteOnce/RWO) to FileStorage (ReadWriteMany/RWX) on an existing CAST Imaging deployment on Kubernetes. This conversion is required as part of updating to CAST Imaging 3.6.x-funcrel when your deployment currently uses AnalysisNodeFS.enable=false. Once complete, continue with the standard Cloud services via Kubernetes update process.

Applicability

This procedure applies only if all of the following are true:

  • You are upgrading from CAST Imaging < 3.6.0-funcrel to 3.6.0-funcrel or later
  • Your current deployment uses AnalysisNodeFS.enable=false

It applies whether or not you plan to use multiple analysis nodes in the future.

Important note

During this procedure, the analysis-node will be shut down and no analyses will be able to run. imaging-services, dashboards, and imaging-viewer will remain available throughout.

Prerequisites

  • A local machine with the kubectl tool installed and remote access to the Kubernetes cluster running your CAST Imaging instance
  • The examples below assume the namespace is castimaging-v3 - replace this with your actual namespace

Conversion procedure

Step 1 - Prepare the migration

Run helm upgrade with the following values:

prepareMigrationToAnalysisNodeFS: true
AnalysisNodeReplicaCount: 0
AnalysisNodeFS:
  enable: false
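As an illustration, the invocation might look like the following; the release name castimaging and the chart reference are placeholders for your actual deployment, and --reuse-values is one way to keep your other existing settings unchanged:

```shell
# Hypothetical example - substitute your actual release name and chart reference.
# --reuse-values preserves the other settings from the current release.
helm upgrade castimaging <your-chart-reference> \
  -n castimaging-v3 \
  --reuse-values \
  --set prepareMigrationToAnalysisNodeFS=true \
  --set AnalysisNodeReplicaCount=0 \
  --set AnalysisNodeFS.enable=false
```

The same pattern applies to the helm upgrade commands in the later steps, with the values adjusted accordingly.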

This will:

  • Scale down the analysis-node
  • Create a new pvc-shared-datadir-mig volume (FileStorage)
  • Run a job to copy the current volume content to pvc-shared-datadir-mig
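You can confirm that the new volume was created by listing the PVCs; once bound, pvc-shared-datadir-mig should show the RWX access mode:

```shell
# pvc-shared-datadir-mig should show STATUS Bound and ACCESS MODES RWX
kubectl get pvc -n castimaging-v3
```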

Step 2 - Verify the copy job succeeded

Check the status of the save-pvc-data-xxxxx pod - it should show Completed. To find its name:

kubectl get pods -n castimaging-v3

To view the log:

kubectl logs save-pvc-data-xxxxx -n castimaging-v3
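Instead of polling the pod status manually, you can block until the job finishes; the timeout below is an assumption - size it to your data volume:

```shell
# Waits until the copy job reports completion; adjust the timeout to your data size.
kubectl wait --for=condition=complete job/save-pvc-data \
  -n castimaging-v3 --timeout=60m
```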

If the job failed (i.e. the pod does not show Completed), follow these recovery steps:

a) Examine the pod status, log file, and cluster events to identify and fix the issue. One possible cause is a File Storage provisioning problem - check cluster events and provisioner logs.
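Typical diagnostic commands for this step (replace xxxxx with the pod suffix found in Step 2):

```shell
# Inspect the failed pod's events and container statuses
kubectl describe pod save-pvc-data-xxxxx -n castimaging-v3
# Review recent cluster events, most recent last
kubectl get events -n castimaging-v3 --sort-by=.lastTimestamp
```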

b) Once the issue is resolved, delete the failed job, the migration PVC, and the storage class:

kubectl delete job save-pvc-data -n castimaging-v3
kubectl delete pvc pvc-shared-datadir-mig -n castimaging-v3
kubectl delete sc castimaging-fs

c) Re-run helm upgrade with the same values as in Step 1. If the job fails again, repeat from step a) above until it completes successfully.

Step 3 - Delete the old PVC

Once the save-pvc-data-xxxxx job has succeeded and any warnings/failures in the log have been resolved, delete the job and the old PVC:

kubectl delete job save-pvc-data -n castimaging-v3
kubectl delete pvc pvc-shared-datadir -n castimaging-v3

Step 4 - Restart the analysis-node

Run helm upgrade with the following values:

prepareMigrationToAnalysisNodeFS: true
AnalysisNodeReplicaCount: 1
AnalysisNodeFS:
  enable: true

This will restart the analysis-node using the new FileStorage PVC and copy the saved data from pvc-shared-datadir-mig to pvc-shared-datadir via an init-container.
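To follow the restore, you can watch the pod start and read the logs of all its containers, including the init-container (its exact name depends on the chart, so --all-containers avoids having to guess it):

```shell
# Watch the analysis-node pod come up (Init:* phases indicate the data copy)
kubectl get pods -n castimaging-v3 -w
# Once the pod exists, read logs from every container, including the init-container
kubectl logs <analysis-node-pod-name> -n castimaging-v3 --all-containers=true
```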

Step 5 - Global health check

Check pod statuses, open CAST Imaging, and confirm all services are up.
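One quick way to spot unhealthy pods is to filter out the Running ones; an empty result (apart from Succeeded entries, which are finished jobs) means all services are up:

```shell
# Lists only pods that are not in the Running phase
kubectl get pods -n castimaging-v3 --field-selector=status.phase!=Running
```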

Step 6 - Cleanup

Run helm upgrade with the following values:

prepareMigrationToAnalysisNodeFS: false
AnalysisNodeReplicaCount: 1
AnalysisNodeFS:
  enable: true

This will delete the pvc-shared-datadir-mig PVC and perform a final restart of the analysis-node.
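You can verify the cleanup by listing the PVCs again; only the new pvc-shared-datadir should remain:

```shell
# pvc-shared-datadir-mig should no longer be listed
kubectl get pvc -n castimaging-v3
```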


The FileStorage migration is now complete. Continue with the standard Cloud services via Kubernetes update process to finish updating to the new release.

Appendix

Cancelling the FileStorage migration process

When reviewing the save-pvc-data-xxxxx job logs as described in Step 2, if an issue is reported in the log that you cannot immediately resolve and you need more time to decide how to proceed, you can cancel the conversion process and attempt it again later. To do so:

  • Run helm upgrade with the following values (this will bring the analysis node back online and restore the initial state):
prepareMigrationToAnalysisNodeFS: false
AnalysisNodeReplicaCount: 1
AnalysisNodeFS:
  enable: false
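After this rollback, you can confirm that the analysis-node pod is back and that the original volume is untouched:

```shell
kubectl get pods -n castimaging-v3   # analysis-node pod should return to Running
kubectl get pvc -n castimaging-v3    # pvc-shared-datadir should still be Bound
```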