What hardware do I need?

Minimum and recommended hardware specifications for running CAST Imaging in production

General hardware requirements

These requirements are valid for both Microsoft Windows and Linux via Docker deployments:

  • Physical or virtual machine(s)
  • CPU
    • Minimum 1 CPU / 2 cores, e.g.:
      • Intel Core i5, 2.6 GHz
      • Intel Xeon, 2.2 GHz
    • Recommended 1 CPU / 4 cores, e.g.:
      • Intel Core i7, 2.8 GHz
      • Intel Xeon, 2.6 GHz
  • RAM
    • Single machine as part of an enterprise/distributed deployment:
    • Standalone mode (all components on one machine):
      • 32GB RAM absolute minimum
  • Free disk space: 256GB minimum. Storage with high IOPS values (SSD disks, or SANs configured with SSD) will achieve better performance. See also File storage for more detailed information.
  • Host machines with fixed IP address / hostname recommended
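
The CPU and disk figures above can be checked on a candidate host with a short script. The following is a minimal sketch, not a CAST tool: the function names `check_host` and `probe_local_host` are hypothetical, and RAM is left out because the Python standard library has no portable call for it. Note that `os.cpu_count()` reports logical cores, which may be higher than the physical core count.

```python
import os
import shutil

# Thresholds taken from the list above.
MIN_CORES = 2
RECOMMENDED_CORES = 4
MIN_FREE_DISK_GB = 256


def check_host(cores, free_disk_gb):
    """Return a list of findings; an empty list means no issue was detected."""
    findings = []
    if cores < MIN_CORES:
        findings.append(f"CPU: {cores} core(s) is below the minimum of {MIN_CORES}")
    elif cores < RECOMMENDED_CORES:
        findings.append(
            f"CPU: {cores} core(s) meets the minimum but not the recommended {RECOMMENDED_CORES}"
        )
    if free_disk_gb < MIN_FREE_DISK_GB:
        findings.append(
            f"Disk: {free_disk_gb:.0f} GB free is below the minimum of {MIN_FREE_DISK_GB} GB"
        )
    return findings


def probe_local_host(path="."):
    """Read the logical core count and the free disk space (in GB) for `path`."""
    cores = os.cpu_count() or 0
    free_disk_gb = shutil.disk_usage(path).free / 1024**3
    return cores, free_disk_gb


if __name__ == "__main__":
    results = check_host(*probe_local_host())
    for line in results or ["CPU and disk meet the listed minimums"]:
        print(line)
```

Pass the path of the volume that will hold the installation to `probe_local_host` so that the free space is measured on the right disk.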

TCP ports

The following TCP ports are required:

imaging-services

  • 2381: CAST Imaging Control Panel
  • 8090: CAST Imaging Gateway Service
  • 8091: CAST Imaging Console Service
  • 8092: CAST Imaging Authentication service
  • 8096: CAST Imaging SSO Service
  • 8098: CAST Imaging Control Panel
  • 9002: CAST Imaging SSO Service

analysis-node(s)

  • 8099: CAST Imaging Analysis Node

imaging-viewer

  • 5000: Neo4j (not required in ≥ 3.4.0-funcrel)
  • 6372: Neo4j
  • 7483: Neo4j
  • 7484: Neo4j
  • 7697: Neo4j
  • 8070: CAST Imaging Viewer APIs (only required in ≥ 3.4.0-funcrel)
  • 8093: CAST Imaging Viewer Frontend
  • 8094: CAST Imaging Viewer AI Manager
  • 8284: CAST Imaging Viewer source code
  • 8285: CAST Imaging Viewer login
  • 9010: CAST Imaging Viewer Backend
  • 9011: CAST Imaging Viewer ETL

dashboards

  • 8097: CAST Imaging Dashboard Service

PostgreSQL for Microsoft Windows:

  • 2284

imaging-services

  • 2285: PostgreSQL
  • 2381: Control Panel
  • 8090: Gateway
  • 8091: Console
  • 8092: Authorization
  • 8096: Keycloak
  • 8098: Control Panel

analysis-node(s)

  • 8099: Analysis Node

imaging-viewer

  • 5000: Neo4j (not required in ≥ 3.4.0-funcrel)
  • 7473, 7474, 7687: Neo4j
  • 8070: API service (only required in ≥ 3.4.0-funcrel)
  • 8082: AI
  • 8083: Viewer front-end
  • 8084: Login
  • 9000: Viewer
  • 9001: ETL
  • 9980: Source code

dashboards

  • 8097: Dashboards
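
Once the services are running, a quick way to confirm that the ports are reachable is to attempt a TCP connection to each one. The sketch below is illustrative only: the helper names `is_port_open` and `closed_ports` are hypothetical, and the `COMMON_PORTS` mapping contains just the ports that appear in both lists above, so it should be extended with the entries that apply to your deployment.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


def closed_ports(host, ports, timeout=1.0):
    """Return the entries of `ports` (name -> number) that do not accept connections."""
    return {
        name: port
        for name, port in ports.items()
        if not is_port_open(host, port, timeout)
    }


# Ports listed in both lists above; add the remaining ports for your platform.
COMMON_PORTS = {
    "Gateway": 8090,
    "Console": 8091,
    "Dashboards": 8097,
    "Control Panel": 8098,
    "Analysis Node": 8099,
}

if __name__ == "__main__":
    for name, port in closed_ports("localhost", COMMON_PORTS).items():
        print(f"{name}: port {port} is not reachable")
```

Run it from the machine that needs to reach the service (for example, from an analysis node towards the imaging-services host) so that firewall rules between the machines are exercised as well.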

[Figure: Communication architecture]

Example on-premises hardware configuration

Managing multiple applications

If you are managing a large number of applications, we recommend installing multiple nodes to spread the load. The disk space to allocate to a single node depends on the size and number of applications that the node will analyze. When all nodes are running analyses, some nodes may need to run more than one analysis in parallel. To avoid overloading those nodes, we strongly recommend deploying machines with sufficient resources.

Therefore, to analyze up to 50 applications and to run up to 5 analyses in parallel on one single node with one associated PostgreSQL instance, CAST recommends increasing RAM and DISK resources as follows:

| Component | RAM | Disk |
|---|---|---|
| Node | 32GB min | 2TB (SSD recommended) |
| PostgreSQL | 64GB min | 3TB (SSD recommended) |

Analyzing complex applications

While 90% of JEE, .NET or Mainframe applications can be analyzed with the minimum requirements, some specific (very large or poorly balanced) applications require more memory than the minimum recommendations. The following configurations are examples of the sizing required for very large applications. The requirements are not a linear function of the number of files or lines of code (LoC): the relationship is more complex, and there is no specific formula to use.

In general, a lack of memory will cause slowness (machines will resort to the use of virtual memory) in the best case, and a crash in the worst case. The numbers presented in the table below are purely indicative and depict the varying memory requirements:

| Application | Node CPU | Peak RAM | Node RAM (recommended minimum) | Disk space |
|---|---|---|---|---|
| JEE application with 13,000 java files and 6,300 JSP files | 2 processors, 4 cores | 22 GB | 32 GB | 256 GB |
| JEE application with 21,000 java files, 14 JSP files, 2,800 projects | As above | 10 GB | 16 GB | As above |
| JEE application with 30,000 java files and 1,200 JSP files | As above | 12 GB | 16 GB | As above |
| .NET application with 18,785 C# files | As above | 12 GB | 16 GB | As above |
| .NET application with 23,000 C# files and 4,100 cshtml files | As above | 20 GB | 32 GB | As above |

Kubernetes

Cluster requirements: base configuration (supports 1 analysis-node)

  • Kubernetes version: use the latest available in your Cloud environment
  • Node count: 2 nodes minimum required
  • Per-node CPU: 4 vCPUs (minimum)
  • Per-node RAM: 32 GB (minimum) — 64 GB (recommended) — 128 GB (for multiple complex/large applications)

Cluster scaling

To increase the concurrent analysis capability, the AnalysisNodeReplicaCount variable (values.yaml) can be incremented. For each additional analysis-node you deploy, add 1 cluster node.

Furthermore, to support higher workloads (required for multiple complex/large applications), you can choose to run the analysis node, Neo4j, and/or PostgreSQL pods on dedicated cluster nodes. Doing so requires additional cluster nodes. This is controlled by the BalancedAffinity.Enforce*Isolation options in values.yaml, which govern how pods are distributed across the cluster (note that EnforceBalancedAffinity must be set to true):

  • EnforceAnalysisNodeIsolation: true — each analysis pod will run on a dedicated cluster node
  • EnforceNeo4jIsolation: true — the Neo4j pod will run on a dedicated cluster node
  • EnforcePostgresIsolation: true — the embedded PostgreSQL pod will run on a dedicated cluster node

These isolation options will require additional cluster nodes, for example:

  • Minimum deployment:
    • AnalysisNodeReplicaCount: 1 → 2 cluster nodes minimum
  • Medium to large size deployment:
    • AnalysisNodeReplicaCount: 1 + EnforceAnalysisNodeIsolation: true + EnforceNeo4jIsolation: true → 3 cluster nodes minimum
    • AnalysisNodeReplicaCount: 2 + EnforceAnalysisNodeIsolation: true + EnforceNeo4jIsolation: true → 4 cluster nodes minimum
  • Extra large deployment (multiple complex/large applications):
    • AnalysisNodeReplicaCount: 3 + EnforceAnalysisNodeIsolation: true + EnforceNeo4jIsolation: true + EnforcePostgresIsolation: true → 6 cluster nodes minimum
  • Increase AnalysisNodeReplicaCount to support a higher number of concurrent analyses
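
The node counts in these examples follow a simple pattern, which can be sketched as a small helper for planning purposes. This is one reading fitted to the four examples above and not an official CAST formula. It assumes one cluster node per analysis-node replica, one shared node for the remaining pods, and one extra node each for Neo4j and PostgreSQL when their isolation option is enabled. The function name `min_cluster_nodes` is hypothetical. The examples do not show EnforceAnalysisNodeIsolation changing the count on its own, so it is not a parameter here.

```python
def min_cluster_nodes(replicas, neo4j_isolation=False, postgres_isolation=False):
    """Estimate the minimum number of cluster nodes (assumption-based sketch).

    replicas           -- value of AnalysisNodeReplicaCount
    neo4j_isolation    -- True when EnforceNeo4jIsolation is enabled
    postgres_isolation -- True when EnforcePostgresIsolation is enabled
    """
    if replicas < 1:
        raise ValueError("AnalysisNodeReplicaCount must be at least 1")
    shared_node = 1  # assumed: one node hosts the pods that are not isolated
    return replicas + shared_node + int(neo4j_isolation) + int(postgres_isolation)


if __name__ == "__main__":
    # The four configurations listed above.
    print(min_cluster_nodes(1))
    print(min_cluster_nodes(1, neo4j_isolation=True))
    print(min_cluster_nodes(2, neo4j_isolation=True))
    print(min_cluster_nodes(3, neo4j_isolation=True, postgres_isolation=True))
```

Treat the result as a lower bound to be verified against your cluster: per-node CPU and RAM limits can still force additional nodes.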

Use the Util-ScaleUpAllWithBalancedAffinity.bat script to achieve optimal pod placement across the minimum number of nodes (run Util-ScaleDownAll.bat beforehand).

Cluster topology

All nodes should reside in the same Availability Zone. This also applies to the managed PostgreSQL instance, if you choose to use one.

Storage

On Kubernetes deployments, we use the default Persistent Volume Claim (PVC) sizes as defined in the CAST documentation. See Persistent Volume Claims (PVCs) for the per-component default values.

The storage classes used by default for these PVCs support online volume expansion, so volumes can be extended without downtime when additional space is required.