Bench Clusters

Managing a Bench cluster

Introduction

A workspace can have its own dedicated cluster consisting of a number of nodes. The workspace node, which is used for interacting with the cluster, is started first. Once the workspace node is running, the workspace cluster can be started.

The cluster consists of two components:

  • The manager node which orchestrates the workload across the members.

  • Between 0 and 50 member nodes.

Clusters can run in two modes:

  • Static - A static cluster has a manager node and a static number of members. At start-up of the cluster, the system ensures the predefined number of members are added to the cluster. These nodes will keep running as long as the entire cluster runs. The system will not automatically remove or add nodes depending on the job load. This gives the fastest resource availability, but at additional cost as unused nodes stay active, waiting for work.

  • Dynamic - A dynamic cluster has a manager node and a dynamic number of members up to a predefined maximum (with a hard limit of 50). Based on the job load, the system scales the number of members up or down. This saves resources, as only as many member nodes as needed to perform the work are used.
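The difference between the two modes can be sketched as follows. This is an illustrative model only: the function names and the scaling policy (one member per pending job, clamped to the configured bounds) are assumptions for the sketch, not the platform's actual algorithm.

```python
# Hypothetical sketch of static vs. dynamic cluster sizing.
# The policy below is an illustrative assumption, not ICA's actual scaler.

HARD_LIMIT = 50  # platform-wide maximum number of member nodes

def target_members(pending_jobs: int, min_members: int, max_members: int) -> int:
    """Member count a dynamic cluster would aim for under this toy policy."""
    upper = min(max_members, HARD_LIMIT)
    return max(min_members, min(pending_jobs, upper))

def static_members(configured: int) -> int:
    """A static cluster always keeps its predefined count (capped at 50)."""
    return min(configured, HARD_LIMIT)
```

For example, with a dynamic cluster configured for 2-10 members, 7 pending jobs would target 7 members, while an idle cluster would scale down to the minimum of 2.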

Configuration

You manage Bench Clusters via the Illumina Connected Analytics UI in Projects > your_project > Bench > Workspaces > your_workspace > Details.

The following settings can be defined for a bench cluster:

Field
Description

Separate Docker image for cluster manager and members.

When set to true, you can select different Docker images for the cluster manager (which does the orchestration) and the cluster members (which do the work).

Docker image

Available when separate Docker image for cluster manager and members is selected. Here, you select the Docker image of your cluster manager.

Web access

Enable or disable web access to the cluster manager.

Dedicated Cluster Manager

Use a dedicated node for the cluster manager. This reserves an entire machine (based on the selected resource model) for the cluster manager. If no dedicated cluster manager is selected, one core per cluster member is reserved for scheduling. For example, if you have 2 nodes of standard-medium (4 cores) and no dedicated cluster manager, then only 6 (2×3) cores are available to run tasks, because each node reserves 1 core for cluster management.
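The core-reservation arithmetic above can be captured in a small helper. The function name is illustrative, not part of any ICA API; the reservation rule (one core per member when no dedicated manager is used) is taken from the description above.

```python
# Cores available for tasks, per the reservation rule described above:
# without a dedicated cluster manager, each member node reserves one core
# for cluster management. Helper name is illustrative, not an ICA API.

def available_cores(nodes: int, cores_per_node: int, dedicated_manager: bool) -> int:
    if dedicated_manager:
        # The manager runs on its own machine; members keep all their cores.
        return nodes * cores_per_node
    return nodes * (cores_per_node - 1)

# The example from the table: 2 standard-medium nodes (4 cores each),
# no dedicated manager -> 2 x 3 = 6 cores for tasks.
print(available_cores(2, 4, dedicated_manager=False))  # 6
```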

Resource model

Available when dedicated cluster manager is selected. This is the resource model on which the cluster manager will run.

Include ephemeral storage

Available when dedicated cluster manager is selected. Select this to create scratch space for your nodes. Enabling it makes the storage size selector appear. Data stored in this space is deleted when the instance is terminated. When you deselect this option, the storage size is 0.

Storage size

Available when ephemeral storage is selected. How much storage space (1 GB–16 TB) to reserve per node as dedicated scratch space, available at /scratch.

Docker image

Available when separate Docker image for cluster manager and members is selected. Here, you select the Docker image of your cluster members.

Type

Choose between Static and Dynamic.

Number of nodes / Scaling interval

For static, set the number of cluster member nodes (maximum 50). For dynamic, choose the minimum and maximum number of cluster member nodes (up to 50).

Resource model

The type of machine on which cluster members run. For each cluster member, one machine of this type is provisioned. Consider the cost impact when running many machines with a high individual cost.

Economy mode

Economy mode uses AWS Spot Instances. This halves many compute iCredit rates compared to standard mode, but instances can be interrupted. See Pricing for a list of supported resource models.

Include ephemeral storage

Select this to create scratch space for your nodes. Enabling it makes the storage size selector appear. Data stored in this space is deleted when the instance is terminated. When you deselect this option, the storage size is 0.

Storage size

Available when ephemeral storage is selected. How much storage space (1 GB–16 TB) to reserve per node as dedicated scratch space, available at /scratch.

Operations

Once the workspace is started, the cluster can be started and stopped at Projects > your_project > Bench > Workspaces > your_workspace > Details. A cluster can be stopped without stopping the workspace, but stopping the workspace also stops all clusters in that workspace.

Managing Data in a Bench cluster

Data in a bench workspace can be divided into three groups:

  • Workspace data is accessible in read/write mode from all workspace components (workspace node, cluster manager node, cluster member nodes) at /data. The size of the workspace data is defined at workspace creation but can be increased when editing a workspace in the Illumina Connected Analytics UI. This is persistent storage; data remains when the workspace is shut down.

  • Project data can be accessed from all workspace components at /data/project. Every component has its own dedicated mount to the project. Depending on the project data permissions, you can access it in either read-only or read-write mode.

  • Scratch data is available on the cluster members at /scratch and can be used to store intermediate results for a job running on that member. This is temporary storage; all data is deleted when a cluster member is removed from the cluster.
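The three storage areas can be summarized in a short sketch. The mount points come from the list above; the helper function and its name are my own illustration, not part of any ICA SDK, and the `base` parameter exists only so the layout can be exercised outside a cluster.

```python
from pathlib import Path

# Mount points as described above; helper is illustrative, not an ICA API.
WORKSPACE_DATA = Path("/data")        # persistent, read/write, all components
PROJECT_DATA = Path("/data/project")  # project mount, RO or RW per permissions
SCRATCH = Path("/scratch")            # per-member, deleted with the member

def scratch_file(job_id: str, name: str, base: Path = SCRATCH) -> Path:
    """Path for an intermediate result of a job on this member's scratch space."""
    return base / job_id / name

print(scratch_file("job-42", "partial.parquet"))  # /scratch/job-42/partial.parquet
```

Because scratch space disappears with the member node, anything worth keeping should be copied to /data (workspace data) or /data/project before the job finishes.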

Fast Read-Only Access

All mounts occur in /data/mounts/, see data access and workspace-ctl data.

These mounts are managed via the workspace CLI /data/.local/bin/workspace-ctl in the workspace. Every node has its own dedicated mount.

For fast data access, Bench offers a mount solution that exposes project data on every component in the workspace. This mount provides read-only access to a given location in the project data and is optimized for high read throughput per single file with concurrent access to files. It will try to utilize the full bandwidth capacity of the node.



Creating a mount

For fast read-only access, link folders with the CLI command workspace-ctl data create-mount --mode read-only.

Note: omitting the --mode option has the same effect, because read-only is the default mode for workspace-ctl data create-mount.

Removing a mount
