Pipelines

A Pipeline is a series of Tools with connected inputs and outputs configured to execute in a specific order.

Pipeline Status

Pipelines can only be edited when they are in "Draft" or "Release Candidate" status. A pipeline can only be moved to "Released" status when all the Tools in the pipeline are also in "Released" status.

| Status | Description |
| --- | --- |
| Draft | Fully editable draft. |
| Release Candidate | The pipeline is ready for release. Editing is locked but the pipeline can be cloned (top right in the details view) to create a new version. |
| Released | The pipeline is released. A pipeline cannot be released if it contains unreleased tools. Editing is locked but the pipeline can be cloned (top right in the details view) to create a new version. |
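The release rule above can be illustrated with a minimal sketch. The `Pipeline` and `Tool` classes and the helper names are hypothetical and exist only for this illustration; they are not part of any ICA API. The sketch models just the one rule stated in the text: a pipeline can move to "Released" only if every tool it contains is already "Released".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Status(Enum):
    DRAFT = "Draft"
    RELEASE_CANDIDATE = "Release Candidate"
    RELEASED = "Released"


@dataclass
class Tool:  # hypothetical model of a tool, for illustration only
    name: str
    status: Status


@dataclass
class Pipeline:  # hypothetical model of a pipeline, for illustration only
    code: str
    status: Status = Status.DRAFT
    tools: List[Tool] = field(default_factory=list)


def unreleased_tools(pipeline: Pipeline) -> List[str]:
    """Return the names of tools that would block releasing the pipeline."""
    return [t.name for t in pipeline.tools if t.status is not Status.RELEASED]


def can_release(pipeline: Pipeline) -> bool:
    """A pipeline may move to Released only if every tool is Released."""
    return not unreleased_tools(pipeline)


pipeline = Pipeline(
    code="demo-pipeline",
    tools=[Tool("aligner", Status.RELEASED), Tool("caller", Status.DRAFT)],
)
print(can_release(pipeline))       # False
print(unreleased_tools(pipeline))  # ['caller']
```

Listing the blocking tools, rather than returning only a boolean, makes it easier to see which tools still need to be released first.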

Create a Pipeline

Pipelines are created and stored within projects.

  1. Select a project.

  2. From the project menu, select Flow > Pipelines.

  3. Select CWL or Nextflow to create a new Pipeline.

  4. Configure pipeline settings in the pipeline properties tabs.

  5. When creating a graphical CWL pipeline, drag connectors to link tools to input and output files in the canvas. Required tool inputs are indicated by a yellow connector.

  6. Select Save.

❗️ A pipeline uses the tool definitions as they were when the pipeline was last saved. Later tool changes do not automatically propagate to the pipeline. To update the pipeline with the latest tool changes, edit the pipeline definition by removing the tool and then re-adding it to the pipeline.

Pipeline Properties

The following sections describe the pipeline properties that can be configured in each tab of the pipeline editor.

Graphical vs Code definition

The tabs displayed depend on whether the pipeline uses a graphical or a code definition. For CWL, you can choose either way of defining the pipeline. Nextflow is always defined in code mode.

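| Tab | Graphical (CWL) | Code (CWL) | Code (Nextflow) |
| --- | --- | --- | --- |
| Information | x | x | x |
| Documentation | x | x | x |
| Definition | x | | |
| XML Configuration | | x | x |
| Analysis Report | x | | |
| Metadata Model | x | x | x |
| workflow.cwl | | x | |
| nextflow.config | | | x |
| main.nf | | | x |
| New File | | x | x |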

Any additional source files related to your pipeline will be displayed here in alphabetical order.

See the following pages for language-specific details for defining pipelines:

Information

The Information tab provides options for configuring basic information about the pipeline. Small differences exist between the Nextflow and CWL versions.

| Field | Entry |
| --- | --- |
| Code | The name of the pipeline. |
| Categories | One or more tags to categorize the pipeline. Select from existing tags or type a new tag name in the field. |
| Description | A short description of the pipeline. |
| Proprietary | Hide the pipeline scripts and details from users who do not belong to the tenant that owns the pipeline. This also prevents cloning the pipeline. |
| Status | The release status of the pipeline. |
| Storage size | User-selectable storage size for running the pipeline. This must be large enough to run the pipeline, but setting it too large incurs unnecessary costs. |
| Family | A group of pipeline versions. To specify a family, select Change, and then select a pipeline or pipeline family. To change the order of the pipeline, select Up or Down. The first pipeline listed is the default and the remainder of the pipelines are listed as Other versions. The current pipeline appears in the list as this pipeline. |
| Version comment | A description of changes in the updated version. |
| Links | External reference links. |

The following information becomes visible when viewing the pipeline.

| Field | Entry |
| --- | --- |
| ID | Unique identifier of the pipeline. |
| URN | The identifier of the pipeline in Uniform Resource Name (URN) format. |
| Nextflow Version | User-selectable Nextflow version. Available only for Nextflow pipelines. |

In addition, the clone function will be shown (top-right). When cloning a pipeline, you become the owner of the cloned pipeline.

Documentation

The Documentation tab provides options for configuring the HTML description for the tool. The description appears in the tool repository but is excluded from exported CWL definitions. If no documentation has been provided, this tab will be empty.

Definition (Graphical)

When using graphical mode for the pipeline definition, the Definition tab provides options for configuring the pipeline using a visualization panel and a list of component menus.

| Menu | Description |
| --- | --- |
| Machine profiles | Compute types available to use with Tools in the pipeline. |
| Shared settings | Settings for pipelines used in more than one tool. |
| Reference files | Descriptions of reference files used in the pipeline. |
| Input files | Descriptions of input files used in the pipeline. |
| Output files | Descriptions of output files used in the pipeline. |
| Tool | Details about the tool selected in the visualization panel. |
| Tool repository | A list of tools available to be used in the pipeline. |

❗️ In graphical mode, you can drag and drop inputs into the visualization panel to connect them to the tools. Make sure to connect the input icons to the tool before editing the input details in the component menu. Required tool inputs are indicated by a yellow connector.

XML Configuration (code)

This page is used to specify all relevant information about the pipeline parameters.

Analysis Report (Graphical)

The Analysis Report tab provides options for configuring pipeline execution reports. The report is composed of widgets added to the tab.

Configure Pipeline Analysis Report (Graphical CWL Only)

The pipeline analysis report appears in the pipeline execution results. The report is configured from widgets added to the Analysis Report tab in the pipeline editor.

  1. [Optional] Import widgets from another pipeline.

    1. Select Import from other pipeline.

    2. Select the pipeline that contains the report you want to copy.

    3. Select an import option: Replace current report or Append to current report.

    4. Select Import.

  2. From the Analysis Report tab, select Add widget, and then select a widget type.

  3. Configure widget details.

    | Widget | Settings |
    | --- | --- |
    | Title | Add and format title text. |
    | Analysis details | Add heading text and select the analysis metadata details to display. |
    | Free text | Add formatted free text. The widget includes options for placeholder variables that display the corresponding project values. |
    | Inline viewer | Add options to view the content of an analysis output file. |
    | Analysis comments | Add comments that can be edited after an analysis has been performed. |
    | Input details | Add heading text and select the input details to display. The widget includes an option to group details by input name. |
    | Project details | Add heading text and select the project details to display. |
    | Page break | Add a page break widget where page breaks should appear between report sections. |

  4. Select Save.

Free Text Placeholders

| Placeholder | Description |
| --- | --- |
| [[BB_PROJECT_NAME]] | The project name. |
| [[BB_PROJECT_OWNER]] | The project owner. |
| [[BB_PROJECT_DESCRIPTION]] | The project short description. |
| [[BB_PROJECT_INFORMATION]] | The project information. |
| [[BB_PROJECT_LOCATION]] | The project location. |
| [[BB_PROJECT_BILLING_MODE]] | The project billing mode. |
| [[BB_PROJECT_DATA_SHARING]] | The project data sharing settings. |
| [[BB_REFERENCE]] | The analysis reference. |
| [[BB_USERREFERENCE]] | The user analysis reference. |
| [[BB_PIPELINE]] | The name of the pipeline. |
| [[BB_USER_OPTIONS]] | The analysis user options. |
| [[BB_TECH_OPTIONS]] | The analysis technical options. Technical options include the TECH suffix and are not visible to end users. |
| [[BB_ALL_OPTIONS]] | All analysis options. Technical options include the TECH suffix and are not visible to end users. |
| [[BB_SAMPLE]] | The sample. |
| [[BB_REQUEST_DATE]] | The analysis request date. |
| [[BB_START_DATE]] | The analysis start date. |
| [[BB_DURATION]] | The analysis duration. |
| [[BB_REQUESTOR]] | The user requesting analysis execution. |
| [[BB_RUNSTATUS]] | The status of the analysis. |
| [[BB_ENTITLEMENTDETAIL]] | The used entitlement detail. |
| [[BB_METADATA:path]] | The value or list of values of a metadata field or multi-value fields. |
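To make the placeholder behavior easier to picture, the following is a minimal sketch of how `[[BB_...]]` placeholders in a Free text widget could be substituted. It is not ICA's implementation. The function name `render_free_text`, the sample values, and the metadata path `sample.type` are assumptions made for this example, and leaving unknown placeholders unchanged is a choice of the sketch rather than documented behavior.

```python
import re
from typing import Mapping

# Matches [[BB_NAME]] as well as [[BB_METADATA:path]] style placeholders.
PLACEHOLDER = re.compile(r"\[\[(BB_[A-Z_]+)(?::([^\]]+))?\]\]")


def render_free_text(template: str, values: Mapping[str, str]) -> str:
    """Replace known placeholders; leave unknown ones untouched (assumption)."""

    def substitute(match):
        name, path = match.group(1), match.group(2)
        key = f"{name}:{path}" if path else name
        return values.get(key, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical values, for illustration only.
values = {
    "BB_PROJECT_NAME": "Demo Project",
    "BB_PIPELINE": "demo-pipeline",
    "BB_METADATA:sample.type": "tumor",
}
template = (
    "Report for [[BB_PROJECT_NAME]] using [[BB_PIPELINE]] "
    "([[BB_METADATA:sample.type]]); owner: [[BB_PROJECT_OWNER]]"
)
print(render_free_text(template, values))
# Report for Demo Project using demo-pipeline (tumor); owner: [[BB_PROJECT_OWNER]]
```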

Metadata Model

See Metadata Models

workflow.cwl (code)

The Common Workflow Language main script.

nextflow.config (code)

The Nextflow configuration settings.

main.nf (code)

The Nextflow project main script.

+ New File (code)

Multiple files can be added to make pipelines more modular and manageable.

Compute Nodes

For each process defined by the workflow, ICA will launch a compute node to execute the process.

  • For each compute type, you can select the standard (default) or economy tier, which correspond to AWS on-demand and spot instance types, respectively.

  • Compute nodes will have no scratch space by default. You can add scratch space in your pipeline design.

    • For Nextflow, for example, pod annotation: 'volumes.illumina.com/scratchSize', value: '1TiB' reserves 1 TiB.

    • For CWL, for example, a hints entry with class: ResourceRequirement and tmpdirMin: 5000 reserves approximately 5 GiB.

  • The type of a compute node can be specified. See the Compute Types table below.

  • When selecting an instance type for running analyses on ICA, particularly within the 'fpga' category, we recommend opting for the "medium" flavor. Through comprehensive testing and feedback, we've observed that the performance gains achieved with the "large" flavor, while present, do not proportionately justify the associated increase in costs for most use cases.

  • The default type of a compute node, if a type is not specified, is standard-small.
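The two scratch space settings above can be sketched as small helpers that build the corresponding fragments. The function names are hypothetical. The annotation key and the `ResourceRequirement` fields are taken from the text. Note that CWL expresses `tmpdirMin` in mebibytes, so the value 5000 from the example is roughly 5 GiB.

```python
def nextflow_scratch_directive(size: str) -> str:
    """Build the pod annotation directive quoted in the text, e.g. size='1TiB'."""
    return f"pod annotation: 'volumes.illumina.com/scratchSize', value: '{size}'"


def cwl_scratch_hint(tmpdir_mib: int) -> dict:
    """Build a CWL hints entry; tmpdirMin is expressed in mebibytes."""
    if tmpdir_mib <= 0:
        raise ValueError("tmpdir_mib must be positive")
    return {"hints": [{"class": "ResourceRequirement", "tmpdirMin": tmpdir_mib}]}


print(nextflow_scratch_directive("1TiB"))
print(cwl_scratch_hint(5000))
```

The first helper returns the text of a Nextflow process directive. The second returns a dictionary that mirrors the structure of the `hints` section in a CWL document.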

Compute Types

Daemon sets and system processes consume approximately 1 CPU and 2 GB of memory from the base values shown in the table. Consumption varies with the activity of the pod.

| Compute Type | CPUs | Mem (GB) | Nextflow (pod.value) | CWL (type, size) |
| --- | --- | --- | --- | --- |
| standard-small | 2 | 8 | standard-small | standard, small |
| standard-medium | 4 | 16 | standard-medium | standard, medium |
| standard-large | 8 | 32 | standard-large | standard, large |
| standard-xlarge | 16 | 64 | standard-xlarge | standard, xlarge |
| standard-2xlarge | 32 | 128 | standard-2xlarge | standard, 2xlarge |
| hicpu-small | 16 | 32 | hicpu-small | hicpu, small |
| hicpu-medium | 36 | 72 | hicpu-medium | hicpu, medium |
| hicpu-large | 72 | 144 | hicpu-large | hicpu, large |
| himem-small | 8 | 64 | himem-small | himem, small |
| himem-medium | 16 | 128 | himem-medium | himem, medium |
| himem-large | 48 | 384 | himem-large | himem, large |
| himem-xlarge | 96 | 768 | himem-xlarge | himem, xlarge |
| hiio-small | 2 | 16 | hiio-small | hiio, small |
| hiio-medium | 4 | 32 | hiio-medium | hiio, medium |
| fpga-small (see the note below the table) | 8 | 122 | fpga-small | fpga, small |
| fpga-medium | 16 | 244 | fpga-medium | fpga, medium |
| fpga-large | 64 | 976 | fpga-large | fpga, large |

❗️ The compute type "fpga-small" is no longer available. The type "fpga-medium" is generally preferred over "fpga-large".
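As a rough aid for choosing a compute type, the sketch below picks the smallest type in a family that still meets a CPU and memory requirement after subtracting the approximate overhead of 1 CPU and 2 GB mentioned above. The helper names and the selection rule are assumptions made for this illustration, not an ICA feature. The values come from the Compute Types table, with fpga-small omitted because it is no longer available.

```python
from typing import NamedTuple, Optional, Tuple


class ComputeType(NamedTuple):
    name: str
    cpus: int
    mem_gb: int


# Values copied from the Compute Types table (fpga-small omitted).
COMPUTE_TYPES = [
    ComputeType("standard-small", 2, 8),
    ComputeType("standard-medium", 4, 16),
    ComputeType("standard-large", 8, 32),
    ComputeType("standard-xlarge", 16, 64),
    ComputeType("standard-2xlarge", 32, 128),
    ComputeType("hicpu-small", 16, 32),
    ComputeType("hicpu-medium", 36, 72),
    ComputeType("hicpu-large", 72, 144),
    ComputeType("himem-small", 8, 64),
    ComputeType("himem-medium", 16, 128),
    ComputeType("himem-large", 48, 384),
    ComputeType("himem-xlarge", 96, 768),
    ComputeType("hiio-small", 2, 16),
    ComputeType("hiio-medium", 4, 32),
    ComputeType("fpga-medium", 16, 244),
    ComputeType("fpga-large", 64, 976),
]

# Approximate overhead of daemon sets and system processes (from the text).
OVERHEAD_CPUS = 1
OVERHEAD_MEM_GB = 2


def pick_compute_type(
    cpus_needed: int, mem_gb_needed: int, family: str = "standard"
) -> Optional[ComputeType]:
    """Return the smallest type in the family that fits, or None."""
    candidates = [
        t
        for t in COMPUTE_TYPES
        if t.name.startswith(family + "-")
        and t.cpus - OVERHEAD_CPUS >= cpus_needed
        and t.mem_gb - OVERHEAD_MEM_GB >= mem_gb_needed
    ]
    return min(candidates, key=lambda t: (t.cpus, t.mem_gb), default=None)


def as_cwl(compute_type: ComputeType) -> Tuple[str, str]:
    """Split a name such as 'standard-large' into the CWL (type, size) pair."""
    family, size = compute_type.name.split("-", 1)
    return family, size


choice = pick_compute_type(4, 16)
print(choice.name)     # standard-large
print(as_cwl(choice))  # ('standard', 'large')
```

In this example a process that needs 4 CPUs and 16 GB does not fit on standard-medium once the overhead is subtracted, so the sketch moves up to standard-large.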

Start a New Analysis

Use the following instructions to start a new analysis for a single pipeline.

  1. Select a project.

  2. From the project menu, select Flow > Pipelines.

  3. Select the pipeline to run.

  4. Select Start a New Analysis.

  5. Configure analysis settings. See Analysis Properties.

  6. Select Start Analysis.

  7. View the analysis status on the Analyses page.

    • Requested—The analysis is scheduled to begin.

    • Awaiting Input—The input file download is in progress.

    • In Progress—The analysis is in progress.

    • Succeeded—The analysis is complete.

    • Failed and Failed Final—The analysis has failed or was aborted.

  8. To end an analysis, select Abort.

  9. To perform a completed analysis again, select Re-run.
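The status list in step 7 can be sketched as a simple polling loop. The `get_status` callable is a stand-in for however the status is obtained (for example, by checking the Analyses page), and it is hypothetical. Treating both Failed and Failed Final as finished is also an assumption of this sketch, since the text does not say whether a Failed analysis can still change state.

```python
import time
from typing import Callable

# Statuses and descriptions as listed on the Analyses page.
STATUS_DESCRIPTIONS = {
    "Requested": "The analysis is scheduled to begin.",
    "Awaiting Input": "The input file download is in progress.",
    "In Progress": "The analysis is in progress.",
    "Succeeded": "The analysis is complete.",
    "Failed": "The analysis has failed or was aborted.",
    "Failed Final": "The analysis has failed or was aborted.",
}

# Assumption: these statuses mean that no further progress is expected.
FINISHED = {"Succeeded", "Failed", "Failed Final"}


def wait_for_analysis(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 100,
) -> str:
    """Poll get_status until a finished status is seen, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status not in STATUS_DESCRIPTIONS:
            raise ValueError(f"Unknown analysis status: {status!r}")
        if status in FINISHED:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("Analysis did not finish within the polling limit")


# Simulated status sequence, for illustration only.
statuses = iter(["Requested", "Awaiting Input", "In Progress", "Succeeded"])
print(wait_for_analysis(lambda: next(statuses), poll_seconds=0))  # Succeeded
```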

Alternatively, you can start a new analysis from Projects > <Your_Project> > Flow > Analyses:

  1. Select New Analysis.

  2. Select Pipeline.

  3. Configure analysis settings. See Analysis Properties.

  4. Select Start Analysis.

Analysis Properties

The following sections describe the analysis properties that can be configured in each tab.

Analysis

The Analysis tab provides options for configuring basic information about the analysis.

| Field | Entry |
| --- | --- |
| User Reference | The unique analysis name. |
| User tags | One or more tags used to filter the analysis list. Select from existing tags or type a new tag name in the field. |
| Entitlement Bundle | Select a subscription to charge the analysis to. |
| Input Files | Select the input files to use in the analysis. (max. 50,000) |
| Settings | Provide input settings. |
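The constraints in the table above (a unique User Reference and at most 50,000 input files) can be checked before an analysis is started. The following is a minimal sketch of such a pre-check. The function name, its arguments, and the idea of checking uniqueness against a known list of references are assumptions made for illustration and do not describe ICA's own validation.

```python
from typing import Iterable, List, Sequence

MAX_INPUT_FILES = 50_000  # limit stated for the Input Files field


def validate_analysis(
    user_reference: str,
    input_files: Sequence[str],
    existing_references: Iterable[str] = (),
) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if not user_reference.strip():
        problems.append("User Reference is empty.")
    elif user_reference in set(existing_references):
        problems.append(f"User Reference {user_reference!r} is already in use.")
    if len(input_files) > MAX_INPUT_FILES:
        problems.append(
            f"{len(input_files)} input files selected; the maximum is {MAX_INPUT_FILES}."
        )
    return problems


print(validate_analysis("run-002", ["sample_R1.fastq.gz"], ["run-001"]))  # []
print(validate_analysis("run-001", ["sample_R1.fastq.gz"], ["run-001"]))
```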

View Analysis Results

You can view analysis results on the Analyses page or in the output_folder on the Data page.

  1. Select a project, and then select the Flow > Analyses page.

  2. Select an analysis.

  3. On the Result tab, select an output file.

  4. To preview the file, select the View tab.

  5. Add or remove any user or technical tags, and then select Save.

  6. To download, select Schedule for Download.

  7. View additional analysis result information on the following tabs:

    • Details—View information on the pipeline configuration.

    • Logs—Download information on the pipeline process.
