Pipelines
A Pipeline is a series of Tools with connected inputs and outputs configured to execute in a specific order.
Pipelines are created and stored within projects.
Navigate to Projects > your_project > Flow > Pipelines.
Select CWL or Nextflow to create a new Pipeline.
Configure pipeline settings in the pipeline property tabs.
When creating a graphical CWL pipeline, drag connectors to link tools to input and output files in the canvas. Required tool inputs are indicated by a yellow connector.
Select Save.
Pipelines use the tool definition that was the latest when the pipeline was last saved. Tool changes do not automatically propagate to the pipeline. To update the pipeline with the latest tool changes, edit the pipeline definition by removing the tool and adding it back to the pipeline.
Individual Pipeline files are limited to 20 Megabytes. If you need to add more than this, split your content over multiple files.
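As a rough illustration of the size limit above, the following Python sketch lists files in a local pipeline folder that would exceed it. The function name `oversized_files` is hypothetical, and the byte value assumes the conservative decimal reading of "20 Megabytes" (20,000,000 bytes); the source does not state whether the limit is decimal or binary.

```python
from pathlib import Path

# Assumption: 20 Megabytes is read as 20 * 1000 * 1000 bytes (the stricter reading).
MAX_PIPELINE_FILE_BYTES = 20 * 1000 * 1000


def oversized_files(folder, limit=MAX_PIPELINE_FILE_BYTES):
    """Return sorted (relative path, size in bytes) pairs for files larger than `limit`."""
    root = Path(folder)
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file() and p.stat().st_size > limit
    )
```

Any file this reports is a candidate for splitting over multiple files before it is added to the pipeline.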
You can edit pipelines while they are in Draft or Release Candidate status. Once released, pipelines can no longer be edited.
The following sections describe the tool properties that can be configured in each tab of the pipeline editor.
Depending on how you design the pipeline, the displayed tabs differ between the graphical and code definitions. For CWL, you can choose how to define the pipeline. Nextflow pipelines are always defined in code mode.
Any additional source files related to your pipeline will be displayed here in alphabetical order.
See the following pages for language-specific details for defining pipelines:
The details tab provides options for configuring basic information about the pipeline.
The following information becomes visible when viewing the pipeline details.
The clone action is shown at the top right of the pipeline details. Cloning a pipeline allows you to make modifications without impacting the original pipeline. When you clone a pipeline, you become the owner of the cloned pipeline.
When you clone a Nextflow pipeline, a verification of the configured Nextflow version is done to ensure no deprecated versions are used.
The Documentation tab provides options for configuring the HTML description for the tool. The description appears in the tool repository but is excluded from exported CWL definitions. If no documentation has been provided, this tab will be empty.
When using graphical mode for the pipeline definition, the Definition tab provides options for configuring the pipeline using a visualization panel and a list of component menus.
In graphical mode, you can drag and drop inputs into the visualization panel to connect them to the tools. Make sure to connect the input icons to the tool before editing the input details in the component menu. Required tool inputs are indicated by a yellow connector.
This page is used to specify all relevant information about the pipeline parameters.
The Analysis Report tab provides options for configuring pipeline execution reports. The report is composed of widgets added to the tab.
The pipeline analysis report appears in the pipeline execution results. The report is configured from widgets added to the Analysis Report tab in the pipeline editor.
[Optional] Import widgets from another pipeline.
Select Import from other pipeline.
Select the pipeline that contains the report you want to copy.
Select an import option: Replace current report or Append to current report.
Select Import.
From the Analysis Report tab, select Add widget, and then select a widget type.
Configure widget details.
Select Save.
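The difference between the two import options can be sketched in Python as follows. This is only an illustration: the function `import_widgets`, the option strings, and the representation of a report as a list of widget names are all hypothetical, and it assumes that appended widgets are placed after the existing ones, which the source does not state.

```python
def import_widgets(current, imported, option):
    """Combine two widget lists according to the chosen import option.

    Assumption: "append" adds the imported widgets after the current ones.
    """
    if option == "replace":
        # Replace current report: only the imported widgets remain.
        return list(imported)
    if option == "append":
        # Append to current report: existing widgets are kept.
        return list(current) + list(imported)
    raise ValueError(f"unknown import option: {option!r}")
```

For example, appending `["table"]` to a report that already holds `["image"]` keeps both widgets, while replacing leaves only `["table"]`.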
See Metadata Models
The Common Workflow Language main script.
The Nextflow configuration settings.
The Nextflow project main script.
Multiple files can be added to make pipelines more modular and manageable.
Syntax highlighting is determined by the file type, but you can select alternative syntax highlighting with the drop-down selection list. The following formats are supported:
DIFF (.diff)
GROOVY (.groovy .nf)
JAVASCRIPT (.js .javascript)
JSON (.json)
SH (.sh)
SQL (.sql)
TXT (.txt)
XML (.xml)
YAML (.yaml .cwl)
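The extension-to-format list above can be read as a simple lookup table. The Python sketch below mirrors it; the helper name `highlight_mode` and the fallback to TXT for unlisted extensions are assumptions made for illustration, not documented editor behavior.

```python
from pathlib import PurePath

# Extensions and formats as listed above.
HIGHLIGHT_BY_EXTENSION = {
    ".diff": "DIFF",
    ".groovy": "GROOVY",
    ".nf": "GROOVY",
    ".js": "JAVASCRIPT",
    ".javascript": "JAVASCRIPT",
    ".json": "JSON",
    ".sh": "SH",
    ".sql": "SQL",
    ".txt": "TXT",
    ".xml": "XML",
    ".yaml": "YAML",
    ".cwl": "YAML",
}


def highlight_mode(filename, default="TXT"):
    """Look up the highlighting format for a file name by its extension.

    Assumption: unlisted extensions fall back to `default`.
    """
    suffix = PurePath(filename).suffix.lower()
    return HIGHLIGHT_BY_EXTENSION.get(suffix, default)
```

Under this mapping, a Nextflow script such as `main.nf` is highlighted as GROOVY and a CWL file as YAML.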
For each process defined by the workflow, ICA will launch a compute node to execute the process.
For each compute type, the standard (default - AWS on-demand) or economy (AWS spot instance) tiers can be selected.
When selecting an fpga instance type for running analyses on ICA, it is recommended to use the medium size. While the large size offers slight performance benefits, these do not proportionately justify the associated cost increase for most use cases.
When no type is specified, the default type of compute node is standard-small.
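The defaulting rules for compute type and tier can be summarized in a short Python sketch. The function `resolve_compute` and the dictionary layout are hypothetical; only the default values (standard-small, and the standard tier) and the two tier names come from the text above.

```python
DEFAULT_COMPUTE_TYPE = "standard-small"
DEFAULT_TIER = "standard"

# Tier names and what they map to, as described above.
TIERS = {
    "standard": "AWS on-demand",
    "economy": "AWS spot instance",
}


def resolve_compute(compute_type=None, tier=None):
    """Fill in the default compute type and tier when none is given."""
    chosen_tier = tier or DEFAULT_TIER
    if chosen_tier not in TIERS:
        raise ValueError(f"unknown tier: {chosen_tier!r}")
    return {
        "type": compute_type or DEFAULT_COMPUTE_TYPE,
        "tier": chosen_tier,
        "backing": TIERS[chosen_tier],
    }
```

Calling `resolve_compute()` with no arguments yields the standard-small type on the standard (on-demand) tier, while `resolve_compute("fpga-medium", "economy")` keeps the requested type and selects a spot instance.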
By default, compute nodes have no scratch space. Adding scratch space is an advanced setting and should only be used when absolutely necessary, because it incurs additional costs and may offer only limited performance benefits since it is not local to the compute node.
For simplicity and better integration, consider using the shared storage available at /ces. This is the storage provided with the Small/Medium/Large+ compute types, and it is used when writing files with relative paths.
Daemon sets and system processes consume approximately 1 CPU and 2 GB of memory from the base values shown in the table. Actual consumption varies with the activity of the pod.
* The compute type fpga-small is no longer available. Use fpga-medium instead. The fpga-large type offers little performance benefit at additional cost.
** The transfer size is determined by the storage size selected for the compute type and is used during upload and download system tasks.
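To estimate what remains for your own processes after the daemon set and system overhead, a small Python helper can subtract the approximate 1 CPU and 2 GB from a node's base values. The function `usable_resources` is hypothetical, and the 4 CPU / 16 GB node in the usage note is an illustrative figure, not a value taken from the compute type table.

```python
def usable_resources(base_cpus, base_mem_gb, overhead_cpus=1.0, overhead_mem_gb=2.0):
    """Estimate CPUs and memory (GB) left after system overhead.

    The default overheads are the approximate figures quoted above;
    real consumption varies with pod activity.
    """
    cpus = max(base_cpus - overhead_cpus, 0)
    mem_gb = max(base_mem_gb - overhead_mem_gb, 0)
    return cpus, mem_gb
```

For a hypothetical node with 4 CPUs and 16 GB of memory, `usable_resources(4, 16)` gives roughly 3 CPUs and 14 GB, which is a more realistic ceiling when setting per-process resource requests.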
Use the following instructions to start a new analysis for a single pipeline.
Select a project.
From the project menu, select Flow > Pipelines.
Select the pipeline you want to run, or open its pipeline details.
Select Start Analysis.
Configure analysis settings. See Analysis Properties.
Select Start Analysis.
View the analysis status on the Analyses page.
Requested—The analysis is scheduled to begin.
In Progress—The analysis is in progress.
Succeeded—The analysis is complete.
Failed and Failed Final—The analysis has failed or was aborted.
To end an analysis, select Abort.
To perform a completed analysis again, select Re-run.
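When tracking analyses in a script, the statuses above can be grouped into active and finished states. The Python sketch below is an assumption-laden illustration: the enum `AnalysisStatus` and the helper `is_active` are hypothetical names, and treating both Failed and Failed Final as finished is an interpretation of the list above, not documented behavior.

```python
from enum import Enum


class AnalysisStatus(Enum):
    REQUESTED = "Requested"
    IN_PROGRESS = "In Progress"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    FAILED_FINAL = "Failed Final"


# Assumption: only these two statuses describe an analysis that is still pending or running.
ACTIVE = {AnalysisStatus.REQUESTED, AnalysisStatus.IN_PROGRESS}


def is_active(status_text):
    """Return True if the status text names an analysis that has not finished yet."""
    return AnalysisStatus(status_text) in ACTIVE
```

Under this grouping, Abort would apply to an analysis for which `is_active` returns True, and Re-run to one for which it returns False.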
The following sections describe the analysis properties that can be configured in each tab.
The Analysis tab provides options for configuring basic information about the analysis.
You can view analysis results on the Analyses page or in the output_folder on the Data page.
Select a project, and then select the Flow > Analyses page.
Select an analysis.
On the Result tab, select an output file.
To preview the file, select the View tab.
Add or remove any user or technical tags, and then select Save.
To download, select Schedule for Download.
View additional analysis result information on the following tabs:
Details - View information on the pipeline configuration.
Steps - View stderr and stdout information.
Timeline Report - View the Nextflow process execution timeline.
Execution Report - View the Nextflow analysis report, which shows the run times, commands, resource usage, and tasks for Nextflow analyses.