Nextflow


Last updated 4 days ago


ICA supports running pipelines defined using Nextflow. See this tutorial for an example.

In order to run Nextflow pipelines, the following process-level attributes within the Nextflow definition must be considered.

System Information

| Info | Details |
| --- | --- |
| Nextflow version | 20.10.0 (deprecated *), 22.04.3, 24.10.2 (Experimental) |
| Executor | Kubernetes |

(*) Pipelines will still run after 20.10.0 is deprecated, but you will no longer be able to select it when creating new pipelines.

Nextflow Version

You can select the Nextflow version while building a pipeline as follows:

  • GUI: Select the Nextflow version at Projects > your_project > Flow > Pipelines > your_pipeline > Details tab.
  • API: Select the Nextflow version by setting the optional field "pipelineLanguageVersionId". When it is not set, a default Nextflow version is used for the pipeline.

Compute Node

For each compute type, you can choose between two tiers: scheduler.illumina.com/lifecycle: standard (the default, AWS on-demand instances) and scheduler.illumina.com/lifecycle: economy (AWS spot instances).
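
As a minimal sketch, and assuming the lifecycle tier is passed as a pod annotation in the same way as the compute type annotation described in the Compute Type section, a process could request the economy tier as follows. The annotation key and value come from the text above; the process name and script are hypothetical.

```groovy
process economy_task {
    // Assumption: the tier is set through a pod annotation,
    // mirroring how scheduler.illumina.com/presetSize is set.
    pod annotation: 'scheduler.illumina.com/lifecycle', value: 'economy'

    script:
    """
    echo "Requested the economy (AWS spot instance) tier"
    """
}
```

Spot instances can be reclaimed by AWS, so the economy tier is probably best suited to processes that tolerate being retried.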

Compute Type

To specify a compute type for a Nextflow process, use the pod directive within each process. Set the annotation to scheduler.illumina.com/presetSize and the value to the desired compute type. A list of available compute types can be found here. When this directive is not specified, the default compute type is standard-small (2 CPUs and 8 GB of memory).

pod annotation: 'scheduler.illumina.com/presetSize', value: 'fpga-medium'

Often, the compute size for a process needs to be selected dynamically based on user input and other factors. The Kubernetes executor used on ICA does not use the cpu and memory directives, so instead you can set the pod directive dynamically, as described above. For example:

process foo {
    // Assuming that params.compute_size is set to a valid size such as 'standard-small', 'standard-medium', etc.
    pod annotation: 'scheduler.illumina.com/presetSize', value: "${params.compute_size}"
}

The compute type can also be specified in the configuration file. Example configuration file:

// Process-level settings and selectors belong inside the process scope
process {
    // Set the default pod
    pod = [
        annotation: 'scheduler.illumina.com/presetSize',
        value     : 'standard-small'
    ]

    // Use a high-memory instance for the process named 'big_memory_process'
    withName: 'big_memory_process' {
        pod = [
            annotation: 'scheduler.illumina.com/presetSize',
            value     : 'himem-large'
        ]
    }

    // Use an FPGA instance for dragen processes
    withLabel: 'dragen' {
        pod = [
            annotation: 'scheduler.illumina.com/presetSize',
            value     : 'fpga-medium'
        ]
    }
}

Inputs

Inputs are specified via the XML input form or the JSON-based input form. The code specified in the XML corresponds to the field in the params object that is available in the workflow. Refer to the tutorial for an example.

Outputs

Outputs for Nextflow pipelines are uploaded from the out folder in the attached shared filesystem. The publishDir directive can be used to symlink (recommended), copy, or move data to the correct folder. Data is uploaded to the ICA project after the pipeline execution completes.

publishDir 'out', mode: 'symlink'

Nextflow version 20.10.0 (Deprecated)

For Nextflow version 20.10.0 on ICA, using the "copy" method in the publishDir directive to upload output files that consume large amounts of storage may cause workflow runs to complete with missing files. The underlying issue is that file uploads may fail silently (without any error message) during the publishDir process due to insufficient disk space, resulting in incomplete output delivery.

Solutions:

  1. Use "symlink" instead of "copy" in the publishDir directive. Symlinking creates a link to the original file rather than copying it, so it does not consume additional disk space. This can prevent silent file upload failures caused by disk space limitations.
  2. Use Nextflow 22.04.0 or later and enable the "failOnError" publishDir option. This option ensures that the workflow fails with an error message if there is an issue with publishing files, rather than completing silently without all expected outputs.

Nextflow Configuration

During execution, the Nextflow pipeline runner determines the environment settings based on values passed via the command line or via a configuration file (see the Nextflow Configuration documentation). When creating a Nextflow pipeline, use the nextflow.config tab in the UI (or API) to specify a Nextflow configuration file to be used when launching the pipeline.

Syntax highlighting is determined by the file type, but you can select alternative syntax highlighting with the drop-down selection list.

If no Docker image is specified, Ubuntu is used by default.

The following configuration settings are ignored if provided, because they are overridden by the system:

executor.name
executor.queueSize
k8s.namespace
k8s.serviceAccount
k8s.launchDir
k8s.projectDir
k8s.workDir
k8s.storageClaimName
k8s.storageMountPath
trace.enabled
trace.file
trace.fields
timeline.enabled
timeline.file
report.enabled
report.file
dag.enabled
dag.file

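To illustrate what might go in the nextflow.config tab, here is a minimal sketch of a configuration file. It sets a container image so that the Ubuntu default is not used, and a default compute type. The image name is hypothetical; the annotation key and compute type value are taken from the Compute Type section.

```groovy
// nextflow.config (sketch, not a definitive configuration)
process {
    // Hypothetical image name; replace it with an image available to your project.
    // Without a container setting, Ubuntu is used by default on ICA.
    container = 'example-registry/my-tool:1.0'

    // Default compute type for all processes
    pod = [
        annotation: 'scheduler.illumina.com/presetSize',
        value     : 'standard-small'
    ]
}

// Settings such as executor.name, k8s.namespace, trace.enabled, and report.file
// are overridden by the system, so they are left out here.
```

Keeping the file limited to settings the platform honors avoids confusion about why a value such as trace.file has no effect.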