Pipeline Development in Bench (Experimental)

Introduction

The Pipeline Development Kit in Bench makes it easy to create Nextflow pipelines for ICA Flow. This kit consists of a number of development tools which are installed in /data/.software (regardless of which Bench image is selected) and provides the following features:

  • Import to Bench

    • From public nf-core pipelines

    • From existing ICA Flow Nextflow pipelines

  • Run in Bench

  • Modify and re-run in Bench, providing fast development iterations

  • Deploy to Flow

  • Launch validation in Flow

Prerequisites

  • Recommended workspace size: nf-core Nextflow pipelines typically require 4 or more cores to run.

  • The pipeline development tools require:

    • Conda, which is automatically installed by pipeline-dev if conda-miniconda.installer.ica-userspace.sh is present in the image.

    • Nextflow (version 24.10.2 is automatically installed using Conda, or you can use another version)

    • git (automatically installed using Conda)

    • jq and curl (which should be made available in the image)

Nextflow Requirements / Best Practices

Pipeline development tools work best when the following items are defined:

  • Nextflow profiles:

    • test profile, specifying inputs appropriate for a validation run

    • docker profile, instructing Nextflow to use Docker

  • nextflow_schema.json: used to generate the launch UI. The nf-core CLI tool (installable via pip install nf-core) offers extensive help to create and maintain this schema.

ICA Flow adds one additional constraint: the output directory out is the only one automatically copied to the Project data when an ICA Flow Analysis completes. The --outdir parameter recommended by nf-core should therefore be set to --outdir=out when running as a Flow pipeline.
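As an illustration, a minimal nextflow.config sketch covering these conventions could look like the following; the parameter names and test input value are placeholders, not requirements of the pipeline-dev tooling:

// nextflow.config (illustrative sketch)
params {
    outdir = 'out'              // only the 'out' directory is copied back by ICA Flow
}

profiles {
    docker {
        docker.enabled = true   // run each process in its declared container
    }
    test {
        params.input  = 'small_test_samplesheet.csv'   // placeholder validation input
        params.outdir = 'out'
    }
}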

Pipeline Development Tools

These tools are installed in /data/.software (which should be in your $PATH); the pipeline-dev script is the front end to the other pipeline-dev-* tools.

The pipeline-dev tool fulfils a number of roles:

  • Checks that the environment contains the required tools (conda, nextflow, etc) and offers to install them if needed.

  • Checks that the fast data mounts are present (/data/mounts/project etc.) – it is useful to check this regularly, as the mounts are unmounted when a workspace is stopped and restarted.

  • Redirects stdout and stderr to .pipeline-dev.log, keeping a history of log files as .pipeline-dev.log.<log date> (see the example after this list).

  • Launches the appropriate sub-tool.

  • Prints out errors with a backtrace, to help report issues.
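For example, the current log and its kept history can be listed with a command like:

$ ls -l .pipeline-dev.log*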


Usage

1) Starting a new Project

A pipeline-dev project relies on the following folder structure, which is auto-generated when using the pipeline-dev import-* tools.

If you start a project manually, you must create the same folder structure yourself; the nf-core CLI tools can assist in generating nextflow_schema.json. A sketch of the resulting tree is shown after the list.

  • Project base folder

    • nextflow-src: Platform-agnostic Nextflow code, for example the GitHub contents of an nf-core pipeline, or your usual Nextflow source code.

      • main.nf

      • nextflow.config

      • nextflow_schema.json

    • pipeline-dev.project-info: contains project name, description, etc.

    • nextflow-bench.config (automatically generated when needed): contains definitions for running in Bench.

    • ica-flow-config: directory of files used when deploying the pipeline to Flow.

      • inputForm.json (if not present, gets generated from nextflow-src/nextflow_schema.json): input form as defined in ICA Flow.

      • onSubmit.js, onRender.js (optional, generated at the same time as inputForm.json): JavaScript code to go with the input form.

      • launchPayload_inputFormValues.json (if not present, gets generated from the test profile): used by “pipeline-dev launch-validation-in-flow”.
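
Putting these items together, the resulting layout looks roughly as follows (the project folder name my-pipeline is just a placeholder):

my-pipeline/
├── nextflow-src/
│   ├── main.nf
│   ├── nextflow.config
│   └── nextflow_schema.json
├── pipeline-dev.project-info
├── nextflow-bench.config
└── ica-flow-config/
    ├── inputForm.json
    ├── onSubmit.js
    ├── onRender.js
    └── launchPayload_inputFormValues.json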

Pipeline Sources

$ pipeline-dev import-from-nextflow <repo name e.g. nf-core/demo>

A directory with the same name as the Nextflow/nf-core pipeline is created, and the Nextflow files are pulled into its nextflow-src subdirectory.

$ pipeline-dev import-from-flow [--analysis-id=…] 

A directory called imported-flow-analysis is created, and the analysis and pipeline assets are downloaded into it.

Currently only pipelines with publicly available Docker images are supported. Pipelines with ICA-stored images are not yet supported.


2) Running in Bench

$ pipeline-dev run-in-bench [--local|--sge] 

Optional parameters --local / --sge can be added to force the execution on the local workspace node, or on the workspace cluster (when available). Otherwise, the presence of a cluster is automatically detected and used.

The script then builds the full nextflow command line, prints it, and launches it.

In case of errors, full logs are saved in .pipeline-dev.log.

Currently, not all corner cases are covered by command-line options. Please start from the nextflow command printed by the tool and extend it based on your specific needs, as in the sketch below.
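For instance, a manually extended run could look like the following; the -resume flag and the combined profiles are illustrative additions, not output of the tool:

$ nextflow run nextflow-src -c nextflow-bench.config -profile docker,test --outdir out -resume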

Output Example

Container (Docker) images

Nextflow can run processes with and without Docker images. In the context of pipeline development, the pipeline-dev tools assume Docker images are used, in particular when executing with nextflow -profile docker.

In Nextflow, Docker images can be specified at the process level:

  • This is done with the container "<image_name:version>" directive (see the sketch after this list), which can be specified

    • in Nextflow config files (the preferred method when following nf-core best practices)

    • or at the start of each process definition.

  • Each process can use a different Docker image.

  • It is highly recommended to always specify an image. If no Docker image is specified, Nextflow will report this, and in ICA a basic image will be used, with no guarantee that the necessary tools are available.

Resources such as the number of CPUs and memory can also be specified at the process level. See Containers in Bench or the tutorials for details about the Nextflow-Docker syntax. Bench itself can push, pull, create, and modify Docker images, as described in Containers in Bench.
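A minimal sketch of both styles; the process name and container tag below are placeholders:

// In a Nextflow config file (nf-core style)
process {
    withName: 'FASTQC' {
        container = 'quay.io/biocontainers/fastqc:0.12.1--hdfd78af_0'
    }
}

// Or at the start of a process definition in a .nf file
process FASTQC {
    container 'quay.io/biocontainers/fastqc:0.12.1--hdfd78af_0'

    input:
    path reads

    script:
    """
    fastqc $reads
    """
}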


3) Deploying to ICA Flow

$ pipeline-dev deploy-as-flow-pipeline [--create|--update] 

This command does the following:

  1. Generate the JSON file describing the ICA Flow user interface.

    • If ica-flow-config/inputForm.json doesn’t exist: generate it from nextflow-src/nextflow_schema.json.

  2. Generate the JSON file containing the validation launch inputs.

    • If ica-flow-config/launchPayload_inputFormValues.json doesn’t exist: generate it from the nextflow -profile test inputs.

    • If local files are used as validation inputs or as default input values:

      • copy them to /data/project/pipeline-dev-files/temp.

      • get their ICA file ids.

      • use these file ids in the launch specifications.

    • If remote files are used as validation inputs or as default input values of an input of type “file” (and not “string”): do the same as above.

  3. Identify the pipeline name to use for this new pipeline deployment:

    • If a deployment has already occurred in this project, or if the project was imported from an existing Flow pipeline, start from this pipeline name. Otherwise start from the project name.

    • Identify which already-deployed pipelines have the same base name, with or without suffixes that could be versioning (_v<number>, _<number>, _<date>).

    • Ask the user whether they prefer to update the current version of the pipeline, create a new version, or enter a new name of their choice – or use the --create/--update parameters for scripting without user interaction (see the example after this list).

  4. A new ICA Flow pipeline gets created (except in the case of a pipeline update).

    • The current Nextflow version in Bench is used to select the best Nextflow version to be used in Flow.

  5. The nextflow-src folder is uploaded file by file as pipeline assets.
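
For scripted, non-interactive deployments (step 3 above), the decision can be passed on the command line, for example:

$ pipeline-dev deploy-as-flow-pipeline --update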

Output Example:

The pipeline name, id and URL are printed out, and if your environment allows, Ctrl+Click/Option+Click/Right click can open the URL in a browser.

Opening the URL of the pipeline and clicking on Start Analysis shows the generated user interface.


4) Launching Validation in Flow

$ pipeline-dev launch-validation-in-flow 

The ica-flow-config/launchPayload_inputFormValues.json file generated in the previous step is submitted to ICA Flow to start an analysis with the same validation inputs as nextflow -profile test.

Output Example:

The analysis name, id and URL are printed out, and if your environment allows, Ctrl+Click/Option+Click/Right click can open the URL in a browser.


Tutorials

The following tutorials go into more detail about these use cases:

  • Creating a Pipeline from Scratch

  • nf-core Pipelines

  • Updating an Existing Flow Pipeline