Nextflow CLI Workflow


In this tutorial, we will demonstrate how to create and launch a Nextflow pipeline using the ICA command line interface (CLI).

Installation

Please refer to these instructions for installing the ICA CLI. To authenticate, please follow the steps in the Authentication page.

Tutorial project

In this tutorial, we will create a Simple RNA-Seq workflow in ICA. The workflow includes four processes: index creation, quantification, FastQC, and MultiQC. We will also upload a Docker container to the ICA Docker repository for use within the workflow.

The 'main.nf' file defines the workflow that orchestrates various RNASeq analysis processes.

main.nf

nextflow.enable.dsl = 2

process INDEX {
   input:
       path transcriptome_file

   output:
       path 'salmon_index'

   script:
       """
       salmon index -t $transcriptome_file -i salmon_index
       """
}

process QUANTIFICATION {
   publishDir 'out', mode: 'symlink'

   input:
       path salmon_index
       tuple path(read1), path(read2)
       val(quant)

   output:
       path "$quant"

   script:
       """
       salmon quant --libType=U -i $salmon_index -1 $read1 -2 $read2 -o $quant
       """
}

process FASTQC {

   input:
       tuple path(read1), path(read2)

   output:
       path "fastqc_logs"

   script:
       """
       mkdir fastqc_logs
       fastqc -o fastqc_logs -f fastq -q ${read1} ${read2}
       """
}

process MULTIQC {
   publishDir 'out', mode:'symlink'

   input:
       path '*'

   output:
       path 'multiqc_report.html'

   script:
       """
       multiqc .
       """
}

workflow {
   index_ch = INDEX(Channel.fromPath(params.transcriptome_file))
   quant_ch = QUANTIFICATION(index_ch, Channel.of([file(params.read1), file(params.read2)]),Channel.of("quant"))
   fastqc_ch = FASTQC(Channel.of([file(params.read1), file(params.read2)]))
   MULTIQC(quant_ch.mix(fastqc_ch).collect())
}

The script uses the following tools:

  1. Salmon: Software tool for quantification of transcript abundance from RNA-seq data.

  2. FastQC: QC tool for sequencing data

  3. MultiQC: Tool to aggregate and summarize QC reports

Docker image upload

docker pull nextflow/rnaseq-nf

Create a tarball of the image to upload to ICA.

docker save nextflow/rnaseq-nf > cont_rnaseq.tar
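
Before uploading, it can help to confirm that the tarball is a usable image archive. The sketch below assumes that an archive written by docker save lists a manifest.json entry at its top level. check_image_tarball is a hypothetical helper name for illustration; it is not part of Docker or the ICA CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Docker or icav2): succeeds only if the
# tarball exists and lists a top-level manifest.json entry.
check_image_tarball() {
    local tarball="$1"
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    # tar -tf prints one archive member per line; match the manifest exactly.
    tar -tf "$tarball" | grep -Eq '^(\./)?manifest\.json$'
}

# Usage: check_image_tarball cont_rnaseq.tar && echo "archive looks valid"
```

A failing check usually means the file is not the output of docker save (for example, an interrupted write), which is cheaper to catch locally than after the upload.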

The following commands upload the tarball to your project.

# Enter the project context
icav2 enter docs
# Upload the container image to the root directory (/) of the project
icav2 projectdata upload cont_rnaseq.tar /
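
The two commands above can be wrapped in a small function so that the upload is only attempted when the tarball exists and the project context was entered successfully. This is a minimal sketch: upload_image is a hypothetical name, and it uses only the icav2 subcommands shown in this tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the two tutorial commands.
# $1 = project name or ID, $2 = path to the image tarball.
upload_image() {
    local project="$1" tarball="$2"
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    # Enter the project context, then upload to the project root (/).
    icav2 enter "$project" && icav2 projectdata upload "$tarball" /
}

# Usage: upload_image docs cont_rnaseq.tar
```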

Add the image to the ICA Docker repository

The uploaded image can be added to the ICA docker repository from the ICA Graphical User Interface (GUI).

Change the format for the image tarball to DOCKER:

  1. Navigate to Projects > <your_project> > Data.

  2. Check the checkbox for the uploaded tarball.

  3. Click on the "Manage" dropdown.

  4. Click on "Change format". In the new popup window, select the "DOCKER" format and save.

To add this image to the ICA Docker repository, first click on "All Projects" to go back to the home page.

  1. From the ICA home page, click on the "Docker Repository" page under "System Settings"

  2. Click the "+ New" button to open the "New Docker Image" window.

  3. In the new window, click on "Select a file with DOCKER format".

This will open a new window that lets you select the above tarball.

  1. Select the region (US, EU, CA) your project is in.

  2. Select your project. You can start typing the name in the textbox to filter it.

  3. The bottom pane will show the "Data" section of the selected project. If you have the docker image in subfolders, browse the folders to locate the file. Once found, click on the checkbox corresponding to the image and press "Select".

You will be taken back to the "New Docker image" window. The "Data" and "Name" fields will have been populated based on the imported image. You can edit the "Name" field to rename it. For this tutorial, we will change the name to "rnaseq". Select the region, give it a version number and a description, and click on "Save".

If your images are hosted in other repositories, you can add them as external images by clicking the "+ New external image" button and completing the form.

After creating a new Docker image, you can double-click on the image to get the container URL for the Nextflow configuration file.

Nextflow configuration file

Create a configuration file called "nextflow.config" in the same folder as the main.nf file above. Use the URL copied above to add the process.container line in the config file.

process.container = '079623148045.dkr.ecr.us-east-1.amazonaws.com/cp-prod/3cddfc3d-2431-4a85-82bb-dae061f7b65d:latest'

The same container can be set inside a process block, which can also carry a pod directive that selects a compute type for all processes:

process {
    container = '079623148045.dkr.ecr.us-east-1.amazonaws.com/cp-prod/3cddfc3d-2431-4a85-82bb-dae061f7b65d:latest'
    pod = [
        annotation: 'scheduler.illumina.com/presetSize',
        value: 'standard-small'
    ]
}
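
Since the container URL differs per repository entry, the configuration file can be generated from the copied URL instead of edited by hand. The following is a sketch only: write_nextflow_config is a hypothetical helper, and the 'standard-small' value is simply the compute type used in this tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a nextflow.config with the given container URL
# and the 'standard-small' compute type used in this tutorial.
# $1 = container URL copied from the ICA Docker repository
# $2 = output file (defaults to nextflow.config in the current folder)
write_nextflow_config() {
    local url="$1" out="${2:-nextflow.config}"
    if [ -z "$url" ]; then
        echo "usage: write_nextflow_config <container_url> [output_file]" >&2
        return 2
    fi
    cat > "$out" <<EOF
process {
    container = '${url}'
    pod = [
        annotation: 'scheduler.illumina.com/presetSize',
        value: 'standard-small'
    ]
}
EOF
}

# Usage: write_nextflow_config '<container URL copied from ICA>'
```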

Parameters file

An empty form looks as follows:

<pipeline code="" version="1.0" xmlns="xsd://www.illumina.com/ica/cp/pipelinedefinition">
   <dataInputs>
   </dataInputs>
   <steps>
   </steps>
</pipeline>

The input files are specified within a single dataInputs node, with each input file specified in its own dataInput node. Settings (as opposed to files) are specified within the steps node. Settings represent any non-file input to the workflow, including but not limited to strings, booleans, and integers.

The workflow in this tutorial does not have any settings parameters, but it requires multiple file inputs. The parameters.xml file looks as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pd:pipeline xmlns:pd="xsd://www.illumina.com/ica/cp/pipelinedefinition" code="" version="1.0">
   <pd:dataInputs>
       <pd:dataInput code="read1" format="FASTQ" type="FILE" required="true" multiValue="false">
           <pd:label>FASTQ Read 1</pd:label>
           <pd:description>FASTQ Read 1</pd:description>
       </pd:dataInput>
       <pd:dataInput code="read2" format="FASTQ" type="FILE" required="true" multiValue="false">
           <pd:label>FASTQ Read 2</pd:label>
           <pd:description>FASTQ Read 2</pd:description>
       </pd:dataInput>
       <pd:dataInput code="transcriptome_file" format="FASTA" type="FILE" required="true" multiValue="false">
           <pd:label>Transcript</pd:label>
           <pd:description>Transcript FASTA</pd:description>
       </pd:dataInput>
   </pd:dataInputs>
   <pd:steps/>
</pd:pipeline>
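
In this tutorial, each dataInput code (read1, read2, transcriptome_file) matches a params.<name> reference in main.nf. A quick cross-check of the two files can catch a mismatch before the pipeline is created. The sketch below assumes that correspondence holds and that code is the first attribute of each dataInput element, as in the file above. missing_inputs is a hypothetical helper, and it does not cover settings defined under the steps node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints every params.<name> referenced in the Nextflow
# file that has no dataInput with the same code in the parameters XML.
# $1 = Nextflow file (e.g. main.nf), $2 = parameters file (e.g. parameters.xml)
missing_inputs() {
    local nf="$1" xml="$2" name
    for name in $(grep -o 'params\.[A-Za-z_][A-Za-z0-9_]*' "$nf" \
                    | sed 's/^params\.//' | sort -u); do
        # Relies on code="..." directly following the element name.
        grep -q "dataInput code=\"$name\"" "$xml" || echo "$name"
    done
}

# Usage: missing_inputs main.nf parameters.xml
```

For the files in this tutorial the function should print nothing; any name it prints is a parameter the workflow reads that the input form does not declare as a file input.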

Use the following commands to create the pipeline with the above workflow in your project.

If not already in the project context, enter it by using the following command:

icav2 enter <PROJECT NAME or ID>

Create the pipeline using icav2 projectpipelines create nextflow. Example:

icav2 projectpipelines create nextflow rnaseq-docs --main main.nf --parameter parameters.xml --config nextflow.config --storage-size small --description 'cli nextflow pipeline'

If you prefer to organize the processes in different folders/files, you can use the --other parameter to upload the different processes as additional files. Example:

icav2 projectpipelines create nextflow rnaseq-docs --main main.nf --parameter parameters.xml --config nextflow.config --other index.nf:filename=processes/index.nf --other quantification.nf:filename=processes/quantification.nf --other fastqc.nf:filename=processes/fastqc.nf --other multiqc.nf:filename=processes/multiqc.nf --storage-size small --description 'cli nextflow pipeline'
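
With many process files, the repeated --other arguments can be assembled in a bash array rather than typed out. This sketch follows the <file>:filename=processes/<file> pattern from the example above. add_other_files and other_args are hypothetical names, and the processes/ target folder is just the layout used in this tutorial.

```shell
#!/usr/bin/env bash
# Collects "--other <file>:filename=processes/<basename>" pairs in an array.
other_args=()

# Hypothetical helper: appends one --other pair per file given as argument.
add_other_files() {
    local f
    for f in "$@"; do
        other_args+=(--other "$f:filename=processes/$(basename "$f")")
    done
}

# Usage (same flags as the example above):
#   add_other_files index.nf quantification.nf fastqc.nf multiqc.nf
#   icav2 projectpipelines create nextflow rnaseq-docs --main main.nf \
#       --parameter parameters.xml --config nextflow.config \
#       "${other_args[@]}" --storage-size small \
#       --description 'cli nextflow pipeline'
```

Keeping the arguments in an array (rather than a single string) preserves each file name as one argument even if it contains spaces.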

Example command to run the pipeline from CLI:

icav2 projectpipelines start nextflow <pipeline_id> --input read1:<read1_file_id> --input read2:<read2_file_id> --input transcriptome_file:<transcriptome_file_id> --storage-size small --user-reference demo_run

You can get the pipeline ID from the "ID" column by running the following command:

icav2 projectpipelines list

You can get the file IDs from the "ID" column by running the following command:

icav2 projectdata list
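
Once the IDs have been looked up, the start command can be wrapped so that it refuses to run while any placeholder such as <pipeline_id> is still unfilled. This is a sketch under the assumptions of this tutorial (three file inputs, small storage size). start_rnaseq_run is a hypothetical name, and only the flags shown in the example command are used.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the tutorial's start command.
# Arguments: pipeline ID, read1 file ID, read2 file ID,
#            transcriptome file ID, user reference.
start_rnaseq_run() {
    local v
    if [ "$#" -ne 5 ]; then
        echo "usage: start_rnaseq_run <pipeline_id> <read1_id> <read2_id> <transcriptome_id> <user_reference>" >&2
        return 2
    fi
    # Reject empty values and unfilled <placeholder> arguments.
    for v in "$@"; do
        case "$v" in
            ''|*'<'*|*'>'*)
                echo "unfilled argument: '$v'" >&2
                return 2
                ;;
        esac
    done
    icav2 projectpipelines start nextflow "$1" \
        --input "read1:$2" \
        --input "read2:$3" \
        --input "transcriptome_file:$4" \
        --storage-size small \
        --user-reference "$5"
}

# Usage: start_rnaseq_run "$PIPELINE_ID" "$READ1_ID" "$READ2_ID" "$FASTA_ID" demo_run
```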

Notes on the steps above:

Docker image: We need a Docker container that includes Salmon, FastQC, and MultiQC. You can refer to the "Build and push to ICA your own Docker image" section in the help page to build your own Docker image with the required tools. For the sake of this tutorial, we will use the container from the original tutorial.

Image download: With Docker installed on your computer, download the image required for this project using the docker pull command shown under "Docker image upload".

Data upload: For more help on uploading data to ICA, please refer to the Data Transfer options page.

Compute type: You can add a pod directive within a process or in the config file to specify a compute type. The configuration file shown under "Nextflow configuration file" includes an example with the 'standard-small' compute type for all processes. Please refer to the Compute Types page for a list of available compute types.

Parameters file: The parameters file defines the workflow input parameters. Refer to the help page for detailed information on creating correctly formatted parameters files. You can refer to the Nextflow: Pipeline Lift page to explore options to automate this process.

Running the pipeline: Refer to Launch Pipelines on CLI for details on running the pipeline from the CLI.

Command output: Please refer to the command help (icav2 [command] --help) to determine available flags to filter the output of the list commands if necessary. You can also refer to the Command Index page for available flags for the icav2 commands.

Additional Resources:

  • ICA - Nextflow
  • ICA - Nextflow: Pipeline Lift
  • ICA - Launch Pipelines on CLI
  • ICA - Data Transfers