Nextflow: Pipeline Lift: RNASeq

Last updated 3 days ago

How to lift a simple Nextflow pipeline?

In this tutorial, we will be using the example RNASeq pipeline to demonstrate the process of lifting a simple Nextflow pipeline over to ICA.

This approach applies when your main.nf file contains all of your pipeline logic, and it illustrates what the liftover process looks like.

Creating the pipeline

Select Projects > your_project > Flow > Pipelines. From the Pipelines view, click the +Create pipeline > Nextflow > XML based button to start creating a Nextflow pipeline.

In the Details tab, add values for the required Code (unique pipeline name) and Description fields. The Nextflow Version and Storage size fields default to preassigned values.

How to modify the main.nf file

#!/usr/bin/env nextflow

+nextflow.enable.dsl=2
 
/*
 * The following pipeline parameters specify the reference genomes
 * and read pairs and can be provided as command line options
 */
-params.reads = "$baseDir/data/ggal/ggal_gut_{1,2}.fq"
-params.transcriptome = "$baseDir/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa"
params.outdir = "results"

+println("All input parameters: ${params}")
 
workflow {
-    read_pairs_ch = channel.fromFilePairs( params.reads, checkIfExists: true )
+    read_pairs_ch = channel.fromFilePairs("${params.reads}/*_{1,2}.fq")
 
-    INDEX(params.transcriptome)
+    INDEX(Channel.fromPath(params.transcriptome))
     FASTQC(read_pairs_ch)
     QUANT(INDEX.out, read_pairs_ch)
}
 
process INDEX {
-    tag "$transcriptome.simpleName"
+    container 'quay.io/nextflow/rnaseq-nf:v1.1'
+    pod annotation: 'scheduler.illumina.com/presetSize', value: 'standard-medium'
 
    input:
    path transcriptome
 
    output:
    path 'index'
 
    script:
    """
    salmon index --threads $task.cpus -t $transcriptome -i index
    """
}
 
process FASTQC {
+    container 'quay.io/nextflow/rnaseq-nf:v1.1'
+    pod annotation: 'scheduler.illumina.com/presetSize', value: 'standard-medium'

    tag "FASTQC on $sample_id"
    publishDir params.outdir
 
    input:
    tuple val(sample_id), path(reads)
 
    output:
    path "fastqc_${sample_id}_logs"
 
    script:
-    """
-    fastqc.sh "$sample_id" "$reads"
-    """
+    """
+    # we need to explicitly specify the output directory for fastqc tool
+    # we are creating one using sample_id variable
+    mkdir fastqc_${sample_id}_logs
+    fastqc -o fastqc_${sample_id}_logs -f fastq -q ${reads}
+    """
}
 
process QUANT {
+    container 'quay.io/nextflow/rnaseq-nf:v1.1'
+    pod annotation: 'scheduler.illumina.com/presetSize', value: 'standard-medium'

    tag "$pair_id"
    publishDir params.outdir
 
    input:
    path index
    tuple val(pair_id), path(reads)
 
    output:
    path pair_id
 
    script:
    """
    salmon quant --threads $task.cpus --libType=U -i $index -1 ${reads[0]} -2 ${reads[1]} -o $pair_id
    """
}
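
The marker convention in the comparison above (a leading - for lines to remove, a leading + for lines to add) can also be applied mechanically instead of by hand. The following is a minimal Python sketch, not part of ICA or Nextflow. The function name apply_marked_diff is hypothetical, and the sketch assumes that no genuine pipeline line starts with + or -, which holds for the listing above.

```python
def apply_marked_diff(marked_text: str) -> str:
    """Build the ICA version of a listing that uses leading -/+ markers.

    Lines starting with '-' are dropped, lines starting with '+' are kept
    without the marker, and all other lines are kept unchanged.
    Assumption: no real pipeline line begins with '+' or '-'.
    """
    kept = []
    for line in marked_text.splitlines():
        if line.startswith("-"):
            continue  # original-only line: remove
        if line.startswith("+"):
            line = line[1:]  # ICA-only line: keep, minus the marker
        kept.append(line.rstrip())  # also trims whitespace-only filler lines
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = (
        '-params.reads = "$baseDir/data/ggal/ggal_gut_{1,2}.fq"\n'
        'params.outdir = "results"\n'
        '+println("All input parameters: ${params}")\n'
    )
    print(apply_marked_diff(sample), end="")
```

Applied to the full comparison saved as a text file, the result is the version of main.nf to paste into the Nextflow files > main.nf tab.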

The XML configuration

The XML configuration specifies the input files and settings. For this pipeline, you need to specify the transcriptome file and the reads folder. Navigate to the XML Configuration tab and paste the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pd:pipeline xmlns:pd="xsd://www.illumina.com/ica/cp/pipelinedefinition" code="" version="1.0">
    <pd:dataInputs>
        <pd:dataInput code="reads" format="UNKNOWN" type="DIRECTORY" required="true" multiValue="false">
            <pd:label>Folder with FASTQ files</pd:label>
            <pd:description></pd:description>
        </pd:dataInput>
        <pd:dataInput code="transcriptome" format="FASTA" type="FILE" required="true" multiValue="false">
            <pd:label>FASTA</pd:label>
            <pd:description>FASTA file</pd:description>
        </pd:dataInput>
    </pd:dataInputs>
    <pd:steps/>
</pd:pipeline>
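
In this tutorial, the code values of the two dataInput elements (reads and transcriptome) match the params.reads and params.transcriptome references in main.nf. Assuming that this name match is how the inputs reach the pipeline (an inference from the listing, not something stated here), a quick consistency check can catch a typo before saving. The sketch below uses only the Python standard library; the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI copied from the XML configuration above.
PD_NS = {"pd": "xsd://www.illumina.com/ica/cp/pipelinedefinition"}

XML_CONFIG = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pd:pipeline xmlns:pd="xsd://www.illumina.com/ica/cp/pipelinedefinition" code="" version="1.0">
    <pd:dataInputs>
        <pd:dataInput code="reads" format="UNKNOWN" type="DIRECTORY" required="true" multiValue="false"/>
        <pd:dataInput code="transcriptome" format="FASTA" type="FILE" required="true" multiValue="false"/>
    </pd:dataInputs>
    <pd:steps/>
</pd:pipeline>
"""


def data_input_codes(xml_text: str) -> dict:
    """Map each dataInput code to its declared type (FILE or DIRECTORY)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    inputs = root.findall("pd:dataInputs/pd:dataInput", PD_NS)
    return {el.get("code"): el.get("type") for el in inputs}


def params_used(nextflow_text: str) -> set:
    """Collect every name referenced as params.<name> in the pipeline text."""
    return set(re.findall(r"\bparams\.([A-Za-z_]\w*)", nextflow_text))


def unused_inputs(xml_text: str, nextflow_text: str) -> list:
    """Return dataInput codes that main.nf never references."""
    return sorted(set(data_input_codes(xml_text)) - params_used(nextflow_text))


if __name__ == "__main__":
    main_nf = (
        'read_pairs_ch = channel.fromFilePairs("${params.reads}/*_{1,2}.fq")\n'
        "INDEX(Channel.fromPath(params.transcriptome))\n"
    )
    print(data_input_codes(XML_CONFIG))
    print(unused_inputs(XML_CONFIG, main_nf))
```

An empty list from unused_inputs means every declared input is referenced somewhere in main.nf.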

Click the Generate button (at the bottom of the text editor) to preview the launch form fields.

Click the Save button to save the changes.

Running the pipeline

Go to the Pipelines page from the left navigation pane. Select the pipeline you just created and click Start New Analysis.

Fill in the required fields, which are marked with a red "*", and click the Start Analysis button. You can monitor the run from the Analyses page. Once the Status changes to Succeeded, you can click the run to access the results page.
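
Before starting the analysis, it can help to confirm that the files in the folder you select for reads follow the naming pattern that the ICA version of main.nf expects, namely `*_{1,2}.fq`. Files with other endings, such as `.fastq.gz`, do not match that glob. The sketch below only approximates the grouping in plain Python for a list of file names; it does not call Nextflow, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Approximates the glob "*_{1,2}.fq": any prefix, then _1 or _2, then .fq
PAIR_NAME = re.compile(r"(?P<sample>.+)_(?P<read>[12])\.fq")


def group_read_pairs(filenames):
    """Group file names into (read 1, read 2) pairs keyed by sample prefix.

    Returns (complete, incomplete): complete maps each sample to its pair,
    incomplete lists samples for which only one of the two reads was found.
    """
    groups = defaultdict(dict)
    for name in filenames:
        match = PAIR_NAME.fullmatch(name)
        if match:  # names that do not fit the pattern are ignored
            groups[match.group("sample")][match.group("read")] = name
    complete = {
        sample: (reads["1"], reads["2"])
        for sample, reads in groups.items()
        if set(reads) == {"1", "2"}
    }
    incomplete = sorted(sample for sample in groups if sample not in complete)
    return complete, incomplete


if __name__ == "__main__":
    names = ["ggal_gut_1.fq", "ggal_gut_2.fq", "ggal_liver_1.fq", "notes.txt"]
    complete, incomplete = group_read_pairs(names)
    print("complete pairs:", complete)
    print("incomplete samples:", incomplete)
```

Any sample reported as incomplete, or any FASTQ file that is ignored, is worth renaming or fixing before launch.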

Copy and paste the RNASeq Nextflow pipeline (nextflow.io) into the Nextflow files > main.nf tab. The comparison in the section "How to modify the main.nf file" highlights the differences between the original file and the version for deployment in ICA. The main difference is the explicit specification of containers and pods within processes. Additionally, the specification of some channels is modified, and a debugging message is added. When copying and pasting, be sure to remove the text highlighted in red (marked with -) and add the text highlighted in green (marked with +).
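
Because the main change for ICA is that each process declares a container and a pod annotation, a small check over main.nf can flag a process where either was forgotten. The following is a rough Python sketch under stated assumptions: processes start with `process NAME {` at the beginning of a line, and the annotation key is the `scheduler.illumina.com/presetSize` key used in this tutorial. Whether ICA strictly requires the annotation is not stated here. The function name is hypothetical.

```python
import re

PROCESS_HEADER = re.compile(r"^process\s+(\w+)\s*\{", re.MULTILINE)
CONTAINER = re.compile(r"^\s*container\s+['\"]", re.MULTILINE)
PRESET_SIZE = re.compile(
    r"^\s*pod\s+annotation:\s*['\"]scheduler\.illumina\.com/presetSize['\"]",
    re.MULTILINE,
)


def missing_directives(nextflow_text: str) -> dict:
    """Map each process name to the directives it lacks (empty dict if none).

    A process body is taken to run from its header to the next process
    header or the end of the file, which is enough for a simple main.nf.
    """
    headers = list(PROCESS_HEADER.finditer(nextflow_text))
    report = {}
    for i, header in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(nextflow_text)
        body = nextflow_text[header.end():end]
        missing = []
        if not CONTAINER.search(body):
            missing.append("container")
        if not PRESET_SIZE.search(body):
            missing.append("pod annotation")
        if missing:
            report[header.group(1)] = missing
    return report


if __name__ == "__main__":
    with open("main.nf", encoding="utf-8") as handle:  # path is an assumption
        print(missing_directives(handle.read()))
```

An empty dictionary means every process found in the file carries both directives.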