CWL CLI Workflow

In this tutorial, we will demonstrate how to create and launch a pipeline written in the Common Workflow Language (CWL) using the ICA command-line interface (CLI).

Installation

Please refer to the Command-Line Interface Installation page for instructions on installing the ICA CLI.

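Once the CLI is installed, a quick sanity check is to print its built-in help, which lists the available commands (this is just an optional verification step):

% icav2 --help
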
Tutorial project

In this project, we will create two simple tools and build a workflow that we can run on ICA using the CLI. The first tool (tool-fqTOfa.cwl) will convert a FASTQ file to a FASTA file. The second tool (tool-countLines.cwl) will count the number of lines in an input FASTA file. The workflow (workflow.cwl) will combine the two tools to convert an input FASTQ file to a FASTA file and count the number of lines in the resulting FASTA file.

Following are the two CWL tool definitions and the workflow script we will use in the project. If you are new to CWL, please refer to the CWL user guide for a better understanding of the code. You will also need cwltool installed to create and test these tools and workflows; installation instructions are available on the CWL GitHub page.

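As an optional local check before registering anything in ICA, you can validate the three CWL files below with cwltool once you have saved them. This is a minimal sketch assuming cwltool is installed from PyPI:

% pip install cwltool
% cwltool --validate tool-fqTOfa.cwl
% cwltool --validate tool-countLines.cwl
% cwltool --validate workflow.cwl
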
tool-fqTOfa.cwl

#!/usr/bin/env cwltool

cwlVersion: v1.0
class: CommandLineTool
inputs:
  inputFastq:
    type: File
    inputBinding:
        position: 1
stdout: test.fasta
outputs:
  outputFasta:
    type: File
    streamable: true
    outputBinding:
        glob: test.fasta

arguments:
- 'NR%4 == 1 {print ">" substr($0, 2)}NR%4 == 2 {print}'
baseCommand:
- awk

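To illustrate what this tool does, the equivalent plain shell command is shown below: for every FASTQ record it prints the header (with the leading @ replaced by >) followed by the sequence line. The file test.fastq is assumed to be any local FASTQ file:

% awk 'NR%4 == 1 {print ">" substr($0, 2)} NR%4 == 2 {print}' test.fastq > test.fasta
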
tool-countLines.cwl

#!/usr/bin/env cwltool

cwlVersion: v1.0
class: CommandLineTool
baseCommand: [wc, -l]
inputs:
  inputFasta:
    type: File
    inputBinding:
        position: 1
stdout: lineCount.tsv
outputs:
  outputCount:
    type: File
    streamable: true
    outputBinding:
        glob: lineCount.tsv

workflow.cwl

cwlVersion: v1.0
class: Workflow
inputs:
  ipFQ: File

outputs:
  count_out:
    type: File
    outputSource: count/outputCount
  fqTOfaOut:
    type: File
    outputSource: convert/outputFasta
   
steps:
  convert:
    run: tool-fqTOfa.cwl
    in:
      inputFastq: ipFQ
    out: [outputFasta]

  count:
    run: tool-countLines.cwl
    in:
      inputFasta: convert/outputFasta
    out: [outputCount]

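If you have cwltool installed, you can also exercise the complete workflow locally before creating the pipeline in ICA. This is an optional sketch; test.fastq is a local copy of the small FASTQ file used later in this tutorial:

% cwltool workflow.cwl --ipFQ test.fastq
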
[!IMPORTANT] Please note that we do not specify a Docker image in either tool. In that case, the default behaviour is to use the public.ecr.aws/docker/library/bash:5 image. This image contains basic functionality (sufficient to execute the wc and awk commands).

If you want to use a different public image, you can specify it using a requirements section in the CWL file. Assuming you want to use ubuntu:latest, you need to add:

requirements:
  - class: DockerRequirement
    dockerPull: ubuntu:latest

If you want to use a Docker image from the ICA Docker repository, you need the AWS ECR link from the ICA GUI. Double-click the image name in the Docker repository and copy the URL to the clipboard, then add the URL to the dockerPull key. To add a custom or public Docker image to the ICA repository, please refer to the Docker Repository page.

requirements:
  - class: DockerRequirement
    dockerPull: 079623148045.dkr.ecr.eu-central-1.amazonaws.com/cp-prod/XXXXXXXXXX:latest

Authentication

Before you can use the ICA CLI, you need to authenticate using your Illumina API key. Please follow the instructions on the Command-Line Interface Authentication page to authenticate.
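
A minimal sketch of the interactive setup: running "icav2 config set" prompts for connection details such as the server URL, your API key, and the default output format (the exact prompts may vary by release).

% icav2 config set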

Enter/Create a Project

You can create a new project or use an existing project for the new pipeline. To create a new project, use the "icav2 projects create" command.

% icav2 projects create basic-cli-tutorial --region c39b1feb-3e94-4440-805e-45e0c76462bf

If you do not provide the "--region" flag and only one region is available, that region is used by default. When more than one region is available, you must select one at the command prompt. You can determine the available region IDs by calling the "icav2 regions list" command first.

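For example, to choose the region explicitly, you could list the regions first and pass the ID you want (the <region-id> below is a placeholder for an ID returned for your tenant):

% icav2 regions list
% icav2 projects create basic-cli-tutorial --region <region-id>
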
You can select the project to work in by entering it with the "icav2 projects enter" command. After entering a project, you won't need to specify it as an argument to subsequent commands.

% icav2 projects enter basic-cli-tutorial

You can also use the "icav2 projects list" command to determine the names and IDs of the projects you have access to.

% icav2 projects list

Create a pipeline on ICA

"projectpipelines" is the root command for performing actions on pipelines in a project. The "create" command creates a pipeline in the current project.

The parameter file specifies the inputs for the workflow along with additional parameter settings for each step. In this tutorial, the only input is a FASTQ file, declared in a <dataInput> tag in the parameter file. There are no step-specific settings, so the parameter file has an empty <steps> tag. Create a parameter file (parameters.xml) with the following content using a text editor.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pd:pipeline xmlns:pd="xsd://www.illumina.com/ica/cp/pipelinedefinition" code="" version="1.0">
    <pd:dataInputs>
        <pd:dataInput code="ipFQ" format="FASTQ" type="FILE" required="true" multiValue="false">
            <pd:label>ipFQ</pd:label>
            <pd:description></pd:description>
        </pd:dataInput>
    </pd:dataInputs>
    <pd:steps/>
</pd:pipeline>

The following command creates a pipeline called "cli-tutorial" using the workflow "workflow.cwl", the tools "tool-fqTOfa.cwl" and "tool-countLines.cwl", and the parameter file "parameters.xml", with small storage size.

% icav2 projectpipelines create cwl cli-tutorial --workflow workflow.cwl --tool tool-fqTOfa.cwl --tool tool-countLines.cwl --parameter parameters.xml --storage-size small --description "cli tutorial pipeline"

Once the pipeline is created, you can view it using the "list" command.

% icav2 projectpipelines list
ID                                   	CODE                      	DESCRIPTION                                      
6779fa3b-e2bc-42cb-8396-32acee8b6338	cli-tutorial             	cli tutorial pipeline 

Running the pipeline

For this test, we will use a small FASTQ file test.fastq containing the following reads.

@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
AAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA
+SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
IIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/
@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=36
AGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT
+SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=36
IIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
AAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA
+SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
IIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/
@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=36
AGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT
+SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=36
IIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
AAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA
+SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
IIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/
@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=36
AGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT
+SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=36
IIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I

The "icav2 projectdata upload" command lets you upload data to ICA. Please refer to the Data page for advanced data upload features.

% icav2 projectdata upload test.fastq /
oldFilename= test.fastq en newFilename= test.fastq
bucket= stratus-gds-use1  prefix= 0a488bb2-578b-404a-e09d-08d9e3343b2b/test.fastq
Using: 1 workers to upload 1 files
15:23:32: [0]  Uploading /Users/user1/Documents/icav2_validation/for_tutorial/working/test.fastq
15:23:33: [0]  Uploaded /Users/user1/Documents/icav2_validation/for_tutorial/working/test.fastq to /test.fastq in 794.511591ms
Finished uploading 1 files in 795.244677ms

The "list" command lets you view the uploaded file. Note the ID of the file you want to use with the pipeline.

% icav2 projectdata list                
PATH          NAME        TYPE  STATUS    ID                                    OWNER                                 
/test.fastq  test.fastq FILE  AVAILABLE fil.c23246bd7692499724fe08da020b1014  4b197387-e692-4a78-9304-c7f73ad75e44

The "icav2 projectpipelines start" command initiates the pipeline run. The following command runs the pipeline; note the analysis id in the output for exploring the analysis later.

Note: If for some reason your "create" command fails and needs to be rerun, you might get an error (ConstraintViolationException). If so, retry the command with a different name.

% icav2 projectpipelines start cwl cli-tutorial --type-input STRUCTURED --input ipFQ:fil.c23246bd7692499724fe08da020b1014 --user-reference tut-test
analysisStorage.description           1.2 TB
analysisStorage.id                    6e1b6c8f-f913-48b2-9bd0-7fc13eda0fd0
analysisStorage.name                  Small
analysisStorage.ownerId               8ec463f6-1acb-341b-b321-043c39d8716a
analysisStorage.tenantId              f91bb1a0-c55f-4bce-8014-b2e60c0ec7d3
analysisStorage.tenantName            ica-cp-admin
analysisStorage.timeCreated           2021-11-05T10:28:20Z
analysisStorage.timeModified          2021-11-05T10:28:20Z
id                                    461d3924-52a8-45ef-ab62-8b2a29621021
ownerId                               7fa2b641-1db4-3f81-866a-8003aa9e0818
pipeline.analysisStorage.description  1.2 TB
pipeline.analysisStorage.id           6e1b6c8f-f913-48b2-9bd0-7fc13eda0fd0
pipeline.analysisStorage.name         Small
pipeline.analysisStorage.ownerId      8ec463f6-1acb-341b-b321-043c39d8716a
pipeline.analysisStorage.tenantId     f91bb1a0-c55f-4bce-8014-b2e60c0ec7d3
pipeline.analysisStorage.tenantName   ica-cp-admin
pipeline.analysisStorage.timeCreated  2021-11-05T10:28:20Z
pipeline.analysisStorage.timeModified 2021-11-05T10:28:20Z
pipeline.code                         cli-tutorial
pipeline.description                  Test, prepared parameters file from working GUI
pipeline.id                           6779fa3b-e2bc-42cb-8396-32acee8b6338
pipeline.language                     CWL
pipeline.ownerId                      7fa2b641-1db4-3f81-866a-8003aa9e0818
pipeline.tenantId                     d0696494-6a7b-4c81-804d-87bda2d47279
pipeline.tenantName                   icav2-entprod
pipeline.timeCreated                  2022-03-10T13:13:05Z
pipeline.timeModified                 2022-03-10T13:13:05Z
reference                             tut-test-cli-tutorial-eda7ee7a-8c65-4c0f-bed4-f6c2d21119e6
status                                REQUESTED
summary                               
tenantId                              d0696494-6a7b-4c81-804d-87bda2d47279
tenantName                            icav2-entprod
timeCreated                           2022-03-10T20:42:42Z
timeModified                          2022-03-10T20:42:43Z
userReference                         tut-test

You can check the status of the run using the "icav2 projectanalyses get" command.

%   icav2 projectanalyses get 461d3924-52a8-45ef-ab62-8b2a29621021
analysisStorage.description           1.2 TB
analysisStorage.id                    6e1b6c8f-f913-48b2-9bd0-7fc13eda0fd0
analysisStorage.name                  Small
analysisStorage.ownerId               8ec463f6-1acb-341b-b321-043c39d8716a
analysisStorage.tenantId              f91bb1a0-c55f-4bce-8014-b2e60c0ec7d3
analysisStorage.tenantName            ica-cp-admin
analysisStorage.timeCreated           2021-11-05T10:28:20Z
analysisStorage.timeModified          2021-11-05T10:28:20Z
endDate                               2022-03-10T21:00:33Z
id                                    461d3924-52a8-45ef-ab62-8b2a29621021
ownerId                               7fa2b641-1db4-3f81-866a-8003aa9e0818
pipeline.analysisStorage.description  1.2 TB
pipeline.analysisStorage.id           6e1b6c8f-f913-48b2-9bd0-7fc13eda0fd0
pipeline.analysisStorage.name         Small
pipeline.analysisStorage.ownerId      8ec463f6-1acb-341b-b321-043c39d8716a
pipeline.analysisStorage.tenantId     f91bb1a0-c55f-4bce-8014-b2e60c0ec7d3
pipeline.analysisStorage.tenantName   ica-cp-admin
pipeline.analysisStorage.timeCreated  2021-11-05T10:28:20Z
pipeline.analysisStorage.timeModified 2021-11-05T10:28:20Z
pipeline.code                         cli-tutorial
pipeline.description                  Test, prepared parameters file from working GUI
pipeline.id                           6779fa3b-e2bc-42cb-8396-32acee8b6338
pipeline.language                     CWL
pipeline.ownerId                      7fa2b641-1db4-3f81-866a-8003aa9e0818
pipeline.tenantId                     d0696494-6a7b-4c81-804d-87bda2d47279
pipeline.tenantName                   icav2-entprod
pipeline.timeCreated                  2022-03-10T13:13:05Z
pipeline.timeModified                 2022-03-10T13:13:05Z
reference                             tut-test-cli-tutorial-eda7ee7a-8c65-4c0f-bed4-f6c2d21119e6
startDate                             2022-03-10T20:42:42Z
status                                SUCCEEDED
summary                               
tenantId                              d0696494-6a7b-4c81-804d-87bda2d47279
tenantName                            icav2-entprod
timeCreated                           2022-03-10T20:42:42Z
timeModified                          2022-03-10T21:00:33Z
userReference                         tut-test

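Once the status is SUCCEEDED, the analysis output folder appears in the project data and can be retrieved with the CLI. This is a sketch; replace the placeholder with the output folder path or ID that "icav2 projectdata list" reports for your run:

% icav2 projectdata list
% icav2 projectdata download <output-folder-path-or-id> ./results
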
Pipelines can also be run using the JSON input type. The following is an example of running the pipeline with JSON input. Note that JSON input works only with file-based CWL pipelines (built using code, not the graphical editor in ICA).

 % icav2 projectpipelines start cwl cli-tutorial --data-id fil.c23246bd7692499724fe08da020b1014 --input-json '{
  "ipFQ": {
    "class": "File",
    "path": "test.fastq"
  }
}' --type-input JSON --user-reference tut-test-json

Notes

runtime.ram and runtime.cpu

runtime.ram and runtime.cpu values are by default evaluated using the compute environment running the host CWL runner. CommandLineTool steps within a CWL workflow run on different compute environments than the host CWL runner, so the values of runtime.ram and runtime.cpu evaluated within a CommandLineTool will not match the runtime environment the tool actually runs in. These values can be overridden by specifying coresMin and ramMin in a ResourceRequirement for the CommandLineTool.

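For example, a minimal sketch of a ResourceRequirement that requests at least 2 cores and 4 GiB of RAM for a CommandLineTool (the values are illustrative; ramMin is expressed in mebibytes):

requirements:
  - class: ResourceRequirement
    coresMin: 2
    ramMin: 4096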