# nf-core Pipelines

## Introduction

This tutorial shows you how to:

* [Import any nf-core pipeline from its public repository](#import-nf-core-pipeline-to-bench).
* [Run the pipeline in Bench](#run-validation-test-in-bench).
  * [Monitor](#monitoring) the execution.
* [Deploy the pipeline as an ICA Flow pipeline](#deploy-as-flow-pipeline).
* [Launch the Flow validation test from Bench](#run-validation-test-in-flow).

## Preparation

* Start a Bench workspace.
  * For this tutorial, the **instance size** depends on the pipeline you import and on whether you use a Bench cluster:
    * If using a **cluster**, choose *standard-small* or *standard-medium* for the workspace master node.
    * **Otherwise**, choose at least *standard-large*, as nf-core pipelines often need more than 4 cores to run.
  * Select the **single user workspace** permissions (aka "Access limited to workspace owner"), which allow you to deploy pipelines.
  * Specify at least 100 GB of **disk space**.
* Optional: After choosing the image, **enable a cluster** that includes at least the *standard-large* instance type.
* **Start the workspace**, then (if applicable) start the cluster.

## Import nf-core Pipeline to Bench

```
mkdir demo
cd demo
pipeline-dev import-from-nextflow nf-core/demo
```

If conda and/or nextflow are not installed, *pipeline-dev* will offer to install them.

The Nextflow files are pulled into the `nextflow-src` subfolder.
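
As a quick sanity check after the import, the sketch below looks for the two Nextflow files that are later sent to Flow (`main.nf` and `nextflow.config`). The helper name `check_import` and the default path `nf-core/demo/nextflow-src` are assumptions made for illustration; adjust the path to wherever the import placed the sources.

```shell
# Hypothetical helper (not part of pipeline-dev): report whether the
# expected Nextflow entry files exist in the imported source folder.
check_import() {
  src_dir="${1:-nf-core/demo/nextflow-src}"   # assumed location
  status=0
  for f in main.nf nextflow.config; do
    if [ -f "$src_dir/$f" ]; then
      echo "found   $f"
    else
      echo "missing $f"
      status=1
    fi
  done
  return "$status"
}

# Usage: check_import                      (default path)
#        check_import path/to/nextflow-src
```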

{% hint style="info" %}
A larger example that still runs quickly is *nf-core/sarek*.
{% endhint %}

#### Result

```
/data/demo $ pipeline-dev import-from-nextflow nf-core/demo

Creating output folder nf-core/demo
Fetching project nf-core/demo

Fetching project info
project name: nf-core/demo
repository  : https://github.com/nf-core/demo
local path  : /data/.nextflow/assets/nf-core/demo
main script : main.nf
description : An nf-core demo pipeline
author      : Christopher Hakkaart

Pipeline “nf-core/demo” successfully imported into nf-core/demo.

Suggested actions:
  cd nf-core/demo
  pipeline-dev run-in-bench
  [ Iterative dev: Make code changes + re-validate with previous command ]
  pipeline-dev deploy-as-flow-pipeline
  pipeline-dev launch-validation-in-flow
```

## Run Validation Test in Bench

All nf-core pipelines conveniently define a "test" profile that specifies a set of validation inputs for the pipeline.

The following command **runs this test profile**. If a Bench cluster is active, it runs on your Bench cluster; otherwise, it runs on the main workspace instance.

```
cd nf-core/demo
pipeline-dev run-in-bench
```

{% hint style="info" %}
The *pipeline-dev* tool uses `nextflow run ...` to run the pipeline. The full Nextflow command is printed on stdout, so you can copy it and adjust it if you need additional options.
{% endhint %}
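
As an illustration of adjusting the printed command, the sketch below assembles a `nextflow run` invocation with an extra `-resume` option. The script path `nextflow-src/main.nf`, the `test,docker` profile list, and the helper name `build_nextflow_cmd` are assumptions; the command that *pipeline-dev* prints on stdout remains the reference and may contain additional options.

```shell
# Illustrative only: the authoritative command is the one pipeline-dev prints.
# The script path and option values below are assumptions.
build_nextflow_cmd() {
  main_script="${1:-nextflow-src/main.nf}"   # assumed path to the imported main script
  extra_opts="${2:-}"                        # e.g. "-resume"
  echo "nextflow run $main_script -profile test,docker --outdir outdir${extra_opts:+ $extra_opts}"
}

# Print the command for review before running it:
build_nextflow_cmd nextflow-src/main.nf "-resume"
# prints: nextflow run nextflow-src/main.nf -profile test,docker --outdir outdir -resume
```

`-resume` is a standard Nextflow option that reuses cached task results from the `work` folder, which helps when iterating on code changes.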

#### Result

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-ff716ec811eaa3fd1758cb5413bf016eaa9b003e%2Fimage%20(19).png?alt=media" alt=""><figcaption></figcaption></figure>

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-0e6a97bbeda3fcf988f21fd1d40f587ca99ba585%2Fimage%20(20).png?alt=media" alt=""><figcaption></figcaption></figure>

#### Monitoring

When a pipeline is running **locally** (i.e., not on a Bench cluster), you can monitor the task execution from another terminal with `docker ps`.

When a pipeline is running on **your Bench cluster**, a few commands help to monitor the tasks and cluster. In another terminal, you can use:

* `qstat` to see pending and running tasks
* `tail /data/logs/sge-scaler.log.<latest available workspace reboot time>` to check whether the cluster is scaling up or down (it currently takes 3 to 5 minutes to get a new node)
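
The two cluster checks above can be combined into a small sketch. It assumes that the most recently modified `sge-scaler.log.*` file belongs to the latest workspace reboot; the helper name `latest_scaler_log` is hypothetical.

```shell
# Hypothetical helper: print the most recently modified scaler log.
# Assumes the newest file corresponds to the latest workspace reboot.
latest_scaler_log() {
  log_dir="${1:-/data/logs}"
  newest=""
  for f in "$log_dir"/sge-scaler.log.*; do
    [ -e "$f" ] || continue              # the glob matched nothing
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  if [ -n "$newest" ]; then
    printf '%s\n' "$newest"
  fi
}

log_file=$(latest_scaler_log)
if [ -n "$log_file" ]; then
  tail -n 20 "$log_file"                 # use `tail -f` to keep following the log
else
  echo "no sge-scaler log found (is a Bench cluster enabled?)"
fi

# Show the job queue if the scheduler client is available
if command -v qstat >/dev/null 2>&1; then
  qstat
fi
```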

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-e32f1942a361a7adf7ad9dad19be11d06c254b7f%2Fimage%20(21).png?alt=media" alt=""><figcaption></figcaption></figure>

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-095c4c39d8cc184d69a84936a04e393f0757f20d%2Fimage%20(22).png?alt=media" alt=""><figcaption></figcaption></figure>

#### Data Locations

* The output of the pipeline is in the `outdir` folder
* Nextflow work files are under the `work` folder
* Log files are `.nextflow.log*` and `output.log`
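
To get a quick overview of these locations after a run, a sketch along the following lines may help. The helper name `summarize_run` is hypothetical, and the error count is only a rough text search, not a definitive pass/fail check.

```shell
# Hypothetical helper: summarize the artifacts of a run in the given folder.
summarize_run() {
  run_dir="${1:-.}"
  # Report the size of the output and work folders, if present
  for d in outdir work; do
    if [ -d "$run_dir/$d" ]; then
      echo "$d: $(du -sh "$run_dir/$d" | cut -f1)"
    else
      echo "$d: not found"
    fi
  done
  # Count log lines that mention "error" (case-insensitive)
  for log in "$run_dir"/.nextflow.log* "$run_dir"/output.log; do
    [ -f "$log" ] || continue
    echo "$(basename "$log"): $(grep -ci 'error' "$log") line(s) mentioning 'error'"
  done
}

# Run from the folder where the pipeline was launched
summarize_run .
```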

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-5bf032fecee7e141953f193492aada2afb8382e0%2Fimage%20(23).png?alt=media" alt="" width="375"><figcaption></figcaption></figure>

## Deploy as Flow Pipeline

```
pipeline-dev deploy-as-flow-pipeline
```

After generating a few ICA-specific files (JSON input specs for the Flow launch UI and the list of inputs for the next step's validation launch), the tool identifies which previous versions of the same pipeline have already been deployed. In ICA Flow, pipeline versioning is done by including the version number in the pipeline name, so that is what is checked here. The tool then asks whether you want to update the latest version or create a new one.

Choose "3" and enter a name of your choice to avoid conflicts with other users following this same tutorial.

```
Choice: 3
Creating ICA Flow pipeline dev-nf-core-demo_v4
Sending inputForm.json
Sending onRender.js
Sending main.nf
Sending nextflow.config
```

At the end, the URL of the pipeline is displayed. If you are using a terminal that supports it, Ctrl+click or middle-click can open this URL in your browser.
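
To illustrate the name-based versioning described above, the sketch below derives the next name from one that ends in `_v<number>`, such as `dev-nf-core-demo_v4`. This is not the tool's implementation; the helper name `next_version_name` is hypothetical, and version numbers are assumed to have no leading zeros.

```shell
# Illustration only, not the tool's implementation: derive the next
# pipeline name from a name that ends in _v<number>.
next_version_name() {
  name="$1"
  case "$name" in
    *_v*) ;;                                   # has a version suffix candidate
    *) echo "${name}_v1"; return ;;            # no suffix: start at v1
  esac
  version="${name##*_v}"                       # text after the last "_v"
  case "$version" in
    ''|*[!0-9]*) echo "${name}_v1"; return ;;  # suffix is not a plain number
  esac
  echo "${name%_v*}_v$((version + 1))"
}

next_version_name dev-nf-core-demo_v4   # prints dev-nf-core-demo_v5
```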

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-88a2dfc521d8209a34ae1d23c35ce52fe2c807db%2Fimage%20(24).png?alt=media" alt="" width="375"><figcaption></figcaption></figure>

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-e7a9ccf0285a3160fd874f96aea76cc0763035e1%2Fimage%20(25).png?alt=media" alt=""><figcaption></figcaption></figure>

## Run Validation Test in Flow

```
pipeline-dev launch-validation-in-flow
```

This launches an analysis in ICA Flow, using the same inputs as the nf-core pipeline's "test" profile.

Some of the input files are copied to your ICA project so that the launch can take place. They are stored in the folder `bench-pipeline-dev/temp-data`.

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-8fd646e5883e343ad6b106c13a57fa8a8fa5a1da%2Fimage%20(41).png?alt=media" alt=""><figcaption></figcaption></figure>

<figure><img src="https://3193631692-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MWUqIqZhOK_i4HqCUpT%2Fuploads%2Fgit-blob-87662c3b1c3e76529ffbf9fa074f2597778a5da4%2Fimage%20(18).png?alt=media" alt=""><figcaption></figcaption></figure>

## Hints

<details>

<summary>Using older versions of Nextflow</summary>

Some older nf-core pipelines still use DSL1, which only works up to Nextflow 22.

An easy solution is to create a conda environment for Nextflow 22:

```
conda create -n nextflow22

# If you have never run "conda init", do it now:
conda init
bash -l # Start a new shell to load conda's bashrc changes

conda activate nextflow22
conda install -y nextflow=22

# Check which version is active
nextflow -version

# Then use the pipeline-dev tools as in the demo
```
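
If you want to script the version check, a sketch like the following could be used. It assumes that `nextflow -version` prints a line of the form `version 22.10.6 build 5843`; the helper name `nextflow_major` is hypothetical.

```shell
# Hypothetical helper: extract the major version from `nextflow -version` output.
# Assumes the output contains a line of the form "version 22.10.6 build 5843".
nextflow_major() {
  awk '$1 == "version" { split($2, parts, "."); print parts[1]; exit }'
}

if command -v nextflow >/dev/null 2>&1; then
  major=$(nextflow -version 2>/dev/null | nextflow_major)
  if [ "$major" = "22" ]; then
    echo "Nextflow 22 is active"
  else
    echo "Active Nextflow major version: ${major:-unknown}"
  fi
fi
```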

</details>
