nf-core Pipelines
Introduction
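This tutorial shows you how to import an nf-core pipeline into Bench, run its validation test, monitor the execution, deploy it as a Flow pipeline, and run the validation test in Flow.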
Preparation
Start Bench workspace
For this tutorial, the instance size depends on the flow you import, and whether you use a Bench cluster:
If using a cluster, choose standard-small or standard-medium for the workspace master node
Otherwise, choose at least standard-large, as nf-core pipelines often need more than 4 cores to run.
Select the single-user workspace permissions (aka "Access limited to workspace owner"), which allow us to deploy pipelines
Specify at least 100GB of disk space
Optional: After choosing the image, enable a cluster with at least one standard-large instance type
Start the workspace, then (if applicable) start the cluster
Import nf-core Pipeline to Bench
If conda and/or nextflow are not installed, pipeline-dev will offer to install them.
The Nextflow files are pulled into the nextflow-src subdirectory.
A larger example that still runs quickly is nf-core/sarek
Result
Run Validation Test in Bench
All nf-core pipelines conveniently define a "test" profile that specifies a set of validation inputs for the pipeline.
The following command runs this test profile. If a Bench cluster is active, it runs on your Bench cluster, otherwise it runs on the main workspace instance.
The pipeline-dev tool uses "nextflow run ..." to run the pipeline. The full nextflow command is printed on stdout and can be copied and adjusted if you need additional options.
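As a rough illustration of what an adjusted command might look like, the sketch below assembles a "nextflow run" command line for a test profile and prints it for review. It is not the command that pipeline-dev generates: the helper name build_test_command, the "test,docker" profile combination, and the default paths are assumptions made for this example, so compare against the command the tool actually prints.

```python
import shlex


def build_test_command(pipeline_dir="nextflow-src", outdir="outdir", extra_args=()):
    """Assemble a `nextflow run` command line for an nf-core "test" profile.

    Assumptions (not taken from pipeline-dev itself): the pipeline sources are
    in `pipeline_dir`, containers are run with Docker, and results are written
    to `outdir`.
    """
    command = [
        "nextflow", "run", pipeline_dir,
        "-profile", "test,docker",  # nf-core test inputs plus Docker containers
        "--outdir", outdir,         # nf-core pipelines take the results folder via --outdir
    ]
    command.extend(extra_args)      # e.g. "-resume" to reuse cached tasks
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_test_command(extra_args=["-resume"])))
```

Keeping the command as a list of arguments avoids quoting problems if it is later passed to subprocess.run instead of being printed.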
Result
Monitoring
When a pipeline is running locally (i.e. not on a Bench cluster), you can monitor the task execution from another terminal with docker ps
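If you prefer to poll the containers from a script rather than reading the docker ps table by eye, a minimal sketch could look like the following. It assumes the Docker CLI is on the PATH; the helper names parse_docker_ps and running_containers are illustrative, not part of any Bench tooling.

```python
import subprocess

# Go-template fields supported by `docker ps --format`, separated by tabs.
PS_FORMAT = "{{.ID}}\t{{.Image}}\t{{.Status}}\t{{.Names}}"


def parse_docker_ps(text):
    """Turn tab-separated `docker ps` output into a list of dicts."""
    containers = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        container_id, image, status, name = line.split("\t", 3)
        containers.append(
            {"id": container_id, "image": image, "status": status, "name": name}
        )
    return containers


def running_containers():
    """Ask the local Docker daemon for its running containers."""
    result = subprocess.run(
        ["docker", "ps", "--format", PS_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return parse_docker_ps(result.stdout)
```

Calling running_containers() every few seconds from a second terminal gives a list with one entry per container that is currently running.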
When a pipeline is running on your Bench cluster, a few commands help to monitor the tasks and cluster. In another terminal, you can use:
qstat to see which tasks are pending or running
tail /data/logs/sge-scaler.log.<latest available workspace reboot time> to check if the cluster is scaling up or down (it currently takes 3 to 5 minutes to get a new node)
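The two cluster checks above can also be scripted. The sketch below counts jobs per state code from default qstat output (where "qw" typically means queued and waiting, and "r" means running) and picks the scaler log to tail. It assumes the default Grid Engine column layout, with the state in the fifth column, and assumes that the most recently modified sge-scaler.log.* file is the one for the latest reboot; the function names are illustrative.

```python
from collections import Counter
from pathlib import Path


def count_job_states(qstat_output):
    """Count jobs per state code in default `qstat` output."""
    states = Counter()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Job lines start with a numeric job ID; header and separator lines do not.
        if len(fields) >= 5 and fields[0].isdigit():
            states[fields[4]] += 1  # fifth column holds the state code
    return states


def latest_scaler_log(log_dir="/data/logs"):
    """Return the most recently modified sge-scaler.log.* file, or None."""
    candidates = list(Path(log_dir).glob("sge-scaler.log.*"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

Feeding the text captured from qstat into count_job_states gives a quick pending versus running summary, and the path returned by latest_scaler_log can be passed to tail.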
Data Locations
The output of the pipeline is in the outdir directory
Nextflow work files are under the work directory
Log files are .nextflow.log* and output.log
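To inspect these locations from a script, something like the following sketch may help. It assumes that outdir, work, and the log files all sit directly under the directory the pipeline was launched from, which the text above does not state explicitly; locate_run_artifacts and tail are hypothetical helper names.

```python
from pathlib import Path


def locate_run_artifacts(run_dir="."):
    """Map the tutorial's data locations to paths under `run_dir`."""
    root = Path(run_dir)
    return {
        "results": root / "outdir",  # pipeline output
        "work": root / "work",       # Nextflow work files
        # .nextflow.log plus any rotated copies, followed by output.log if present
        "logs": sorted(root.glob(".nextflow.log*")) + sorted(root.glob("output.log")),
    }


def tail(path, lines=20):
    """Return the last `lines` lines of a text file."""
    return Path(path).read_text(errors="replace").splitlines()[-lines:]
```

For example, tail(locate_run_artifacts()["logs"][0]) returns the end of the first Nextflow log found, which is usually the quickest way to see why a task failed.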
Deploy as Flow Pipeline
After generating a few ICA-specific files (JSON input specs for Flow launch UI + list of inputs for next step's validation launch), the tool identifies which previous versions of the same pipeline have already been deployed (in ICA Flow, pipeline versioning is done by including the version number in the pipeline name, so that's what is checked here). It then asks if you want to update the latest version or create a new one.
Choose "3" and enter a name of your choice to avoid conflicts with other users following this same tutorial.
At the end, the URL of the pipeline is displayed. If you are using a terminal that supports it, Ctrl+click or middle-click can open this URL in your browser.
Run Validation Test in Flow
This launches an analysis in ICA Flow, using the same inputs as the nf-core pipeline's "test" profile.
Some of the input files will have been copied to your ICA project to allow the launch to take place. They are stored in the folder bench-pipeline-dev/temp-data.
Hints