CWL DRAGEN Pipeline

In this tutorial, we will demonstrate how to create and launch a DRAGEN pipeline using CWL (Common Workflow Language).

In ICA, CWL pipelines are built using tools developed in CWL. For this tutorial, we will use the "DRAGEN Demo Tool" included with DRAGEN Demo Bundle 3.9.5.

Link Bundle to Project

1.) Start by selecting a project from the Projects inventory.

2.) In the details page, select Edit.

3.) In the edit mode of the details page, click the + button in the LINKED BUNDLES section.

4.) In the Add Bundle to Project window, select the DRAGEN Demo Tool bundle from the list. Once you have selected the bundle, the Link Bundles button becomes available. Select it to continue.

Tip: You can select multiple bundles using Ctrl + Left mouse button or Shift + Left mouse button.

5.) In the project details page, the selected bundle appears under the LINKED BUNDLES section. If you need to remove a bundle, click the - button. Click Save to save the project with the linked bundles.

Create Pipeline

1.) From the project details page, select Pipelines > CWL.

2.) You will be given options to create pipelines using a graphical interface or code. For this tutorial, we will select Graphical.

3.) Once you have selected the Graphical option, you will see a page with multiple tabs. The first tab is the Information tab, where you enter pipeline information. Details for the fields in this tab are available in the GitBook. The following three fields are required on the Information tab:

  • Code: Provide the pipeline name here.

  • Description: Provide the pipeline description here.

  • Storage size: Select the storage size from the drop-down menu.

4.) The Documentation tab provides options for configuring the HTML description for the tool. The description appears in the tool repository but is excluded from exported CWL definitions.

5.) The Definition tab is used to define the pipeline. In graphical mode, the Definition tab lets you configure the pipeline using a visualization panel (A) and a list of component menus (B). You can find details on each section in the component menu here.

6.) To build a pipeline, start by selecting Machine Profile from the component menu section on the right. All fields are required and are pre-filled with default values. Change them as needed.

  • The profile Name field will be updated based on the selected Resource. You can change it as needed.

  • Color assigns the selected color to the tool in the design view, which makes the machine profile easy to identify when more than one tool is used in the pipeline.

  • Tier lets you select the Standard or Economy tier for AWS instances. Standard uses on-demand EC2 instances and Economy uses spot EC2 instances. You can find the difference between the two AWS instance types here. You can find the price difference between the two tiers here.

  • Resource lets you choose from the available compute resources. Because we are building a DRAGEN pipeline, we need to select an FPGA resource. Choose FPGA Medium or FPGA Large based on your needs.
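
For readers who choose the code option instead of the graphical editor, the Machine Profile choices above map roughly onto a CWL resource requirement. The fragment below is a minimal sketch, not a complete CWL document. The three ICA-specific field names and the values fpga, medium, and standard are assumptions that should be checked against the ICA compute type documentation before use.

```yaml
# Fragment of a CWL tool or workflow step definition (incomplete on its own).
requirements:
  ResourceRequirement:
    # ASSUMPTION: ICA extension fields for the machine profile; verify the
    # exact names and allowed values in the ICA documentation.
    https://platform.illumina.com/rdf/ica/resources:type: fpga       # Resource: FPGA
    https://platform.illumina.com/rdf/ica/resources:size: medium     # FPGA Medium (or large)
    https://platform.illumina.com/rdf/ica/resources:tier: standard   # Tier: standard or economy
```

The profile Name and Color have no counterpart in this fragment because they only affect how the tool is displayed in the graphical editor.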

7.) Once you have selected the Machine Profile for the tool, find your tool in the Tool Repository, located in the bottom section of the component menu on the right. In this case, we are using the DRAGEN Demo Tool. Drag and drop the tool from the Tool Repository section to the visualization panel.

8.) The dropped tool shows the machine profile color, the number of inputs and outputs, and warnings that indicate missing parameters, mandatory values, and connections. Selecting the tool in the visualization panel activates the tool (DRAGEN Demo Tool) component menu. In the component menu section, you will find the details of the tool under Tool - DRAGEN Demo Tool. This section lists the inputs, outputs, additional parameters, and the machine profile required for the tool. In this case, the DRAGEN Demo Tool requires three inputs (FASTQ read 1, FASTQ read 2, and a Reference genome) and has two outputs (a VCF file and an output directory). The tool also has one mandatory parameter, Output File Prefix. Enter its value in the text box.

9.) The top right corner of the visualization panel has icons for zooming in and out, followed by three icons: ref, in, and out. Drag and drop these icons into the visualization area according to the type of input or output you need. In this case, we need three inputs (read 1, read 2, and the Reference hash table) and two outputs (the VCF file and the output directory). Start by dragging and dropping the first input (a). Connect the input to the tool by clicking the blue dot at the bottom of the input icon and dragging it to the blue dot that represents the first input on the tool (b). Select the input icon to activate the input component menu. The input section lets you enter the Name, Format, and other relevant information based on the tool requirements. For the first input, enter the following information:

  • Name: FASTQ read 1

  • Format: FASTQ

  • Comments: any optional comments

10.) Repeat this step for the other inputs. Note that the tool takes the Reference hash table as a regular input rather than as reference files, so use the input icon instead of the reference icon.

11.) Repeat the process for the two outputs by dragging them into the visualization panel and connecting them to the tool. Note that when connecting an output, you need to click the blue dot at the bottom of the tool and drag it to the output.

12.) Select the tool and enter additional parameters. In this case, the tool requires Output File Prefix. Enter demo_ in the text box.

13.) Click the Save button to save the pipeline. Once saved, you can run it from the Pipelines page, under Flow in the left menu, like any other pipeline.
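
For comparison, the pipeline assembled graphically above can be sketched as a CWL workflow: three inputs connected to one tool step, two outputs connected from it, and the Output File Prefix set on the tool itself. This is only an illustration under stated assumptions. The file name dragen-demo-tool.cwl, all port identifiers, and the File and Directory types are hypothetical and are not taken from the actual DRAGEN Demo Tool definition.

```yaml
cwlVersion: v1.2
class: Workflow
label: DRAGEN demo pipeline (illustrative sketch)

# The three "in" icons dropped onto the visualization panel.
inputs:
  fastq_read_1:
    type: File
    label: FASTQ read 1
  fastq_read_2:
    type: File
    label: FASTQ read 2
  reference_hash_table:
    type: File            # ASSUMPTION: could be a Directory in the real tool
    label: Reference hash table

steps:
  dragen_demo_tool:
    run: dragen-demo-tool.cwl   # HYPOTHETICAL path to the tool definition
    in:
      # Connections drawn from each input icon to the tool (hypothetical port ids).
      fastq_read_1: fastq_read_1
      fastq_read_2: fastq_read_2
      reference_hash_table: reference_hash_table
      # Mandatory parameter entered on the tool, not exposed as a pipeline input.
      output_file_prefix:
        default: demo_
    out: [vcf, output_directory]

# The two "out" icons connected from the tool.
outputs:
  vcf:
    type: File
    outputSource: dragen_demo_tool/vcf
  output_directory:
    type: Directory
    outputSource: dragen_demo_tool/output_directory
```

Each outputSource value names the step and one of its output ports, which mirrors dragging a connection from the blue dot on the tool to an output icon.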
