
Launching a DRAGEN Pipeline

This guide covers launching, monitoring, and debugging DRAGEN pipelines, using the DRAGEN Germline Whole Genome pipeline as an example.

Prerequisites

Before launching the pipeline, ensure you have the following in place:

  • ICA Project — You must have an existing project in ICA. If you need to create a new project, follow the instructions described in Projects.

  • DRAGEN Bundle — The DRAGEN bundle must be linked to your project. See Linking Bundles. This provides bundled references, pipelines, and demo data.

  • Input Data — Upload your sequencing data (FASTQ, ORA, BAM, or CRAM files) to your project, or use the demo data provided in "Illumina DRAGEN Germline Demo Data."

Launching via the ICA GUI

1

Start a New Analysis

  1. Navigate to Projects > your_project > Flow > Pipelines.

  2. Select DRAGEN_Germline_Whole_Genome.

  3. (Optional) Read the Pipeline Documentation page to learn more about the pipeline, including its changelog and additional resources.

  4. Click Start analysis.

  5. Enter a User Reference (a meaningful name for this analysis run) and select a Subscription from the Pricing drop-down.

2

Configure Inputs

Select your Input Type (FASTQ GZ, FASTQ ORA, BAM, or CRAM) and provide your input files:

  • FASTQs / ORAs — Select your sequencing files. Multiple samples may be provided. The pipeline automatically parses filenames to determine FASTQ pairs and sample groupings. The RGSM is taken from the filename up to _SX and the suffix after _RX. _LXXX denotes the lane number and _RX denotes the read number.

To override the auto-detected sample names, you can provide a FASTQ List CSV containing the filenames with your own specified RGSM values.

  • BAMs / CRAMs — Select your alignment files. Map/Align can be turned off if realignment is not desired.
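
The filename-based grouping described above can be illustrated with a minimal Python sketch. It assumes the common Illumina naming pattern <sample>_S<n>_L<lane>_R<read>_001.fastq.gz (or .fastq.ora); the pipeline's own parser is not shown in this guide and may accept other variants. The function names parse_fastq_name and group_pairs are hypothetical helpers for checking your filenames before upload, not part of ICA or DRAGEN.

```python
import re
from collections import defaultdict

# Assumed Illumina-style name: <sample>_S<n>_L<lane>_R<read>_001.fastq.gz
# This pattern is an illustration; the pipeline's real parser may differ.
FASTQ_NAME = re.compile(
    r"^(?P<rgsm>.+)_S\d+_L(?P<lane>\d{3})_R(?P<read>[12])_\d{3}\.fastq\.(?:gz|ora)$"
)


def parse_fastq_name(filename):
    """Return the sample name (RGSM), lane, and read number for one file."""
    match = FASTQ_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized FASTQ filename: {filename}")
    return {
        "rgsm": match["rgsm"],
        "lane": int(match["lane"]),
        "read": int(match["read"]),
    }


def group_pairs(filenames):
    """Group files as {(rgsm, lane): {read_number: filename}}."""
    groups = defaultdict(dict)
    for name in filenames:
        info = parse_fastq_name(name)
        groups[(info["rgsm"], info["lane"])][info["read"]] = name
    return dict(groups)


if __name__ == "__main__":
    files = [
        "NA12878_S1_L001_R1_001.fastq.gz",
        "NA12878_S1_L001_R2_001.fastq.gz",
    ]
    for key, reads in group_pairs(files).items():
        print(key, reads)
```

If a filename does not match the assumed pattern, the sketch raises an error; in that situation, supplying a FASTQ List CSV with explicit RGSM values is the safer route.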

Select a Reference genome from the drop-down. The default is Homo sapiens [1000 Genomes] hg38 v6 Pangenome. Expand the drop-down for the full list of bundled references, or select Custom and provide your own reference hash table.

3

Configure Analysis Options

The input form provides options that vary by pipeline. Common sections include Map/Align, Variant Calling, CNV, SV, Variant Annotation, and Advanced Settings. Some pipelines also expose sections such as UMI, HLA Typing, Methylation, Fingerprint Checking, Targeted Callers, or Beta Features. Each field includes built-in help text describing its purpose and valid values.

For a standard WGS germline run, the defaults are suitable for most use cases — see Analysis Settings. Review and adjust any options as needed for your experiment.

4

Launch

Review your settings and click Start analysis.

Launching the Pipeline via CLI

If you have the CLI installed on your system, you can also use the commands below to work with pipelines. If you do not have an active CLI, please follow these instructions first.

For any icav2 CLI command, you can append --help to see a list of available optional settings.

Accessing your Pipeline

  1. List your projects with

icav2 projects list

  2. Enter your project context (replace your_project_name with the actual listed name of your project)

icav2 projects enter your_project_name

  3. List the pipelines in your project with

icav2 projectpipelines list

  4. List the analysis inputs using the pipeline UUID (not the pipeline name).

icav2 projectpipelines input <pipeline_uuid>

Minimal Example

JSON pipelines are started with the icav2 projectpipelines start nextflowjson command.

To retrieve the ICA file IDs for your input files, use:

If you do not know the exact filename, you can search for files in your project with the command

The command below launches a germline analysis with FASTQ inputs and the default reference, relying on form defaults for all other settings:

Key CLI Parameters

Field ID     Example Value                 Notes
fastqs       <file-id>                     Provide ICA file IDs for FASTQ inputs.
reference    "hg38_alt_masked_graph_v6"    Expand the drop-down in the UI for available values.

Omitted fields with defaults (e.g., enable_map_align, enable_variant_caller, enable_cnv, enable_sv, output_format, enable_dragen_reports) are automatically applied from the form definition.

The icav2 projectpipelines input command does not necessarily return all available fields. To discover the full set of available parameters, view the input form JSON from the pipeline's UI page in ICA.

Some older or less commonly used pipelines use an XML-based input definition rather than nextflowjson. To launch these pipelines via the CLI, use the nextflow subcommand instead. Run icav2 projectpipelines start nextflow --help for usage details, as the parameter conventions differ. One key difference is that XML pipelines take a ref_tar input for the reference, where the user must provide the reference hash table as a file included in the DRAGEN bundle. See Analysis Settings for more details.

Monitoring and Viewing Outputs

Monitoring Analysis Status

  1. Navigate to Projects > your_project > Flow > Analyses.

  2. Click the refresh button to update the status.

  3. Click on a run to view details. The Details tab shows configuration, the Nextflow execution tab shows workflow progress, and the Steps tab shows logs (enable "Show technical steps" for additional log files).

The analysis status can also be monitored via the CLI:

The id corresponds to the id field returned in the projectpipelines start command.

For more details on analysis states, see Analysis Lifecycle.
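
For scripted monitoring, a polling loop is one possible approach. The sketch below is deliberately independent of the CLI: fetch_status is a hypothetical callable that you would implement yourself (for example, by wrapping the CLI status command and extracting the status field). The terminal status names listed here are assumptions; confirm them against Analysis Lifecycle before relying on them.

```python
import time

# Assumed terminal status names; verify against the Analysis Lifecycle page.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "FAILED_FINAL", "ABORTED"}


def wait_for_analysis(fetch_status, poll_seconds=60.0, max_polls=1000, sleep=time.sleep):
    """Call fetch_status() repeatedly until it returns a terminal status.

    fetch_status is a user-supplied (hypothetical) function returning the
    current status string. The sleep function is injectable for testing.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("analysis did not reach a terminal status")


if __name__ == "__main__":
    # Simulated status sequence standing in for real status lookups.
    simulated = iter(["IN_PROGRESS", "IN_PROGRESS", "SUCCEEDED"])
    final = wait_for_analysis(lambda: next(simulated), poll_seconds=0.0)
    print("final status:", final)
```

Bounding the number of polls avoids a script that waits forever if the analysis stalls in a non-terminal state.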

If the analysis failed, see the Debugging section for next steps.

Viewing Outputs

Analysis outputs can be viewed by navigating to the analysis page in the GUI.

Report Tab

Most DRAGEN pipelines show an analysis report in the Report tab, unless it is disabled. The left-hand panel contains a Summary section with the overall report.html, as well as a Samples section listing individual per-sample reports. Selecting the summary report displays an interactive DRAGEN Reports page with tabs for key metrics, such as Summary, Enrichment, Trimmer, QC, Mapping, Coverage, and Variants. Selecting a sample report shows the same breakdowns for that sample.

Output Files Tab

The Output files tab lists all files produced by the analysis. Smaller files can be downloaded directly from the browser, while larger files such as BAMs and VCFs should be downloaded via the CLI.

Output JSON

The output includes an output.json file with two top-level sections:

  • summary — Counts of completed, failed, and total samples for the run.

  • samples — A per-sample map keyed by sample name. Each entry includes the sample's processing status and analysis info, such as the reference genome and other DRAGEN options used.

This file is useful for reproducing the analysis, auditing the parameters that were applied, or programmatically checking which samples succeeded or failed.
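
As a rough illustration of the programmatic check, the Python sketch below lists samples whose status is "Failed". The top-level samples section and the "Failed" value come from this guide; the per-sample key name status and the shape of the example document are assumptions, so inspect a real output.json and adjust status_key if needed.

```python
import json


def failed_samples(output_json_text, status_key="status"):
    """Return the sorted names of samples whose status is "Failed".

    status_key is an assumed field name; check it against a real output.json.
    """
    data = json.loads(output_json_text)
    samples = data.get("samples", {})
    return sorted(
        name for name, entry in samples.items()
        if entry.get(status_key) == "Failed"
    )


if __name__ == "__main__":
    # Hypothetical example document; real files contain more fields.
    example = json.dumps({
        "summary": {},
        "samples": {
            "sampleA": {"status": "Completed"},
            "sampleB": {"status": "Failed"},
        },
    })
    print(failed_samples(example))
```

To check a downloaded file, read its contents and pass the text to failed_samples.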

Analysis File Outputs

Typical analysis file outputs might include:

  • Alignment files (BAM/CRAM) with indexes

  • VCF/GVCF files for small variants, CNVs, SVs, and STRs, where applicable and enabled

  • Targeted caller reports (if enabled)

  • QC metrics and coverage reports (if enabled)

  • report.html (if DRAGEN Reports is enabled)

If you have failed samples, you may notice that they do not appear in the report, have no output files, or have a status of "Failed" in the output.json. Refer to the Debugging section for how to debug failed samples.

Debugging

When an analysis fails, start with the "Error" field on the main analysis UI page, which usually indicates the kind of error encountered. For multi-sample analyses, the output.json provides summary counts and the status of each sample. If this information is insufficient, dig deeper into the process logs and the pipeline runner log.

Finding the Failing Process

After identifying a failed analysis in Projects > your_project > Flow > Analyses, navigate to the Steps tab of the analysis. A failing process will be marked with a non-zero exit code.

Finding the DRAGEN Command

Knowing the exact DRAGEN command that was executed is useful for debugging as well as for reproducing an analysis outside of the pipeline.

For DRAGEN processes, executed commands are logged in the stderr. DRAGEN commands start with /opt/edico/bin/dragen. The command can also be found in the stdout with the format:

Multiple samples may run in a single process depending on the samples_per_node input. A failed sample may not terminate the process, so failures can appear in the middle of the stdout log rather than at the end.
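
Because a single log can contain commands for several samples, it can help to extract them with a script after downloading the log. The Python sketch below scans log text for the /opt/edico/bin/dragen prefix mentioned above. It assumes each command is logged on a single line, possibly preceded by other text such as a timestamp; the log layout in the example is made up for illustration.

```python
DRAGEN_BINARY = "/opt/edico/bin/dragen"


def find_dragen_commands(log_text):
    """Return every DRAGEN command found in the log, in order of appearance.

    Assumes one command per line; text before the binary path is dropped.
    """
    commands = []
    for line in log_text.splitlines():
        start = line.find(DRAGEN_BINARY)
        if start != -1:
            commands.append(line[start:].strip())
    return commands


if __name__ == "__main__":
    # Made-up log excerpt; real logs and options will differ.
    example_log = "\n".join([
        "starting sample A",
        "/opt/edico/bin/dragen <options for sample A>",
        "sample A failed",
        "[info] /opt/edico/bin/dragen <options for sample B>",
    ])
    for command in find_dragen_commands(example_log):
        print(command)
```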

Finding the Pipeline Runner Log

If no failing processes are visible, click the Show technical steps checkbox to reveal additional steps, including the pipeline runner stdout. Expand the pipeline_runner.0 stdout to see Nextflow's own log messages, which will indicate which process failed and why.

Analysis Settings

For most DRAGEN pipelines, the defaults are a good place to start. For more information on any of the input parameters beyond the parameter description, refer to the DRAGEN User Guide.

For a given published pipeline version, running the same parameters on the same input data gives identical results, so analyses are reproducible. Across different DRAGEN and/or pipeline versions, however, results may differ due to algorithmic improvements or the addition or removal of features. To keep an analysis as similar as possible across versions, compare the DRAGEN options that were applied, which can be found in the output.json or the stdout of the DRAGEN process. Refer to the Debugging section for more details on where to locate the stdout.

Common Fields

Input FASTQs / ORAs — Pipelines will attempt to parse sample IDs from FASTQ filenames. To ensure that files are matched to the correct sample ID, users may optionally supply a FASTQ list to specify the structure. See the description in the input field for more details.

samples_per_node — This setting can be tweaked to optimize analysis runtime. For WGS samples, it is recommended to keep this at 1 sample per node. For exome or smaller panel samples, users can set it to 5 or higher.
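
To get a feel for how samples_per_node affects the number of processes, the Python sketch below splits a sample list into consecutive batches. The batching order is an assumption made for illustration; the pipeline's actual scheduling is not described in this guide, and batch_samples is a hypothetical helper.

```python
def batch_samples(sample_names, samples_per_node):
    """Split samples into consecutive batches of at most samples_per_node."""
    if samples_per_node < 1:
        raise ValueError("samples_per_node must be at least 1")
    return [
        sample_names[i:i + samples_per_node]
        for i in range(0, len(sample_names), samples_per_node)
    ]


if __name__ == "__main__":
    exome_samples = [f"sample{i}" for i in range(1, 13)]
    batches = batch_samples(exome_samples, samples_per_node=5)
    print(len(batches), "batches with sizes", [len(b) for b in batches])
```

With 12 samples and samples_per_node set to 5, this yields three batches of 5, 5, and 2 samples.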

Storage Size — Select a storage size equivalent to 2x the size of your input FASTQs (assuming BAM outputs).
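
The 2x guideline is simple arithmetic, sketched below in Python. The function name and the example file sizes are hypothetical; map the result to the closest available option in the Storage Size drop-down.

```python
def recommended_storage_bytes(fastq_sizes_bytes, factor=2):
    """Return factor times the total input size, following the 2x guideline."""
    return sum(fastq_sizes_bytes) * factor


if __name__ == "__main__":
    # Hypothetical sizes for one paired-end WGS sample (bytes).
    sizes = [50 * 10**9, 52 * 10**9]
    needed = recommended_storage_bytes(sizes)
    print(f"recommended storage: {needed / 10**9:.0f} GB")
```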

ref_tar — When supplying a custom reference to a JSON pipeline or any reference to an XML pipeline, ensure that the DRAGEN hash table version matches the DRAGEN version used by the pipeline. A mismatched hash table will cause the analysis to fail.
