Analysis Auto-launch

Sequencing runs may be configured during run planning to automatically launch cloud analyses when the run completes. Sequencing run management, including run planning, run monitoring, and viewing results, is performed in BaseSpace Sequence Hub. The post-sequencing secondary analysis (i.e., demultiplexing, DRAGEN) is powered by ICA pipelines. The analysis results are stored in an /ilmn-analyses/<sequencer_run_output> subfolder.

  • BaseSpace is used for sequencing run planning and monitoring

  • ICA is used to store sequencing run data and perform secondary analysis

BaseSpace must be configured to leverage ICA for run storage. See BaseSpace Settings for details.

  1. Run Planning - performed in BaseSpace to configure a sequencing run including choosing instrument type, analysis, and sample settings

  2. Start Sequencing Run - performed on the sequencing instrument to select the planned run and launch the run

  3. Data Upload and Run Monitoring

    • Data Upload - the sequencing instrument uploads the sequencing run output data to ICA

    • Run Monitoring - sequencing run monitoring is available in BaseSpace

  4. Auto-launch Analysis - the secondary analysis is automatically launched in ICA upon sequencing run completion

  5. Auto-launch Analysis Monitoring - the secondary analysis status is monitored in BaseSpace

  6. Edit and Requeue Sample Sheet (optional) - if an error occurs during secondary analysis, the sample sheet may be edited and the analysis requeued (re-launched) from BaseSpace, which triggers step 4 again

Further details about each step above are described in the sections below.

Sequencing Run Planning

Sequencing run planning in BaseSpace Sequence Hub is the first step in preparing a sequencing run with automated post-sequencing analysis. See the BaseSpace Plan Runs documentation for details on how to use run planning.

A sequencing run can be started from the instrument using either:

  1. Planned sequencing run set up in BaseSpace Run Planning (recommended)

  2. Sample Sheet imported during the on-instrument Run Setup

Refer to Sequencer Reference for details.

Sample Sheet

The orchestration of auto-launched ICA pipelines is driven by information provided in the sequencing run sample sheet. When a run is planned with BaseSpace Run Planning, the sample sheet is generated automatically. The sample sheet may also be attached to the run manually on the sequencer. The following sections are required to auto-launch secondary analysis after the sequencing run completes.

To retrieve the most up-to-date sample sheet template for your instrument, generate an example using BaseSpace Run Planning.

The FileFormatVersion in the [Header] section must be set to 2 to indicate that the sample sheet uses the v2 format.

[Header],
FileFormatVersion,2

Sample sheets generated using BaseSpace Run Planning may contain additional optional header fields such as RunName, InstrumentPlatform, and IndexOrientation.
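As a rough illustration of the [Header] requirement, the following Python sketch splits a sample sheet into its bracketed sections and checks that FileFormatVersion is set to 2. The function names parse_sections and is_v2 are hypothetical helpers written for this example, not part of any Illumina tool, and the parsing assumes the simple comma-separated layout shown in the snippets on this page.

```python
import csv
import io


def parse_sections(text):
    """Split sample sheet text into {section_name: [row, ...]}.

    Rows are lists of cell strings. Blank rows are skipped and trailing
    empty cells (for example from "[Header],") are dropped.
    """
    sections = {}
    current = None
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        while cells and cells[-1] == "":
            cells.pop()
        if not cells:
            continue
        first = cells[0]
        if first.startswith("[") and first.endswith("]"):
            # A bracketed cell starts a new section.
            current = first[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(cells)
    return sections


def is_v2(sections):
    """Return True when [Header] contains FileFormatVersion,2."""
    for row in sections.get("Header", []):
        if row[0] == "FileFormatVersion":
            return len(row) > 1 and row[1] == "2"
    return False


if __name__ == "__main__":
    example = "[Header],\nFileFormatVersion,2\n\n[Reads],\nRead1Cycles,1\n"
    print(is_v2(parse_sections(example)))
```

A check like this could be run before attaching a manually edited sample sheet to a run, so that a v1-format file is caught before sequencing starts.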

Primary Analysis Settings

The primary analysis settings consist of the information used for the on-instrument sequencing and demultiplexing of raw sequencing output data. This includes information about reads, indices, primers, etc.

These sections vary greatly depending on the instrument type and sequencing configuration. The examples below are for demonstration purposes only.

[Reads],
Read1Cycles,1
Read2Cycles,1
Index1Cycles,1
Index2Cycles,1
[Sequencing],
CustomRead1Primer,false
CustomRead2Primer,false
CustomIndex1Primer,false
CustomIndex2Primer,false
[BCLConvert_Settings],
SoftwareVersion,0
BarcodeMismatchesIndex1,
BarcodeMismatchesIndex2,
AdapterRead1,
AdapterRead2,
OverrideCycles,

The per-sample settings used to demultiplex the raw sequencing output data (i.e., BCL Convert) are also included. The example below is for demonstration purposes only.

[BCLConvert_Data],
Sample_ID,NS001
Index,T
Index2,A
Lane,1

Secondary Analysis Settings

A [Cloud_<pipeline_name>_Settings] and a [Cloud_<pipeline_name>_Data] section are provided for each secondary analysis pipeline to be auto-launched after sequencing. The value used for <pipeline_name> can be any value, but it is best practice to name the analysis type, such as "DragenGermline" or "BCLConvert". These sections provide input parameters to the pipelines. Their contents, including the columns, may vary depending on the pipeline used for secondary analysis. The snippet below uses "DragenGermline" as an example.

Refer to Secondary Analysis Reference for secondary analysis URNs.

[Cloud_DragenGermline_Settings]
SoftwareVersion,4.1.5
MapAlignOutFormat,bam
 
[Cloud_DragenGermline_Data]
Sample_ID,ReferenceGenomeDir,VariantCallingMode
<sample_id>,urn:ilmn:ica:region:<region_guid>:data:<data_guid>#<data_path>,None
<sample_id>,urn:ilmn:ica:region:<region_guid>:data:<data_guid>#<data_path>,None
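When many samples share the same reference, the data rows above can be generated instead of typed by hand. The Python sketch below formats a data URN following the pattern shown in the snippet and writes one row per sample. The helper names data_urn and germline_data_section and the values in the usage example are hypothetical. The column set is copied from the example above and may differ for other pipelines or versions.

```python
import csv
import io


def data_urn(region_guid, data_guid, data_path):
    """Format a data URN following the pattern used in the snippet above."""
    return f"urn:ilmn:ica:region:{region_guid}:data:{data_guid}#{data_path}"


def germline_data_section(sample_ids, reference_urn, variant_calling_mode="None"):
    """Return the [Cloud_DragenGermline_Data] section as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["[Cloud_DragenGermline_Data]"])
    writer.writerow(["Sample_ID", "ReferenceGenomeDir", "VariantCallingMode"])
    for sample_id in sample_ids:
        # Every sample uses the same reference and calling mode in this sketch.
        writer.writerow([sample_id, reference_urn, variant_calling_mode])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder identifiers; real GUIDs come from your ICA project.
    urn = data_urn("region-guid", "data-guid", "/reference/example")
    print(germline_data_section(["NS001", "NS002"], urn))
```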

The [Cloud_Settings] section must be present and include:

  1. Cloud_Workflow value set to ica_workflow_1

  2. BCLConvert_Pipeline set to a valid ICA Uniform Resource Name (URN) for demultiplexing

  3. [Optional] One or more Cloud_<pipeline_name>_Pipeline entries set to a valid ICA URN for secondary analysis

[Cloud_Settings]
GeneratedVersion,0.0.0
Cloud_Workflow,ica_workflow_1
BCLConvert_Pipeline,urn:ilmn:ica:pipeline:<pipeline_uuid>#<pipeline_code>
Cloud_DragenGermline_Pipeline,urn:ilmn:ica:pipeline:<pipeline_uuid>#<pipeline_code>

The `<pipeline_name>` used in the `[Cloud_<pipeline_name>_Settings]` and `[Cloud_<pipeline_name>_Data]` section names must exactly match the `<pipeline_name>` used in the `Cloud_<pipeline_name>_Pipeline` entry of the `[Cloud_Settings]` section.
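The name-matching rule above is easy to break when a sample sheet is edited by hand. The following Python sketch collects the pipeline names from the section headers and from the [Cloud_Settings] keys and reports any mismatch. The function check_pipeline_names is a hypothetical helper. It compares names case-sensitively and assumes, based on the description above, that every auto-launched pipeline needs both a Settings and a Data section.

```python
import re


def check_pipeline_names(text):
    """Return a list of problems; an empty list means the names are consistent."""
    section = None
    settings, data, pipelines = set(), set(), set()
    for raw in text.splitlines():
        line = raw.strip().rstrip(",")
        if not line:
            continue
        header = re.fullmatch(r"\[(.+)\]", line)
        if header:
            section = header.group(1)
            match = re.fullmatch(r"Cloud_(.+)_(Settings|Data)", section)
            if match:
                target = settings if match.group(2) == "Settings" else data
                target.add(match.group(1))
            continue
        if section == "Cloud_Settings":
            # Only keys of the form Cloud_<name>_Pipeline name a secondary analysis.
            key = line.split(",", 1)[0].strip()
            match = re.fullmatch(r"Cloud_(.+)_Pipeline", key)
            if match:
                pipelines.add(match.group(1))

    problems = []
    for name in sorted(pipelines - settings):
        problems.append(f"Cloud_{name}_Pipeline has no [Cloud_{name}_Settings] section")
    for name in sorted(pipelines - data):
        problems.append(f"Cloud_{name}_Pipeline has no [Cloud_{name}_Data] section")
    for name in sorted((settings | data) - pipelines):
        problems.append(f"no Cloud_{name}_Pipeline entry in [Cloud_Settings]")
    return problems


if __name__ == "__main__":
    sheet = (
        "[Cloud_DragenGermline_Settings]\nSoftwareVersion,4.1.5\n"
        "[Cloud_DragenGermline_Data]\nSample_ID,ReferenceGenomeDir\n"
        "[Cloud_Settings]\nCloud_Workflow,ica_workflow_1\n"
        "Cloud_Germline_Pipeline,urn:ilmn:ica:pipeline:example\n"
    )
    for problem in check_pipeline_names(sheet):
        print(problem)
```

In the usage example the [Cloud_Settings] key says "Germline" while the sections say "DragenGermline", so the check reports the mismatch from both directions.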

Sequencing Run Monitoring

Monitoring the status of an ongoing sequencing run is done through BaseSpace. See the View Runs BaseSpace help documentation for more information.

Automated Secondary Analysis

When the sequencing run data upload has completed, the post-sequencing secondary analysis described in the sample sheet will be automatically started.

In ICA, secondary analyses launched after a sequencing run completes are orchestrated by a parent process called a Workflow Session. The workflow session launches the secondary analysis pipelines and creates sample entities in ICA.

Upon completion of the sequencing run data upload, a workflow session record is created and visible in the ICA "Analyses" view.

The workflow session details view includes an Orchestrated Analyses section that lists the analyses launched by the workflow session as part of the automated secondary analysis. These orchestrated analyses are driven by the information in the sample sheet, including the pipeline to launch, reference data, and sample-specific settings.

Currently, only a limited set of Illumina-provided pipelines are compatible with workflow session automation.

Analysis Requeue

The secondary analysis for a sequencing run may need to be re-executed for a variety of reasons, including:

  • A sample sheet error that needs to be fixed

  • An intermittent analysis failure

The operation to manually relaunch the secondary analysis is referred to as "requeue". Requeues are performed from BaseSpace Sequence Hub. See the BaseSpace documentation for instructions.

Manual Launch Analysis

If an auto-launched analysis fails, the analysis may need to be launched manually in ICA. This requires a few extra steps to prepare the input data.

For example, if the demultiplexing analysis fails due to a sample sheet mistake or a system error, the analysis can be launched manually in ICA by following these instructions:

  1. Create a new user-managed project, or choose an existing one, to link the data to and run the analysis in

  2. Follow the Link Project Data instructions to link the sequencing run output data to the user-managed project (see Sequencer Run Data for details on how to find the correct folder to link)

  3. Link the Bundle containing the ICA pipeline (e.g., "DRAGEN Analysis 4.1.5 - Sequencer Integration Only") to the user-managed project (see Link Bundles)

  4. Fix the sample sheet: download the original sample sheet from the sequencing run output folder, modify it, and upload the corrected file to the user-managed project

  5. Launch the pipeline (e.g., BclConvert_v4_1_5) with the corrected sample sheet as input

Using API to Launch Analysis

If an auto-launched analysis fails, the analysis can also be launched manually with the ICA API. Use the API to create an analysis and set the input type to basespace (for example, /api/projects/{projectId}/analysis:nextflow with analysisInput > inputs > externalData > type:basespace). This works for both files and folders.

{
  "externalData": [
    {
      "url": "https://api.servername.illumina.com/v2/files/your_files",
      "type": "basespace",
      "mountPath": "/location_where_input_file_is_saved_on_machine_running_pipeline",
      "basespaceDetails": {
        "workgroupId": "wid:123",
        "extensions": "vcf",
        "pathPrefix": "/path/to/files"
      }
    }
  ]
}

The fields in basespaceDetails are optional:

  • workgroupId - workgroup id to filter on

  • extensions - BSSH API query parameter to filter based on file extensions

  • pathPrefix - BSSH query string parameter to filter files/folders
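The externalData fragment above can also be assembled programmatically, which helps keep the JSON valid when the optional filters are omitted. The Python sketch below only builds and prints the fragment and does not call the API. The function name basespace_input is hypothetical, and the mount path in the usage example is a placeholder. Leaving out basespaceDetails when no filter is set is an assumption of this sketch that should be confirmed against the ICA API reference.

```python
import json


def basespace_input(url, mount_path, workgroup_id=None, extensions=None, path_prefix=None):
    """Build the externalData fragment for a BaseSpace-hosted input."""
    details = {}
    if workgroup_id is not None:
        details["workgroupId"] = workgroup_id
    if extensions is not None:
        details["extensions"] = extensions
    if path_prefix is not None:
        details["pathPrefix"] = path_prefix

    entry = {"url": url, "type": "basespace", "mountPath": mount_path}
    if details:
        # Assumption: the object is only sent when at least one filter is set.
        entry["basespaceDetails"] = details
    return {"externalData": [entry]}


if __name__ == "__main__":
    fragment = basespace_input(
        "https://api.servername.illumina.com/v2/files/your_files",
        "/data/input",
        extensions="vcf",
    )
    print(json.dumps(fragment, indent=2))
```

The resulting dictionary corresponds to the externalData part of the request body described above, nested under analysisInput > inputs when the analysis is created.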
