Schedule

On the Schedule page at Projects > your_project > Base > Schedule, you can create jobs that import different types of data you have access to into an existing table.

When creating or editing a schedule, automatic import is enabled by checking the Active box; the job then runs at 10-minute intervals. In addition, for both active and inactive schedules, you can perform a manual import by selecting the schedule and clicking the Run button.

Configure a schedule

There are different types of schedules that can be set up:

  • Files

  • Metadata

  • Administrative data

Files

This type will load the content of specific files from this project into a table. When adding or editing this schedule, you can define the following parameters (an example verification query follows the list):

  • Name (required): The name of the scheduled job

  • Description: Extra information about the schedule

  • File name pattern (required): A part of or the full file name, or a tag, that the files you want to import must contain. For example, if you want to import files named sample1_reads.txt, sample2_reads.txt, …, you can fill in _reads.txt in this field to have all files containing _reads.txt imported into the table.

  • Generated by Pipelines: Only files generated by the selected pipelines are taken into account. When left empty, files from all pipelines are used.

  • Target Base Table (required): The table to which the information needs to be added. A drop-down list with all created tables is shown. This means the table needs to be created before the schedule can be created.

  • Write preference (required): Defines how the imported data is written to the target table, such as whether existing data can be overwritten

  • Data format (required): Select the data format of the files (CSV, TSV, JSON)

  • Delimiter (required): Indicates which delimiter is used in the delimiter-separated file. If the delimiter is not present in the list, it can be indicated as custom.

  • Active: The job will run automatically if checked

  • Custom delimiter: The custom delimiter used in the file. You can only enter a delimiter here if custom is selected as the Delimiter.

  • Header rows to skip: The number of consecutive header rows (at the top of the table) to skip.

  • References: Choose which references must be added to the table

  • Advanced Options

    • Encoding (required): Select the encoding of the file.

    • Null Marker: Specifies a string that represents a null value in a CSV/TSV file.

    • Quote: The value (single character) that is used to quote data sections in a CSV/TSV file. When this character is encountered at the beginning and end of a field, it will be removed. For example, entering " as quote will remove the quotes from "bunny" and only store the word bunny itself.

    • Ignore unknown values: This applies to CSV-formatted files. Use this option to handle optional fields without separators, provided that the missing fields are located at the end of the row. Otherwise, the parser cannot detect the missing separator and will shift fields to the left, resulting in errors.

      • If headers are used: Columns with matching fields are loaded, columns without matching fields are loaded with NULL, and any remaining fields are discarded.

      • If no headers are used: Fields are loaded in order of occurrence; trailing missing fields are loaded with NULL, and trailing additional fields are discarded.
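
As an illustration, suppose a Files schedule imports tab-separated files matching _reads.txt into a Target Base Table named READS (a hypothetical table name; substitute your own table). After an import run, you can verify the loaded rows from the Query page:

SELECT COUNT(*) AS Imported_Rows
FROM READS;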

Metadata

This type will create two new tables: BB_PROJECT_PIPELINE_EXECUTIONS_DETAIL and ICA_PROJECT_SAMPLE_META_DATA. The job loads metadata (added to the samples) into ICA_PROJECT_SAMPLE_META_DATA. The process gathers metadata from the samples via the data linked to the project and from the analyses in this project. Furthermore, the scheduler adds provenance data to BB_PROJECT_PIPELINE_EXECUTIONS_DETAIL. This process gathers the execution details of all analyses in the project: the pipeline name and status, the user reference, the input files (with identifiers), and the settings selected at runtime. This enables you to track the lineage of your data and to identify potential sources of errors or biases. For example, the following query counts how many times each pipeline was executed and sorts the result accordingly:

SELECT PIPELINE_NAME, COUNT(*) AS Appearances
FROM BB_PROJECT_PIPELINE_EXECUTIONS_DETAIL
GROUP BY PIPELINE_NAME
ORDER BY Appearances DESC;

To obtain a similar table for failed runs only, you can execute the following SQL query:

SELECT PIPELINE_NAME, COUNT(*) AS Appearances
FROM BB_PROJECT_PIPELINE_EXECUTIONS_DETAIL
WHERE PIPELINE_STATUS = 'Failed'
GROUP BY PIPELINE_NAME
ORDER BY Appearances DESC;
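
The sample metadata table can be explored in the same way. Its column layout depends on your metadata model, so a quick way to inspect what the schedule has loaded is to preview a few rows of ICA_PROJECT_SAMPLE_META_DATA:

SELECT *
FROM ICA_PROJECT_SAMPLE_META_DATA
LIMIT 10;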

When adding or editing this schedule, you can define the following parameters:

  • Name (required): The name of this scheduled job

  • Description: Extra information about the schedule

  • Anonymize references: When selected, the references will not be added

  • Include sensitive metadata fields: In the metadata fields configuration, fields can be set to sensitive. When checked, those fields will also be added.

  • Active: The job will run automatically if checked

  • Source (Tenant Administrators Only):

    • Project (default): All administrative data from this project will be added

    • Account: All administrative data from every project in the account will be added. When a tenant admin creates the tenant-wide table with administrative data in a project and invites other users to this project, these users will see this table as well.

Administrative data

This type will automatically create a table and load administrative data into this table. A usage overview of all executions is considered administrative data.

When adding or editing this schedule, the following parameters can be defined (see the example query after this list):

  • Name (required): The name of this scheduled job

  • Description: Extra information about the schedule

  • Anonymize references: When checked, any platform references will not be added

  • Include sensitive metadata fields: In the metadata fields configuration, fields can be set to sensitive. When checked, those fields will also be added.

  • Active: The job will run automatically if checked

  • Source (Tenant Administrators Only):

    • Project (default): All administrative data from this project will be added

    • Account: All administrative data from every project in the account will be added. When a tenant admin creates the tenant-wide table with administrative data in a project and invites other users to this project, these users will see this table as well.
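
The table for this schedule is created automatically, so check the project's Tables list for its exact name after the first run. As a minimal sketch, assuming the administrative table appears under the hypothetical name BB_PROJECT_USAGE_DETAIL, you could confirm that the job has loaded rows with:

SELECT COUNT(*) AS Usage_Rows
FROM BB_PROJECT_USAGE_DETAIL;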

Delete schedule

Schedules can be deleted. Once deleted, they will no longer run, and they will not be shown in the list of schedules.

Run schedule

When clicking the Run button (or Save & Run when editing), the schedule starts the job of importing the configured data into the correct tables. This way, the schedule can be run manually. The result of the job can be seen in the tables.
