API

The ICA APIs are hosted via a Swagger-based UI in accordance with the OpenAPI 3.0 specification.

API Reference

API Beginner Guide

CORS Support

The APIs support Cross-Origin Resource Sharing (CORS).

Cursor- versus Offset-based Pagination

Cursor-based pagination is fast but does not support sorting, while offset-based pagination is slower but supports sorting. If no pagination method is selected in the API call, the sort parameter determines which pagination method is used by default.

  • When no sorting is set, cursor-based pagination is used by default.

  • When sorting is requested, offset-based pagination is used.

Cursor-based pagination

Cursor-based pagination uses a pointer to a specific record and returns the records that follow it. It can handle data updates without skipping records or displaying them twice, and it is faster, but it can only make small jumps through the data. There is also no easy way to sort and no way to skip pages: if you want page X, all pages leading up to X must also be requested, because each page contains the pointer to the next page.

Cursor-based pagination uses the parameter pageSize (the number of rows to return) in combination with pageToken (the cursor used to get subsequent results). The parameter remainingRecords shows the number of records after the current cursor which are not shown. remainingRecords is an approximation only, as this number can increase or decrease while you are paging because records might be added or removed.

Example

  • A list of projects, with 2 projects returned per page (enter your API key in the YOUR_API_KEY part of the expression)

curl -X 'GET' \
  'https://ica.illumina.com/ica/rest/api/projects?includeHiddenProjects=false&pageSize=2' \
  -H 'accept: application/vnd.illumina.v3+json' \
  -H 'X-API-Key: YOUR_API_KEY'
    
  • The response will contain a pointer to the next page "nextPageToken": "A_STRING_OF_LETTERS_AND_NUMBERS"

  • You then use that pointer to get the next 2 entries by replacing A_STRING_OF_LETTERS_AND_NUMBERS with the returned next page token:

curl -X 'GET' \
  'https://ica.illumina.com/ica/rest/api/projects?includeHiddenProjects=false&pageToken=A_STRING_OF_LETTERS_AND_NUMBERS_HERE&pageSize=2' \
  -H 'accept: application/vnd.illumina.v3+json' \
  -H 'X-API-Key: YOUR_API_KEY'

Offset-based pagination

Offset-based pagination returns records based on their position in a table. The data set is divided into pages, and the pageOffset parameter determines which page to display. Offset pagination can quickly perform large jumps in the data because previous pages do not need to be queried, and it allows custom sorting. The downside is that offset-based pagination is susceptible to missing records or displaying records twice when the data is updated while you are paginating.

In ICA, offset-based pagination is limited to 200,000 rows and does not guarantee unique results across all pages, as data can be updated during pagination.

Offset pagination uses the parameter pageSize (the number of rows to return) in combination with pageOffset (the number of rows to skip) and totalItemCount (the total number of records matching the search criteria). The larger your page size, the slower the response time.

Example

  • GET /api/projects/{projectId}/analyses with pageOffset set to 0. This will return the totalItemCount.

  • If you want to sort the results, use the sort parameter, for example sort=reference desc (desc for descending order, URL-encoded as reference%20desc)

curl -X 'GET' \
  '<SERVER>/ica/rest/api/projects/<PROJECT_ID>/analyses?pageOffset=2&pageSize=2&sort=reference%20desc' \
  -H 'accept: application/vnd.illumina.v3+json' \
  -H 'X-API-Key: <AUTH_KEY>'

Starting a Pipeline

If you want to start a pipeline, you need an activation code and the analysis storage.

  • Activation codes are tokens which allow you to run your analyses and are used for accounting. ICA will automatically determine the best matching activation code, but this can be overridden if needed.

  • Analysis storage is used to determine the required storage size for your data and results. If this size is too small, the workflow will fail because of insufficient storage space; if it is too large, you are charged for storage which is not needed. In general, it is best to use the settings advised by the workflow designer.

For more details, use the API Reference.

  1. To obtain the analysis storage, perform GET /api/analysisStorages to retrieve the list of available storage sizes and copy the id of the size which you want to use for your pipeline.

  2. ICA determines the best matching activation code.

  3. Perform POST /api/projects/{projectId}/analysis:cwl or POST /api/projects/{projectId}/analysis:nextflow with the obtained value for analysis storage (see the curl sketch below).
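As a rough illustration, steps 1 and 3 could look like the curl sketch below, following the header conventions of the earlier examples. The JSON body is a minimal assumption made for illustration: field names such as userReference, pipelineId and analysisStorageId, as well as the Content-Type header, should be verified against the CreateCwlAnalysis schema in the API Reference before use.

# Step 1: list the available analysis storage sizes and note the id to use.
curl -X 'GET' \
  '<SERVER>/ica/rest/api/analysisStorages' \
  -H 'accept: application/vnd.illumina.v3+json' \
  -H 'X-API-Key: <AUTH_KEY>'

# Step 3: start the analysis with the chosen analysis storage id.
# The request body below is illustrative only; check the API Reference for the exact schema.
curl -X 'POST' \
  '<SERVER>/ica/rest/api/projects/<PROJECT_ID>/analysis:cwl' \
  -H 'accept: application/vnd.illumina.v3+json' \
  -H 'Content-Type: application/vnd.illumina.v3+json' \
  -H 'X-API-Key: <AUTH_KEY>' \
  -d '{
        "userReference": "my-analysis",
        "pipelineId": "<PIPELINE_ID>",
        "analysisStorageId": "<ANALYSIS_STORAGE_ID>"
      }'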

File output behavior

For advanced output mappings, the overwrite behavior for file output can be set with the API parameter actionOnExist. This is a string which supports the following options when a file or folder with that name already exists at the target location (see the fragment after this list).

  • Overwrite (default) : The existing file is overwritten.

  • Rename : The file is renamed by appending an incremental counter before the extension.

  • Skip : The new file will not be uploaded.
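As a minimal, hedged illustration, the fragment below shows where actionOnExist could sit in an advanced output mapping passed with the analysis-creation request. Only actionOnExist and its three values are documented here; the surrounding property names (analysisOutput, sourcePath, targetProjectId, targetPath) are assumptions and should be checked against the API Reference.

  "analysisOutput": [
    {
      "sourcePath": "out/",
      "targetProjectId": "<PROJECT_ID>",
      "targetPath": "/results/",
      "actionOnExist": "Overwrite"
    }
  ]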

Using data from your own S3 Storage

Prerequisite: have your S3 bucket and credentials ready (public S3 buckets without credentials are not supported).

  1. (option 1) If your S3 credentials are not yet stored in ICA, perform POST /api/storageCredentials from the API to store the storage credentials of your S3 bucket. The response will indicate the id (uuid) of the credentials.

  2. (option 2) If the credentials you want to use are already stored in ICA, perform GET /api/storageCredentials to retrieve the list of storage credentials and note the id (uuid) which has been assigned to them.

  3. From the API, perform POST /api/projects/{projectId}/analysis:cwl (or Nextflow) with the required parameters for your analysis and the following fields (a curl sketch follows after this list):

  • CreateCwlAnalysis > CwlanalysisInput > CwlAnalysisStructuredInput > AnalysisDataInput > AnalysisInputExternalData > url : s3://your_S3_address

  • CreateCwlAnalysis > CwlanalysisInput > CwlAnalysisStructuredInput > AnalysisDataInput > AnalysisInputExternalData > type : s3

  • CreateCwlAnalysis > CwlanalysisInput > CwlAnalysisStructuredInput > AnalysisDataInput > AnalysisInputExternalData > AnalysisS3DataDetails > storageCredentialsId: uuid from step 2

  • CreateCwlAnalysis > CwlanalysisInput > CwlAnalysisStructuredInput > AnalysisDataInput > AnalysisInputExternalData > mountPath : location where the input file will be located on the machine that is running the pipeline. Preferably use a relative path, though an absolute path is allowed. The path must end with the file name which may differ from the original input data.
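Putting these fields together, a hedged curl sketch of such a request could look like the example below. The nesting of url, type, storageCredentialsId and mountPath follows the schema path listed above, but the concrete property names (inputs, parameterCode, externalData, s3Details) and the remaining body fields are assumptions for illustration; confirm them in the API Reference.

# Illustrative only: start a CWL analysis with an input read from your own S3 bucket.
# <STORAGE_CREDENTIALS_UUID> is the id obtained in step 1 or 2 above.
curl -X 'POST' \
  '<SERVER>/ica/rest/api/projects/<PROJECT_ID>/analysis:cwl' \
  -H 'accept: application/vnd.illumina.v3+json' \
  -H 'Content-Type: application/vnd.illumina.v3+json' \
  -H 'X-API-Key: <AUTH_KEY>' \
  -d '{
        "userReference": "my-s3-analysis",
        "pipelineId": "<PIPELINE_ID>",
        "analysisStorageId": "<ANALYSIS_STORAGE_ID>",
        "analysisInput": {
          "inputs": [
            {
              "parameterCode": "<INPUT_PARAMETER_CODE>",
              "externalData": [
                {
                  "url": "s3://your_S3_address/path/to/input.fastq.gz",
                  "type": "s3",
                  "s3Details": { "storageCredentialsId": "<STORAGE_CREDENTIALS_UUID>" },
                  "mountPath": "data/input.fastq.gz"
                }
              ]
            }
          ]
        }
      }'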

Rate Limiting

The ICA API contains a rate limiter to prevent excessive requests from interfering with normal operations. If you get error 429 Too Many Requests, your requests have exceeded the rate limit. To prevent this, please apply the following design principles during pipeline creation.

  • Implement exponential backoff instead of a fixed retry time. Exponential backoff increases the delay between each retry attempt, which increases the chance of the retry succeeding without flooding the system with requests (see the sketch after this list).

  • Determine for each API call in your design if it is really necessary.

  • If all else fails, contact Illumina technical support to help optimize your pipeline design and API requests.
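As a minimal sketch, exponential backoff around an ICA API call could look like the shell loop below. It reuses the endpoint and header pattern from the pagination examples above; the retry count and initial delay are arbitrary values to tune for your own workload.

# Retry an ICA API call with exponential backoff on HTTP 429 (bash sketch).
max_attempts=5
delay=1
for attempt in $(seq 1 "$max_attempts"); do
  status=$(curl -s -o response.json -w '%{http_code}' \
    '<SERVER>/ica/rest/api/projects?pageSize=2' \
    -H 'accept: application/vnd.illumina.v3+json' \
    -H 'X-API-Key: <AUTH_KEY>')
  if [ "$status" != "429" ]; then
    break                     # success, or an error that is not rate limiting
  fi
  sleep "$delay"              # back off before the next attempt
  delay=$((delay * 2))        # double the delay after every 429 response
done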

Duplicate Files on SampleCreationBatch

When running POST /api/projects/{projectId}/sampleCreationBatch, the input is checked for duplicate files. The file in the lowest-level (deepest) folder is considered the original file, and duplicates in higher-level folders or the root folder are ignored as input. Suppose you have the files main\file1, main\file2 and main\folder1\file1. The input will then be main\file2, because it has no duplicate, and main\folder1\file1, because, even though it is a duplicate, this instance is in the lowest-level folder. main\file1 will not be used because it is a duplicate in a higher-level folder.
