Connect AWS S3 Bucket
You can use your own S3 bucket with Illumina Connected Analytics (ICA) for data storage. This section describes how to configure your AWS account to allow ICA to connect to an S3 bucket.
These instructions use the AWS CLI. To download and install it, follow the AWS CLI documentation.
Key points for connecting AWS S3 buckets to ICA:
The AWS S3 bucket must exist in the same AWS region as the ICA project. Refer to the table below for a mapping of ICA project regions to AWS regions:
*Note: BSSH is not currently deployed on the South Korea instance, so functionality related to sequencer integration is limited in this region.
You can enable SSE using an Amazon S3-managed key (SSE-S3). Instructions for using KMS-managed (SSE-KMS) keys are found here.
Because of how Amazon S3 handles folders, and because it does not send events for S3 folders, the following restrictions apply to ICA project data stored in S3:
When creating an empty folder in S3, it will not be visible in ICA.
When moving folders in S3, the original, but empty, folder will remain visible in ICA and must be manually deleted there.
When deleting a folder and its contents in S3, the empty folder will remain visible in ICA and must be manually deleted there.
Projects cannot be created with ./ as prefix since S3 does not allow uploading files with this key prefix.
When configuring a new project in ICA to use a preconfigured S3 bucket, you can use the root folder if needed. However, this is not recommended as that S3 bucket is then no longer available for other ICA projects. Instead, please consider using subfolders in S3 for your projects.
❗️ For Bring Your Own Storage buckets, unversioned, versioned, and versioning-suspended buckets are all supported. If you connect buckets with object versioning, the data in ICA is automatically synced with the data in the object store. For Bring Your Own Storage buckets with versioning enabled, deleting an object without specifying a particular version creates a "Delete marker" in the object store to indicate that the object has been deleted. ICA reflects this state by deleting the record from its database. No further action on your side is needed to sync.
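The key points above depend on three bucket properties: its region, its versioning state, and its default encryption. As a minimal sketch, assuming the AWS CLI is configured with credentials that can read the bucket configuration, the following helper prints all three. The function name ica_bucket_preflight is hypothetical and is not part of ICA or the AWS CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ICA): print the bucket properties
# that the key points above refer to.
ica_bucket_preflight() {
  local bucket="$1"

  # Bucket region; the CLI prints "None" for buckets in us-east-1.
  echo "region:     $(aws s3api get-bucket-location --bucket "$bucket" \
    --query LocationConstraint --output text)"

  # "Enabled", "Suspended", or "None" if versioning was never turned on.
  echo "versioning: $(aws s3api get-bucket-versioning --bucket "$bucket" \
    --query Status --output text)"

  # Default encryption algorithm, for example "AES256" for SSE-S3.
  echo "encryption: $(aws s3api get-bucket-encryption --bucket "$bucket" \
    --query 'ServerSideEncryptionConfiguration.Rules[0].ApplyServerSideEncryptionByDefault.SSEAlgorithm' \
    --output text)"
}

# Usage (replace the placeholder with your bucket name):
#   ica_bucket_preflight YOUR_BUCKET_NAME
```

Compare the reported region with the AWS region of your ICA project, and use the versioning status to decide which of the two IAM policy variants applies to your bucket.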
ICA requires cross-origin resource sharing (CORS) permissions to write to the S3 bucket for uploads via the browser. Refer to the Configuring cross-origin resource sharing (CORS) (expand the "Using the S3 console" section) documentation for instructions on enabling CORS via the AWS Management Console. Use the following configuration during the process:
In the cross-origin resource sharing (CORS) section, enter the following content.
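If you prefer the AWS CLI to the console, the CORS rules for this section can also be applied from a local file. The sketch below assumes you have saved those rules to a file (the name cors.json is hypothetical) and that the function name apply_ica_cors is only an illustration. Note that the console accepts a bare JSON array of rules, whereas the CLI expects the same array wrapped in an object under the key CORSRules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply a CORS configuration file to a bucket, then
# read the configuration back so the applied rules can be checked.
# The file must have the form {"CORSRules": [ ...rules... ]}.
apply_ica_cors() {
  local bucket="$1" cors_file="$2"
  aws s3api put-bucket-cors --bucket "$bucket" \
    --cors-configuration "file://$cors_file" &&
    aws s3api get-bucket-cors --bucket "$bucket"
}

# Usage:
#   apply_ica_cors YOUR_BUCKET_NAME cors.json
```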
ICA requires specific permissions to access data in an AWS S3 bucket. These permissions are contained in an AWS IAM Policy.
Refer to the Creating policies on the JSON tab documentation for instructions on creating an AWS IAM Policy via the AWS Management Console. Use the following configuration during the process:
On Unversioned buckets, paste the JSON policy document below. Note that the example below provides access to all object prefixes in the bucket.
Replace YOUR_BUCKET_NAME with the name of the S3 bucket you created for ICA.
On Versioned OR Suspended buckets, paste the JSON policy document below. Note that the example below provides access to all object prefixes in the bucket.
Replace YOUR_BUCKET_NAME with the name of the S3 bucket you created for ICA.
(Optional) Set policy name to "illumina-ica-admin-policy"
To create the IAM Policy via the AWS CLI, create a local file named illumina-ica-admin-policy.json containing the policy content above and run the following command. Be sure the path to the policy document (--policy-document) leads to the path where you saved the file:
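The policy creation step can be sketched as follows. This is an illustration rather than the exact command from the original page: the wrapper function create_ica_policy is hypothetical, and it assumes the policy file is in the current directory unless another path is passed as the second argument.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around "aws iam create-policy".
# $1: policy name (defaults to the name suggested in this section)
# $2: path to the local policy document
create_ica_policy() {
  local policy_name="${1:-illumina-ica-admin-policy}"
  local policy_file="${2:-illumina-ica-admin-policy.json}"
  aws iam create-policy \
    --policy-name "$policy_name" \
    --policy-document "file://$policy_file"
}

# Usage, run from the directory that contains the policy file:
#   create_ica_policy
```

The command output includes the ARN of the new policy, which is needed when attaching the policy to the IAM user.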
An AWS IAM User is needed to create an Access Key for ICA to connect to the AWS S3 Bucket. The policy will be attached to the IAM user to grant the user the necessary permissions.
Refer to the Creating IAM users (console) documentation for instructions on creating an AWS IAM User via the AWS Management Console. Use the following configuration during the process:
(Optional) Set user name to "illumina_ica_admin"
Select the Programmatic access option for the type of access
Select Attach existing policies directly when setting the permissions, and choose the policy created in Create AWS IAM Policy
(Optional) Retrieve the Access Key ID and Secret Access Key by choosing to Download .csv
To create the IAM user and attach the policy via the AWS CLI, enter the following commands (AWS IAM users are global resources and do not require a region to be specified). The commands create an IAM user named illumina_ica_admin, retrieve your AWS account number, and then attach the policy to the user.
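The three steps described above (create the user, look up the account number, attach the policy) can be sketched as one function. The name create_ica_user is hypothetical, and the policy ARN is built on the assumption that the policy lives in the standard aws partition with no IAM path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the IAM user and attach the ICA policy.
# $1: user name, $2: policy name (both default to the names used above)
create_ica_user() {
  local user_name="${1:-illumina_ica_admin}"
  local policy_name="${2:-illumina-ica-admin-policy}"
  local account_id

  # 1. Create the IAM user.
  aws iam create-user --user-name "$user_name" || return 1

  # 2. Retrieve the AWS account number of the current credentials.
  account_id="$(aws sts get-caller-identity --query Account --output text)" || return 1

  # 3. Attach the customer-managed policy to the user.
  aws iam attach-user-policy \
    --user-name "$user_name" \
    --policy-arn "arn:aws:iam::${account_id}:policy/${policy_name}"
}

# Usage:
#   create_ica_user
```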
If the Access Key information was retrieved during the IAM user creation, skip this step.
Refer to the Managing access keys (console) AWS documentation for instructions on creating an AWS Access Key via the AWS Console. See the "To create, modify, or delete another IAM user's access keys (console)" sub-section.
Use the command below to create the Access Key for the illumina_ica_admin IAM user. Note that the SecretAccessKey is sensitive and should be stored securely. It is displayed only when this command is executed and cannot be recovered afterwards. If it is lost, a new access key must be created.
The AccessKeyId and SecretAccessKey values will be provided to ICA in the next step.
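A minimal sketch of the access key step, assuming the IAM user from the previous section exists. The wrapper name create_ica_access_key is hypothetical. The command prints the secret to the terminal, so run it somewhere the output is not logged.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around "aws iam create-access-key".
# The JSON response contains AccessKeyId and SecretAccessKey; the secret
# is shown only in this response and cannot be retrieved again.
create_ica_access_key() {
  local user_name="${1:-illumina_ica_admin}"
  aws iam create-access-key --user-name "$user_name"
}

# Usage:
#   create_ica_access_key
```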
Connecting your S3 bucket to ICA does not require any additional bucket policies.
However, if a bucket policy is required for use cases beyond ICA, you need to ensure that the bucket policy supports the essential permissions needed by ICA without inadvertently restricting its functionality.
Here is one such example:
Be sure to replace the following fields:
YOUR_BUCKET_NAME: Replace this field with the name of the S3 bucket you created for ICA.
YOUR_ACCOUNT_ID: Replace this field with your account ID number.
YOUR_IAM_USER: Replace this field with the name of your IAM user created for ICA.
In this example, the bucket policy denies all access to the bucket, with an exception for the IAM user that ICA uses to connect to the S3 bucket. The exception allows ICA to perform the S3 actions listed above, which it needs in order to function.
Additionally, the exception rule is applied to the STS federated user session principal associated with ICA. Since ICA leverages the AWS STS to provide temporary credentials that allow users to perform actions on the S3 bucket, it is crucial to include these STS federated user session principals in your policy's whitelist. Failing to do so could result in 403 Forbidden errors when users attempt to interact with the bucket's objects using the provided temporary credentials.
To connect your S3 account to ICA, you need to add a storage credential in ICA containing the Access Key ID and Secret Access Key created in the previous step. From the ICA home screen, navigate to System Settings > Storage > Credentials and click the +New button to create a new storage credential.
Provide a name for the storage credentials, ensure the type is set to "AWS user" and provide the Access Key ID and Secret Access Key.
Once the storage credential has been created, you can create a storage configuration that uses it. Refer to the instructions to Create a Storage Configuration for details.
ICA uses AssumeRole to copy and move objects from a bucket in an AWS account to another bucket in another AWS account. To allow cross account access to a bucket, the following policy statements must be added in the bucket policy:
Be sure to replace the following fields:
ASSUME_ROLE_ARN: Replace this field with the ARN of the cross account role you want to give permission to. Refer to the table below to determine which region-specific Role ARN should be used.
YOUR_BUCKET_NAME: Replace this field with the name of the S3 bucket you created for ICA.
The ARN of the cross-account role is specified in the Principal element of the policy statement.
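Bucket policy statements can be applied with the AWS CLI as well as the console. The sketch below uses two hypothetical helpers and assumes the full policy is saved in a local file. Because put-bucket-policy replaces the entire bucket policy, review the current policy first and merge the cross-account statements into it rather than uploading them on their own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the current bucket policy as JSON text.
# The call fails if the bucket has no policy yet.
show_bucket_policy() {
  aws s3api get-bucket-policy --bucket "$1" --query Policy --output text
}

# Hypothetical helper: replace the bucket policy with the contents of a file.
# $1: bucket name, $2: path to the merged policy document
apply_bucket_policy() {
  aws s3api put-bucket-policy --bucket "$1" --policy "file://$2"
}

# Usage:
#   show_bucket_policy YOUR_BUCKET_NAME > current-policy.json
#   # edit current-policy.json to add the cross-account statements, then:
#   apply_bucket_policy YOUR_BUCKET_NAME current-policy.json
```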
The following are common issues encountered when connecting an AWS S3 bucket through a storage configuration.
This error occurs when the event information of an existing bucket notification overlaps with the notifications ICA is trying to add. Amazon S3 event notifications allow overlapping event types only when their prefixes do not overlap. Depending on the conflict, the error appears as one of the following:
Volume Configuration cannot be provisioned: storage container is already set up for customer's own notification
Invalid parameters for volume configuration: found conflicting storage container notifications with overlapping prefixes
Failed to update bucket policy: Configurations overlap. Configurations on the same bucket cannot share a common event type
To fix the issue:
In the Amazon S3 Console, review your current S3 bucket's notification configuration and look for prefixes that overlap with your Storage Configuration's key prefix
Delete the existing notification that overlaps with your Storage Configuration's key prefix
ICA will perform a series of steps in the background to re-verify the connection to your bucket.
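The review step above can also be done from the command line. The following sketch uses two hypothetical helpers: one prints the bucket's notification configuration, the other checks whether two key prefixes overlap, meaning that one is a leading part of the other. It only compares prefixes; suffix filters and event types still need to be reviewed by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the notification configuration of a bucket.
show_bucket_notifications() {
  aws s3api get-bucket-notification-configuration --bucket "$1"
}

# Hypothetical helper: succeed if two key prefixes overlap, that is, if
# either one starts with the other. An empty prefix (no prefix filter)
# covers the whole bucket and therefore overlaps with everything.
prefixes_overlap() {
  case "$1" in "$2"*) return 0 ;; esac
  case "$2" in "$1"*) return 0 ;; esac
  return 1
}

# Usage:
#   show_bucket_notifications YOUR_BUCKET_NAME
#   prefixes_overlap "projects/" "projects/run1/" && echo "overlap"
```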
This error can occur when recreating a recently deleted storage configuration. To fix the issue, you have to delete the bucket notifications:
In the Amazon S3 Console select the bucket for which you need to delete the notifications from the list.
Choose Properties
Navigate to the Event Notifications section, select the check boxes for the event notifications named gds:objectcreated, gds:objectremoved and gds:objectrestore, and click Delete.
Wait 15 minutes for the storage to become available in ICA.
If you do not want to wait 15 minutes, you can delete the current storage configuration, delete the bucket notifications in the bucket and create a new storage configuration.
ICA Project Region | AWS Region |
---|---|
Region | Role ARN |
---|---|
Error Type | Error Message | Description/Fix |
---|---|---|