Tables

All tables created within Base are gathered on the Projects > your_project > Base > Tables page. New tables can be created and existing tables can be updated or deleted here.

Create a new Table

To create a new table, click Add table > New table on the Tables page. Tables can be created from scratch or from a template that was previously saved.

If you make a mistake in the order of columns when creating your table, then as long as you have not saved your table, you can switch to Edit as text to change the column order. The text editor can swap or move columns whereas the built-in editor can only delete columns or add columns to the end of the sequence. When editing in text mode, it is best practice to copy the content of the text editor to a notepad before you make changes because a corrupted syntax will result in the text being wiped or reverted when switching between text and non-text mode.

Once a table is saved it is no longer possible to edit the schema, only new fields can be added. The workaround is switching to text mode, copying the schema of the table to which you want to make modifications and paste it into a new empty table where the necessary changes can be made before saving.

Once created, do not try to modify your table column layout via the Query module as even though you can execute ALTER TABLE commands, the definitions and syntax of the table will go out of sync resulting in processing issues.

Empty Table

To create a table from scratch, complete the fields listed below and click the Save button. Once saved, a job will be created to create the table. To view table creation progress, navigate to the Activity page.

Table information

The table name is a required field and must be unique. The first character of the table must be a letter followed by letters, numbers or underscores. The description is optional.

References

Including or excluding references can be done by checking or un-checking the Include reference checkbox. These reference fields are not shown on the table creation page, but are added to the schema definition, which is visible after creating the table (Projects > your_project > Base > Tables > your_table > Schema definition). By including references, additional columns will be added to the schema (see next paragraph) which can contain references to the data on the platform:

  • data_reference: reference to the data element in the Illumina platform from which the record originates

  • data_name: original name of the data element in the Illumina platform from which the record originates

  • sample_reference: reference to the sample in the Illumina platform from which the record originates

  • sample_name: name of the sample in the Illumina platform from which the record originates

  • pipeline_reference: reference to the pipeline in the Illumina platform from which the record originates

  • pipeline_name: name of the pipeline in the Illumina platform from which the record originates

  • execution_reference: reference to the pipeline execution in the Illumina platform from which the record originates

  • account_reference: reference to the account in the Illumina platform from which the record originates

  • account_name: name of the account in the Illumina platform from which the record originates

Schema

In an empty table, you can create a schema by adding a field for each column of the table and defining it. The + Add field button is located to the right of the schema. At any time during the creation process, it is possible to switch to the edit as text mode and back. The text mode shows the JSON code, whereas the original view shows the fields in a table.

Each field requires:

  • a name – this has to be unique (*)

  • a type

    • String – collection of characters

    • Bytes – raw binary data

    • Integer – whole numbers

    • Float – fractional numbers

    • Numeric – any number (**)

    • Boolean – only options are “true” or “false”

    • Timestamp - Stores number of (milli)seconds passed since the Unix epoch

    • Date - Stores date in the format YYYY-MM-DD

    • Time - Stores time in the format HH:MI:SS

    • Datetime - Stores date and time information in the format YYYY-MM-DD HH:MI:SS

    • Record – has a child field

    • Variant - can store a value of any other type, including OBJECT and ARRAY

  • a mode

    • Required - Mandatory field

    • Nullable - Field is allowed to have no value

    • Repeated - Multiple values are allowed in this field (will be recognized as array in Snowflake)

Note (*) Do not use reserved Snowflake keywords such as left, right, sample, select, table,... (https://docs.snowflake.com/en/sql-reference/reserved-keywords) for your schema name as this will lead to SQL compilation errors.

Note (**) Defining the precision when creating tables with SQL is not supported as this will result in rounding issues.

From template

Users can create their own template by making a table which is turned into a template at Projects > your_project > Base > Tables > your_table > Save as template.

If a template is created and available/active, it is possible to create a new table based on this template. The table information and references follow the rules of the empty table but in this case the schema will be pre-filled. It is possible to still edit the schema that is based on the template.

Table information

Table status

The status of a table can be found at Projects > your_project > Base > Tables. The possible statuses are:

  • Available: Ready to be used, both with or without data

  • Pending: The system is still processing the table, there is probably a process running to fill the table with data

  • Deleted: The table is deleted functionally; it still exists and can be shown in the list again by clicking the “Show deleted tables” button

Additional Considerations

  • Tables created from empty data or from a template are the fastest available.

  • When copying a table with data, it can remain in a Pending for longer periods of time.

  • Clicking on the page's refresh button will update the list.

Table details

For any available table, the following details are shown:

  • Table information: Name, description, number of records and data size

  • Schema definition: An overview of the table schema, also available in text. Fields can be added to the schema but not deleted. For deleting fields: copy the schema as text and paste in a new empty table where the schema is still editable.

  • Preview: A preview of the table for the 50 first rows (when data is uploaded into the table)

  • Source Data: the files that are currently uploaded into the table. You can see the Load Status of the files which can be Prepare Started, Prepare Succeeded or Prepare Failed and finally Load Succeeded or Load Failed. You can change the order and visible columns by hovering over the column headers and clicking on the cog symbol.

Table actions

From within the details of a table it is possible to perform the following actions related to the table:

  • Copy: Create a copy from this table in the same or a different project. In order to copy to another project, data sharing of the original project should be enabled in the details of this project. The user also has to have access to both original and target project.

  • Export as file: Export this table as a CSV or JSON file. The exported file can be found in a project where the user has the access to download it.

  • Save as template: Save the schema or an edited form of it as a template.

  • Add data: Load additional data into the table manually. This can be done by selecting data files previously uploaded to the project, or by dragging and dropping files directly into the popup window for adding data to the table. It’s also possible to load data into a table manually or automatically via a pre-configured job. This can be done on the Schedule page.

  • Delete: Delete the table.

Manually importing data to your Table

To manually add data to your table, go to Projects > your_project > Base > Tables > your_table > +Add Data

Data selection

The data selection screen will show options to define the structure and location of your source data:

  • Write preference: Define if data can be written to the table only when the table is empty, if the data should be appended to the table or if the table should be overwrtitten.

  • Data format (required): Select the format of the data which you want to import. CSV(comma-separated), TSV (tab-separated) or JSON (JavaScript Object Notation).

  • Delimiter: Which delimiter is used in the delimiter separated file. If the required delimiter is not comma, tab or pipe, select custom and define the custom delimiter.

  • Custom delimiter: If a custom delimiter is used in the source data, it must be defined here.

  • Header rows to skip: The number of consecutive header rows (at the top of the table) to skip.

  • References: Choose which references must be added to the table.

Most of the advanced options are legacy functions and should not be used. The only exceptions are

  • Encoding: Select if the encoding is UTF-8 (any Unicode character) or ISO-8859-1 (first 256 Unicode characters).

  • Ignore unknown values: This applies to CSV-formatted files. You can use this function to handle optional fields without separators, provided that the missing fields are located at the end of the row. Otherwise, the parser can not detect the missing separator and will shift fields to the left, resulting in errors.

    • If headers are used: The columns that have matching fields are loaded, those that have no matching fields are loaded with NULL and remaining fields are discarded.

    • If no headers are used: The fields are loaded in order of occurrence and trailing missing fields are loaded with NULL, trailing additional fields are discarded.

At the bottom of the select data screen, you can select the data you manually want to upload. You can select local files, drop files via the browser or choose files from your project.

Data import progress

To see the status of your data import, go to Projects > your_project > Activity > Base Jobs where you will see a job of type Prepare Data which will have succeeded or failed. If it has failed, you can see the error message and details by double-clicking the base job. You can then take corrective actions if the input mismatched with the table design and try to run the import again (with a new copy of the file as each input file can only be used once)

If you need to cancel the import, you can do so while it is scheduled by navigating to the Base Jobs inventory and selecting the job followed by Abort.

List of table data sources

To see which data has been used to populate your table go to Projects > your_project > Base > Tables > your_table > Source Data. This will list all the source data files, even those that failed to be imported. You can not use these files anymore to import again to prevent double entries.

How to load array data in Base

Base Table schema definitions do not include an array type, but arrays can be ingested using either the Repeated mode for arrays containing a single type (ie, String), or the Variant type.

Last updated