This tutorial demonstrates how to use the ICA Python library packaged with the JupyterLab image for Bench Workspaces.
See the JupyterLab documentation for details about the JupyterLab Docker image provided by Illumina.
This tutorial shows how authentication to the ICA API works and how to search, upload, download, and delete project data from within a Bench Workspace. The Python code snippets are written to be run in a Jupyter Notebook.
Python modules
Navigate to Bench > Workspaces and click Enable to enable workspaces. Select +New Workspace to create a new workspace. Fill in the required details and select JupyterLab for the Docker image. Click Save and Start to open the workspace. The following snippets of code can be pasted into the workspace you've created.
This snippet defines the required Python modules for this tutorial:
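Before running the snippets, it can help to confirm that the modules they rely on are importable in the workspace. The following is a minimal sketch, not part of the ICA library: the module list is taken from the names used in the snippets on this page, and `missing_modules` is a hypothetical helper introduced only for illustration.

```python
import importlib.util

# Top-level modules referenced by the snippets in this tutorial.
REQUIRED_MODULES = ["os", "random", "string", "hashlib", "requests", "icav2"]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All required modules are available")
```

If anything is reported missing, the workspace is probably not running the JupyterLab image described above.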
These snippets show how to manage data in a project. Operations shown are:
Create a Project Data API client instance
List all data in a project
Create a data element in a project
Upload a file to a data element in a project
Download a data element from a project
Search for matching data elements in a project
Delete matching data elements in a project
# Retrieve project ID from the Bench workspace environment
projectId = os.environ['ICA_PROJECT']

# Create a Project Data API client instance
# (apiClient is the authenticated API client created during authentication)
projectDataApiInstance = project_data_api.ProjectDataApi(apiClient)
List Data
# List all data in a project, one page at a time
pageOffset = 0  # assumed to be the number of rows to skip
pageSize = 30
try:
    while True:
        # Fetch the next page on every iteration
        projectDataPagedList = projectDataApiInstance.get_project_data_list(project_id=projectId, page_size=str(pageSize), page_offset=str(pageOffset))
        for projectData in projectDataPagedList.items:
            print("Path: " + projectData.data.details.path + " - Type: " + projectData.data.details.data_type)
        pageOffset += len(projectDataPagedList.items)
        # Stop on an empty page or once every record has been listed
        if not projectDataPagedList.items or pageOffset >= projectDataPagedList.total_item_count:
            break
except icav2.ApiException as e:
    print("Exception when calling ProjectDataAPIApi->get_project_data_list: %s\n" % e)
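Offset-based paging follows a general pattern: request a page, process its items, advance the offset, and stop when the listing is exhausted. The sketch below illustrates that pattern in isolation. `iter_pages` and `fake_fetch` are hypothetical names, and the fake listing stands in for the real API so the example runs anywhere.

```python
def iter_pages(fetch_page, page_size=30):
    """Yield every item from an offset-paged listing.

    fetch_page(offset, size) is a hypothetical callable that returns
    (items, total_count) for the rows starting at `offset`.
    """
    offset = 0
    while True:
        items, total = fetch_page(offset, page_size)
        yield from items
        offset += len(items)
        # Stop on an empty page or once all rows have been seen.
        if not items or offset >= total:
            break


# Stand-in for the real API: 75 fake records served from a list.
records = ["file_%d.txt" % i for i in range(75)]


def fake_fetch(offset, size):
    return records[offset:offset + size], len(records)


listed = list(iter_pages(fake_fetch))
print(len(listed))
```

With the ICA client, `fetch_page` could wrap `get_project_data_list` and return the page's `items` and `total_item_count`, assuming `page_offset` counts rows to skip rather than pages.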
Create Data
# Create data element in a project
data = icav2.model.create_data.CreateData(name="test.txt", data_type="FILE")
try:
    projectData = projectDataApiInstance.create_data_in_project(projectId, create_data=data)
    # Keep the ID of the new element for the upload and download steps
    fileId = projectData.data.id
except icav2.ApiException as e:
    print("Exception when calling ProjectDataAPIApi->create_data_in_project: %s\n" % e)
Upload Data
## Upload a local file to a data element in a project
# Create a local file in a Bench workspace
filename = '/tmp/' + ''.join(random.choice(string.ascii_lowercase) for i in range(10)) + ".txt"
content = ''.join(random.choice(string.ascii_lowercase) for i in range(100))
with open(filename, "w") as f:
    f.write(content)
# Calculate MD5 hash (optional)
with open(filename, 'rb') as f:
    localFileHash = hashlib.md5(f.read()).hexdigest()
try:
    # Get Upload URL
    upload = projectDataApiInstance.create_upload_url_for_data(project_id=projectId, data_id=fileId)
    # Upload dummy file
    with open(filename, 'rb') as f:
        r = requests.put(upload.url, data=f.read())
except icav2.ApiException as e:
    print("Exception when calling ProjectDataAPIApi->create_upload_url_for_data: %s\n" % e)
# Delete local dummy file
os.remove(filename)
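The MD5 calculation above reads the whole file into memory, which is fine for a small dummy file. For larger files, a chunked hash is a common alternative. This is a minimal sketch using only the standard library; `md5_of_file` and the demo file are hypothetical and not part of the ICA library.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a small temporary file, using a tiny chunk size to force several reads.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"hello bench workspace")
    demo_path = tmp.name

demo_hash = md5_of_file(demo_path, chunk_size=4)
os.remove(demo_path)
print(demo_hash)
```

The same helper can be applied to the downloaded copy, so both sides of the comparison are computed the same way.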
Download Data
## Download a data element from a project
try:
    # Get Download URL
    download = projectDataApiInstance.create_download_url_for_data(project_id=projectId, data_id=fileId)
    # Download file
    filename = '/tmp/' + ''.join(random.choice(string.ascii_lowercase) for i in range(10)) + ".txt"
    r = requests.get(download.url)
    with open(filename, 'wb') as f:
        f.write(r.content)
    # Verify md5 hash against localFileHash from the upload step
    with open(filename, 'rb') as f:
        remoteFileHash = hashlib.md5(f.read()).hexdigest()
    if localFileHash != remoteFileHash:
        print("Error: MD5 mismatch")
    # Delete local dummy file
    os.remove(filename)
except icav2.ApiException as e:
    print("Exception when calling ProjectDataAPIApi->create_download_url_for_data: %s\n" % e)
Search for Data
# Search for matching data elements in a project
try:
    projectDataPagedList = projectDataApiInstance.get_project_data_list(project_id=projectId, full_text="test.txt")
    for projectData in projectDataPagedList.items:
        print("Path: " + projectData.data.details.path + " - ID: " + projectData.data.id + " - Type: " + projectData.data.details.data_type)
except icav2.ApiException as e:
    print("Exception when calling ProjectDataAPIApi->get_project_data_list: %s\n" % e)
Delete Data
# Delete matching data elements in a project
try:
    projectDataPagedList = projectDataApiInstance.get_project_data_list(project_id=projectId, full_text="test.txt")
    for projectData in projectDataPagedList.items:
        print("Deleting file " + projectData.data.details.path)
        projectDataApiInstance.delete_data(project_id=projectId, data_id=projectData.data.id)
except icav2.ApiException as e:
    print("Exception %s\n" % e)
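Because deletion acts on whatever the search returns, it may be worth narrowing the results first. The sketch below assumes, without confirmation from this tutorial, that a full-text search can return elements whose names only partially match. `exact_name_matches` and `fake_item` are hypothetical helpers; the fake items mimic only the `data.details.path` attribute used in the snippets above.

```python
import posixpath
from types import SimpleNamespace


def exact_name_matches(items, name):
    """Keep only items whose final path component equals `name`."""
    return [
        item for item in items
        if posixpath.basename(item.data.details.path) == name
    ]


def fake_item(path):
    """Build a stand-in object exposing data.details.path like a search result."""
    return SimpleNamespace(data=SimpleNamespace(details=SimpleNamespace(path=path)))


results = [
    fake_item("/test.txt"),
    fake_item("/backup/test.txt.bak"),
    fake_item("/runs/test.txt"),
]
to_delete = exact_name_matches(results, "test.txt")
print([item.data.details.path for item in to_delete])
```

Passing only the filtered list to the delete loop would leave elements such as `/backup/test.txt.bak` untouched.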
Base Operations
These snippets show how to get a connection to a base database and run an example query. Operations shown are:
Create a Python JDBC connection
Create a table
Insert data into a table
Query the table
Delete the table
See the Snowflake Python API documentation for details.
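The operations listed above follow the usual Python DB-API cursor pattern (connect, execute, fetch, close). The sketch below walks through that pattern using `sqlite3` from the standard library as a stand-in, so it runs without a Base database. It is not the Snowflake connection itself: the table name, columns, and values are hypothetical, and the SQL dialect and parameter placeholder style may differ on Snowflake.

```python
import sqlite3

# Stand-in connection: an in-memory SQLite database instead of a Base database.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# Create a table
cursor.execute("CREATE TABLE demo_samples (sample_id TEXT, read_count INTEGER)")

# Insert data into the table
cursor.executemany(
    "INSERT INTO demo_samples VALUES (?, ?)",
    [("S1", 100), ("S2", 250)],
)
connection.commit()

# Query the table
cursor.execute("SELECT sample_id, read_count FROM demo_samples ORDER BY sample_id")
rows = cursor.fetchall()
print(rows)

# Delete the table and close the connection
cursor.execute("DROP TABLE demo_samples")
connection.close()
```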
This snippet defines the required Python modules for this tutorial: