Cohort Analysis

From the Cohorts menu in the left-hand navigation, select a cohort created in Create Cohort to begin a cohort analysis.

Query Details

To access the query details, click the triangle next to Show Query Details. The query details display the selections used to create the cohort. To edit the selections, click the pencil icon in the top right.

Charts

  1. Charts will be open by default. If not, click Show Charts.

  2. Use the gear icon in the top-right to change viewable chart settings.

  3. There are four charts available to view summary counts of attributes within a cohort as histogram plots.

  4. Click Hide Charts to hide the histograms.

Single Subject Timeline View

  1. Display time-stamped events and observations for a single subject on a timeline. The timeline view is available only for subjects that have time-series data.

  2. The following attributes are displayed in the timeline view:

    • Diagnosed and Self-Reported Diseases:

      • Start and end dates

      • Progression vs. remission

    • Medication and Other Treatments:

      • Prescribed and self-medicated

      • Start date, end date, and dosage at every time point

  3. The timeline uses age (at diagnosis, at event, at measurement) as the x-axis and attribute name as the y-axis. If the birthdate is not recorded for a subject, the user can switch the x-axis to Date to visualize the data.

  4. In the default view, the timeline plots the first five disease entries and the first five drug/medication entries. Users can choose different attributes or change the order of existing attributes by clicking the “select attribute” button.

  5. The x-axis shows the person’s age in years, with data points initially displayed between ages 0 and 100. Users can zoom in by selecting a range, which limits the plot to data points within the selected age range.

  6. Each event is represented by a dot in the corresponding track. Events in the same track can be connected by lines to indicate the start and end period of an event.

Subjects

  1. By default, the Subjects tab is displayed.

  2. The Subjects tab, shown below Charts, lists all subjects matching your criteria, with a link to each subject by ID and other high-level information. Clicking a subject ID takes you to the data collected at the subject level.

  3. Search for a specific subject by typing the Subject ID into the Search Subjects text box.

  4. Get all details available on a subject by clicking the hyperlinked Subject ID in the Subject list.

To Exclude specific subjects from subsequent analysis, such as marker frequencies or gene-level aggregated views, you can uncheck the box at the beginning of each row in the subject list. You will then be prompted to save any exclusion(s).

You can Export the list of subjects either to your ICA Project's data folder or to your local disk as a TSV file for subsequent use. Once you have saved your exclusions, any export omits the excluded subjects. For more information, see Subject Export for Analysis in ICA Bench at the bottom of this page.

Remove a Subject

  1. Specific subjects can be removed from a Cohort.

  2. Select the Subjects tab.

  3. Subjects in the Cohort are checked by default.

  4. To remove specific subjects from a Cohort, uncheck the checkbox next to each subject you want to remove.

  5. Check box selections are maintained while browsing through the pages of the subject list.

  6. Click Save Cohort to save the subjects you would like to exclude.

  7. The excluded subjects will no longer be counted in any analysis visualization.

  8. The specific excluded subjects will be saved for the Cohort.

  9. To add the subjects back to the Cohort, check their checkboxes again and click Save Cohort.

Structural variant aggregation: Marker Frequency analysis

For each individual cohort, display a table of all observed SVs that overlap with a given gene.

Marker Frequency

  1. Click the Marker Frequency tab, then click the Gene Expression tab.

  2. Down-regulated genes are displayed in blue and up-regulated genes are displayed in red.

  3. The chart displays the frequency in the Cohort together with the Matching number/Total.

  4. Genes can be searched by using the Search Genes text box.

Genes

  1. You are brought to the Gene tab under the Gene Summary sub-tab.

  2. Select a Gene by typing the gene name into the Search Genes text box.

  3. A Gene Summary will be displayed that lists information and links to public resources about the selected gene.

  4. A cytogenetic map will be displayed for the selected gene; a vertical orange bar marks the gene's location on the chromosome.

  5. Click the Variants tab, then click Show legend and filters if the legend does not open by default.

  6. Below the interactive legend, you see a set of analysis tracks: Needle Plot, Primate AI, Pathogenic variants, and Exons.

  7. The Needle Plot allows toggling the plot by gnomAD frequency and Sample Count. Select Sample Count in the Plot by legend above the plot. You can also filter the plot to only show variants above/below a certain cut-off for gnomAD frequency (in percent) or absolute sample count.

  8. Click on a variant's needle pin to view details about the variant from public resources and counts of variants in the selected cohort by disease category. If you want to view all subjects that carry the given variant, click on the sample count link, which will take you to the list of subjects (see above).

  9. Use the Exon zoom bar from each end of the Amino Acid sequence to zoom in on the gene domain to better separate observations.

  10. The Pathogenic Variant Track shows pop-up details when you hover over the purple triangles: pathogenicity calls, phenotypes, submitter, and a link to the ClinVar entry.

  11. Below the Needle Plot is a full listing of the variants displayed in the Needle Plot visualization.

    • The Display only variants shown in the plot above toggle (enabled by default) syncs the table with the Needle Plot. When the toggle is on, the table displays only the variants shown in the Needle Plot, applying all active filters (e.g., variant type, somatic/germline, sample count). When the toggle is off, the table displays all reported variants and table-based filters can be used.

    • Export to CSV: When the views are synchronized (toggle on), the filtered list of variants can be exported to a CSV file for further analysis.

  12. The Phenotypes tab shows a stacked horizontal bar chart that displays the molecular breakdown (disease type vs. gene) and subject count for the selected gene.

  13. The Gene Expression tab shows known gene expression data from tissue types in GTEx.

  14. The Genetic Burden Test is available for de novo variants only.
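
The cut-off filtering available on the Needle Plot (step 7) can be approximated outside the interface, for example on an exported variant list. The following is a minimal sketch only: the field names `gnomad_pct` and `sample_count`, the helper `filter_variants`, and the example records are all hypothetical and do not reflect the actual export layout.

```python
def filter_variants(variants, max_gnomad_pct=None, min_sample_count=None):
    """Keep variants at or below a gnomAD frequency cut-off (in percent)
    and at or above an absolute sample count. A cut-off of None is ignored."""
    kept = []
    for variant in variants:
        if max_gnomad_pct is not None and variant["gnomad_pct"] > max_gnomad_pct:
            continue
        if min_sample_count is not None and variant["sample_count"] < min_sample_count:
            continue
        kept.append(variant)
    return kept


# Hypothetical records; real exports will use different column names.
variants = [
    {"id": "var1", "gnomad_pct": 0.01, "sample_count": 12},
    {"id": "var2", "gnomad_pct": 5.0, "sample_count": 40},
    {"id": "var3", "gnomad_pct": 0.5, "sample_count": 1},
]

# Rare in gnomAD (at most 1%) and seen in at least 2 samples of the cohort.
rare_and_recurrent = filter_variants(variants, max_gnomad_pct=1.0, min_sample_count=2)
```

Whether a cut-off value itself is included or excluded is a choice made for this sketch; the interface may treat the boundary differently.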

Correlation

For every correlation, you can view the subjects contained in each count by selecting the count on the bubble or the count on the X-axis or Y-axis.

Clinical vs. Clinical Attribute Comparison – Bubble Plot

  1. Click the Correlation Tab.

  2. In X-axis category, select Clinical.

  3. In X-axis Attribute, select a clinical attribute.

  4. In Y-axis category, select Clinical.

  5. In Y-Axis Attribute, select another clinical attribute.

  6. You will be shown a bubble plot comparing the first clinical attribute on the x-axis to the second attribute on the y-axis.

  7. The size of each bubble corresponds to the number of subjects falling into those categories.
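
Conceptually, the bubble sizes amount to a cross-tabulation of the two selected attributes. The sketch below illustrates that idea on plain dictionaries; the attribute names (`sex`, `smoking_status`), the helper `bubble_counts`, and the example subjects are hypothetical, and skipping subjects with a missing value is an assumption rather than documented ICA Cohorts behavior.

```python
from collections import Counter


def bubble_counts(subjects, x_attr, y_attr):
    """Count subjects per (x value, y value) pair; each count is one bubble's size."""
    counts = Counter()
    for subject in subjects:
        x_value = subject.get(x_attr)
        y_value = subject.get(y_attr)
        # Assumption: subjects missing either attribute are left out of the plot.
        if x_value is None or y_value is None:
            continue
        counts[(x_value, y_value)] += 1
    return counts


# Hypothetical example data.
subjects = [
    {"id": "S1", "sex": "F", "smoking_status": "never"},
    {"id": "S2", "sex": "F", "smoking_status": "never"},
    {"id": "S3", "sex": "M", "smoking_status": "current"},
    {"id": "S4", "sex": "M"},
]

counts = bubble_counts(subjects, "sex", "smoking_status")
for (x_value, y_value), size in sorted(counts.items()):
    print(x_value, y_value, size)
```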

Molecular vs. Molecular Attribute Comparison – Bubble Plot

To see a breakdown of Somatic Mutations vs. RNA Expression levels, perform the following steps:

Note: This comparison is for a Cancer case.

  1. Click the Correlation Tab.

  2. In X-axis category, select Somatic.

  3. In X-axis Attribute, select a gene.

  4. In Y-axis category, select RNA expression.

  5. In Y-Axis Attribute, type a gene name and leave Reference Type set to NORMAL.

  6. Click Continuous to see violin plots of compared variables.

Clinical vs. Molecular Attribute Comparison – Bubble Plot

Note: This comparison is for a Cancer case.

  1. Click the Correlation Tab.

  2. In X-axis category, select Somatic.

  3. In X-axis Attribute, type a gene name.

  4. In Y-axis category, select Clinical.

  5. In Y-Axis Attribute, select a clinical attribute.

Molecular Breakdown

  1. Click the Molecular Breakdown Tab.

  2. In Enter a clinical Attribute, select a clinical attribute.

  3. In Enter a gene, select a gene by typing a gene name.

  4. A stacked bar chart is displayed, with the values of the selected clinical attribute on the Y-axis.

  5. For each attribute value, the bar represents the percentage of Subjects with RNA Expression, Somatic Mutation, and Multiple Alterations.
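
As a rough illustration of what each stacked bar represents, the sketch below computes the share of subjects per alteration category for every value of a clinical attribute. It is an assumption-laden example: the keys `stage` and `alteration`, the helper `molecular_breakdown`, the extra "None" category, and the choice of denominator (all subjects sharing the attribute value) are hypothetical and may differ from how ICA Cohorts calculates the chart.

```python
from collections import Counter, defaultdict


def molecular_breakdown(subjects, clinical_attr):
    """Return, per clinical attribute value, the percentage of subjects
    in each alteration category (one stacked bar per attribute value)."""
    per_value = defaultdict(Counter)
    for subject in subjects:
        per_value[subject[clinical_attr]][subject["alteration"]] += 1

    breakdown = {}
    for value, category_counts in per_value.items():
        # Assumed denominator: every subject with this attribute value.
        total = sum(category_counts.values())
        breakdown[value] = {
            category: 100.0 * count / total
            for category, count in category_counts.items()
        }
    return breakdown


# Hypothetical example data.
subjects = [
    {"id": "S1", "stage": "I", "alteration": "Somatic Mutation"},
    {"id": "S2", "stage": "I", "alteration": "Somatic Mutation"},
    {"id": "S3", "stage": "I", "alteration": "RNA Expression"},
    {"id": "S4", "stage": "I", "alteration": "None"},
    {"id": "S5", "stage": "II", "alteration": "Multiple Alterations"},
]

breakdown = molecular_breakdown(subjects, "stage")
```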

Note: For each of the aforementioned bubble plots, you can view the list of subjects by following the link under each subject count associated with an individual bubble or axis label. This takes you to the subject list view (see above).

CNV

If there is Copy Number Variant data in the cohort:

  1. Click the CNV tab.

  2. A graph shows the CNV Sample Percentage on the Y-axis and Chromosomes on the X-axis.

  3. Any value above zero is a copy number gain, and any value below zero is a copy number loss.

  4. Click Chromosome: to select a specific chromosome position.

Subject Export for Analysis in ICA Bench

ICA allows for integrated analysis in a computation workspace. You can export your cohort definitions and, in combination with molecular data in your ICA Project Data, perform analyses such as a GWAS.

  1. Confirm the VCF data for your analysis is in ICA Project Data.

  2. From within your ICA Project, start a Bench Workspace. See Bench for more details.

  3. Navigate back to ICA Cohorts.

  4. Create a Cohort of subjects of interest using Create a Cohort.

  5. On the Subjects tab, click Export subjects... at the top right of the subject list. The file can be downloaded through the browser or saved to ICA Project Data.

  6. We suggest using export ...to Data Folder for immediate access to this data in Bench or other areas of ICA.

  7. If your research requires another cohort, repeat steps 4 to 6 for it.

  8. Navigate to the Bench workspace created in the second step.

  9. After the workspace has started up, click Access.

  10. Find the /Project/ folder in the Workspace file navigation.

  11. This folder will contain your cohort files created along with any pipeline output data needed for your workspace analysis.
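
Once the export is available in the workspace, the TSV file can be read with standard Python tooling. The following is a minimal sketch, not a documented ICA interface: the helper `read_subject_ids` and the column header `subject_id` are assumptions, so check the header row of your actual export before relying on it. An in-memory string stands in for the exported file so the example runs on its own.

```python
import csv
import io


def read_subject_ids(tsv_file, id_column="subject_id"):
    """Return the subject IDs from a tab-separated file object.

    `id_column` is a hypothetical header name; adjust it to match the export.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [row[id_column] for row in reader if row.get(id_column)]


# Hypothetical stand-in for an exported cohort file.
example_tsv = io.StringIO("subject_id\tsex\nS1\tF\nS2\tM\n")
subject_ids = read_subject_ids(example_tsv)
print(subject_ids)
```

In a Bench workspace, you would instead open the exported file from the /Project/ folder, for example with `open(path, newline="")`, where `path` points to the file name you chose during export.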
