Cohort Analysis
From the Cohorts menu in the left-hand navigation, select a cohort created in Create Cohort to begin a cohort analysis.
Query Details
To access the query details, click the triangle next to Show Query Details. The query details display the selections used to create the cohort. To edit the selections, click the pencil icon in the top right.
Charts
Charts will be open by default. If not, click Show Charts.
Use the gear icon in the top right to change viewable chart settings.
There are four charts available to view summary counts of attributes within a cohort as histogram plots.
Click Hide Charts to hide the histograms.
Single Subject Timeline View:
Display time-stamped events and observations for a single subject on a timeline. The timeline view is available only for subjects that have time-series data.
The following attributes are displayed in the timeline view:
• Diagnosed and Self-Reported Diseases:
  • Start and end dates
  • Progression vs. remission
• Medication and Other Treatments:
  • Prescribed and self-medicated
  • Start date, end date, and dosage at every time point
The timeline uses age (at diagnosis, at event, at measurement) as the x-axis and attribute name as the y-axis. If the birthdate is not recorded for a subject, the user can switch to Date to visualize data.
In the default view, the timeline shows the first five disease attributes and the first five drug/medication attributes in the plot. Users can choose different attributes or change the order of existing attributes by clicking the “select attribute” button.
The x-axis shows the person’s age in years, with data points initially displayed between ages 0 and 100. Users can zoom in by selecting a range, which limits the plot to data points within the selected age range.
Each event is represented by a dot in the corresponding track. Events in the same track can be connected by lines to indicate the start and end period of an event.
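As a rough illustration of the x-axis behavior described above, the following sketch computes an age-based x-value and falls back to the event date when no birthdate is recorded. The function names and the 365.25-day year are assumptions made for illustration; they are not taken from ICA Cohorts.

```python
from datetime import date
from typing import Optional, Union


def age_in_years(birthdate: date, event_date: date) -> float:
    """Approximate age at an event in fractional years (hypothetical helper)."""
    return (event_date - birthdate).days / 365.25


def timeline_x_value(birthdate: Optional[date], event_date: date) -> Union[float, date]:
    """Use age when the birthdate is known; otherwise fall back to the date."""
    if birthdate is None:
        return event_date
    return round(age_in_years(birthdate, event_date), 1)


# A subject with a recorded birthdate is plotted by age ...
x_with_birthdate = timeline_x_value(date(2000, 1, 1), date(2010, 1, 1))
# ... and a subject without one is plotted by calendar date.
x_without_birthdate = timeline_x_value(None, date(2010, 1, 1))
```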
Subjects
By default, the Subjects tab is displayed.
The Subjects tab, with a list of all subjects matching your criteria, is displayed below Charts, with a link to each subject by ID and other high-level information. Clicking a subject ID brings you to the data collected at the subject level.
Search for a specific subject by typing the Subject ID into the Search Subjects text box.
Get all details available on a subject by clicking the hyperlinked Subject ID in the subject list.
To Exclude specific subjects from subsequent analysis, such as marker frequencies or gene-level aggregated views, you can uncheck the box at the beginning of each row in the subject list. You will then be prompted to save any exclusion(s).
You can Export the list of subjects either to your ICA Project's data folder or to your local disk as a TSV file for subsequent use. Any export will omit subjects that you excluded after you saved those changes. For more information, see Subject Export for Analysis in ICA Bench at the bottom of this page.
Remove a Subject
Specific subjects can be removed from a Cohort.
Select the Subjects tab.
Subjects in the Cohort are checked by default.
To remove a specific subject from a Cohort, uncheck the checkbox next to each subject you want to remove.
Checkbox selections are maintained while browsing through the pages of the subject list.
Click Save Cohort to save the subjects you would like to exclude.
The excluded subjects will no longer be counted in any analysis visualizations.
The excluded subjects will be saved for the Cohort.
To add the subjects back to the Cohort, check their checkboxes again and click Save Cohort.
Structural variant aggregation: Marker Frequency analysis
For each individual cohort, display a table of all observed SVs that overlap with a given gene.
Marker Frequency
Click the Marker Frequency tab, then click the Gene Expression tab.
Down-regulated genes are displayed in blue and up-regulated genes are displayed in red.
The frequency in the Cohort is displayed, and the Matching number/Total is also displayed in the chart.
Genes can be searched by using the Search Genes text box.
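The relationship between the displayed frequency and the Matching number/Total can be sketched as a simple ratio. This is an assumption about how the two displayed values relate, not a statement of how ICA Cohorts computes them, and `marker_frequency` is a hypothetical name.

```python
def marker_frequency(matching: int, total: int) -> float:
    """Fraction of cohort subjects matching a marker (assumed: matching / total)."""
    if total <= 0:
        raise ValueError("total must be a positive subject count")
    if not 0 <= matching <= total:
        raise ValueError("matching must be between 0 and total")
    return matching / total


# Example: 5 matching subjects out of 20 in the cohort.
frequency = marker_frequency(5, 20)
label = f"{frequency:.0%} ({5}/{20})"
```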
Genes
You are brought to the Gene tab under the Gene Summary sub-tab.
Select a gene by typing the gene name into the Search Genes text box.
A Gene Summary will be displayed that lists information and links to public resources about the selected gene.
A cytogenetic map will be displayed based on the selected gene; a vertical orange bar represents the gene's location on the chromosome.
Click the Variants tab and Show legend and filters if it does not open by default.
Below the interactive legend, you see a set of analysis tracks: Needle Plot, Primate AI, Pathogenic variants, and Exons.
The Needle Plot allows toggling the plot by gnomAD frequency and Sample Count. Select Sample Count in the Plot by legend above the plot. You can also filter the plot to only show variants above/below a certain cut-off for gnomAD frequency (in percent) or absolute sample count.
Click on a variant's needle pin to view details about the variant from public resources and counts of variants in the selected cohort by disease category. If you want to view all subjects that carry the given variant, click on the sample count link, which will take you to the list of subjects (see above).
Use the Exon zoom bar from each end of the Amino Acid sequence to zoom in on the gene domain to better separate observations.
The Pathogenic Variant track shows pop-up details with pathogenicity calls, phenotypes, submitter, and a link to the ClinVar entry when you hover over the purple triangles.
Below the needle plot is a full listing of the variants displayed in the needle plot visualization.
The Display only variants shown in the plot above toggle (enabled by default) syncs the table with the Needle Plot. When the toggle is on, the table will display only the variants shown in the Needle Plot, applying all active filters (e.g., variant type, somatic/germline, sample count). When the toggle is off, all reported variants are displayed in the table and table-based filters can be used.
Export to CSV: When the views are synchronized (toggle on), the filtered list of variants can be exported to a CSV file for further analysis.
The Phenotypes tab shows a stacked horizontal bar chart that displays the molecular breakdown (disease type vs. gene) and subject count for the selected gene.
The Gene Expression tab shows known gene expression data from tissue types in GTEx.
The Genetic Burden Test is available for de novo variants only.
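The Needle Plot cut-off filter described earlier in this section (gnomAD frequency in percent, or absolute sample count) can be approximated offline, for example on an exported variant list. The sketch below is illustrative only: the field names `gnomad_pct` and `sample_count`, the function name, and the inclusive comparison are all assumptions.

```python
def filter_needles(variants, field, cutoff, keep="above"):
    """Keep variants whose `field` is at/above or at/below `cutoff`.

    Whether the real filter includes the cut-off value itself is an assumption.
    """
    if keep not in ("above", "below"):
        raise ValueError("keep must be 'above' or 'below'")
    if keep == "above":
        return [v for v in variants if v[field] >= cutoff]
    return [v for v in variants if v[field] <= cutoff]


# Made-up example records; real exports may use different column names.
variants = [
    {"id": "var1", "gnomad_pct": 0.01, "sample_count": 12},
    {"id": "var2", "gnomad_pct": 2.5, "sample_count": 3},
    {"id": "var3", "gnomad_pct": 0.5, "sample_count": 7},
]

# Variants with a gnomAD frequency of at most 1 percent.
rare = filter_needles(variants, "gnomad_pct", 1.0, keep="below")
# Variants seen in at least 5 samples of the cohort.
recurrent = filter_needles(variants, "sample_count", 5, keep="above")
```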
Correlation
For every correlation, subjects contained in each count can be viewed by selecting the count on the bubble or the count on the X-axis and Y-axis.
Clinical vs. Clinical Attribute Comparison – Bubble Plot
Click the Correlation tab.
In X-axis category, select Clinical.
In X-axis Attribute, select a clinical attribute.
In Y-axis category, select Clinical.
In Y-Axis Attribute, select another clinical attribute.
You will be shown a bubble plot comparing the first clinical attribute on the x-axis to the second attribute on the y-axis.
The size of each bubble corresponds to the number of subjects falling into those categories.
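To make the bubble sizes concrete, the sketch below counts subjects per combination of two clinical attributes, which is the quantity each bubble represents. The subject records and attribute names are invented for illustration, and skipping subjects with a missing attribute is an assumption.

```python
from collections import Counter


def bubble_counts(subjects, x_attr, y_attr):
    """Count subjects per (x value, y value) pair; each count is one bubble size."""
    return Counter(
        (s[x_attr], s[y_attr])
        for s in subjects
        if x_attr in s and y_attr in s  # assumption: skip incomplete records
    )


# Hypothetical subject records.
subjects = [
    {"id": "S1", "sex": "F", "stage": "II"},
    {"id": "S2", "sex": "F", "stage": "II"},
    {"id": "S3", "sex": "M", "stage": "II"},
    {"id": "S4", "sex": "M", "stage": "IV"},
    {"id": "S5", "sex": "F"},  # no stage recorded
]

counts = bubble_counts(subjects, "sex", "stage")
```

Here `counts[("F", "II")]` is 2, so the bubble at sex F and stage II would be the largest of the three.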
Molecular vs. Molecular Attribute Comparison – Bubble Plot
To see a breakdown of Somatic Mutations vs. RNA Expression levels, perform the following steps:
Note: this comparison is for a cancer case.
Click the Correlation tab.
In X-axis category, select Somatic.
In X-axis Attribute, select a gene.
In Y-axis category, select RNA expression.
In Y-Axis Attribute, type a gene and leave Reference Type as NORMAL.
Click Continuous to see violin plots of the compared variables.
Clinical vs. Molecular Attribute Comparison – Bubble Plot
Note: this comparison is for a cancer case.
Click the Correlation tab.
In X-axis category, select Somatic.
In X-axis Attribute, type a gene name.
In Y-axis category, select Clinical.
In Y-Axis Attribute, select a clinical attribute.
Molecular Breakdown
Click the Molecular Breakdown tab.
In Enter a clinical Attribute, select a clinical attribute.
In Enter a gene, select a gene by typing a gene name.
You are shown a stacked bar chart with the values of the selected clinical attribute on the Y-axis.
For each attribute value, the bar represents the % of subjects with RNA Expression, Somatic Mutation, and Multiple Alterations.
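The percentages behind such a stacked bar chart can be sketched as follows. This is a minimal illustration under stated assumptions: the `alteration` field, the sample records, and the choice of denominator (all subjects sharing the attribute value, including those with no alteration) are hypothetical and may differ from what ICA Cohorts does.

```python
from collections import Counter, defaultdict


def molecular_breakdown(subjects, attribute):
    """Percent of subjects per alteration category, for each attribute value."""
    per_value = defaultdict(Counter)
    for s in subjects:
        per_value[s[attribute]][s["alteration"]] += 1
    breakdown = {}
    for value, counter in per_value.items():
        total = sum(counter.values())  # assumed denominator: all subjects with this value
        breakdown[value] = {cat: 100.0 * n / total for cat, n in counter.items()}
    return breakdown


# Hypothetical records; "None" marks subjects without an alteration in the gene.
subjects = [
    {"stage": "II", "alteration": "RNA Expression"},
    {"stage": "II", "alteration": "RNA Expression"},
    {"stage": "II", "alteration": "Somatic Mutation"},
    {"stage": "II", "alteration": "None"},
    {"stage": "IV", "alteration": "Multiple Alterations"},
]

breakdown = molecular_breakdown(subjects, "stage")
```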
Note: for each of the aforementioned bubble plots, you can view the list of subjects by following the link under each subject count associated with an individual bubble or axis label. This will take you to the list of subjects view (see above).
CNV
If there is Copy Number Variant data in the cohort:
Click the CNV tab.
A graph will show the CNV Sample Percentage on the Y-axis and Chromosomes on the X-axis.
Any value above zero is a copy number gain, and any value below zero is a copy number loss.
Click Chromosome: to select a specific chromosome position.
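The sign convention of the CNV graph can be expressed as a small helper. The function name and the "neutral" label for a value of exactly zero are assumptions added for illustration.

```python
def classify_cnv(sample_percentage: float) -> str:
    """Map a plotted CNV value to gain/loss by its sign, as described above."""
    if sample_percentage > 0:
        return "gain"
    if sample_percentage < 0:
        return "loss"
    return "neutral"  # assumption: the source does not name the zero case


# Hypothetical plotted values.
calls = [classify_cnv(v) for v in (12.5, -3.0, 0.0)]
```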
Subject Export for Analysis in ICA Bench
ICA allows for integrated analysis in a computation workspace. You can export your cohort definitions and, in combination with molecular data in your ICA Project Data, perform, for example, a GWAS analysis.
Confirm the VCF data for your analysis is in ICA Project Data.
From within your ICA Project, start a Bench Workspace. See Bench for more details.
Navigate back to ICA Cohorts.
Create a Cohort of subjects of interest using Create a Cohort.
From the Subjects tab, click Export subjects... at the top right of the subject list. The file can be downloaded to the browser or to ICA Project Data.
We suggest using export ...to Data Folder for immediate access to this data in Bench or other areas of ICA.
Create another cohort if needed for your research and repeat the last three steps.
Navigate to the Bench workspace created in the second step.
After the workspace has started up, click Access.
Find the /Project/ folder in the Workspace file navigation.
This folder will contain your cohort files along with any pipeline output data needed for your workspace analysis.
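Once the exported TSV file is visible in the workspace, it can be read with standard tooling. The sketch below uses Python's built-in csv module; the column name `subject_id` and the sample content are assumptions, so check the header of your actual export before relying on them.

```python
import csv
import io


def read_subject_ids(handle, id_column="subject_id"):
    """Return the subject IDs from a tab-separated export (column name assumed)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[id_column] for row in reader]


# Stand-in for an exported file; in a workspace you would instead open the
# exported TSV from the /Project/ folder, e.g. with open(path, newline="").
sample_export = "subject_id\tsex\nS001\tF\nS002\tM\n"
subject_ids = read_subject_ids(io.StringIO(sample_export))
```

The resulting list of IDs can then be used to select the matching samples from the VCF data for downstream analysis.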