Public Data Sets


Last updated 1 year ago


ICA Cohorts comes preloaded with a variety of publicly accessible data sets that cover multiple disease areas and also include healthy individuals.

| Data set | Samples | Diseases/Phenotypes | Reference |
| --- | --- | --- | --- |
| 1kGP-DRAGEN | 3202 WGS: 2504 original samples plus 698 related samples | Presumed healthy | DRAGEN reanalysis of the 1000 Genomes Dataset |
| DDD | 4293 (3664 affected), de novo variants only | Developmental disorders | McRae et al., Nature 542:433-438 |
| EPI4K | 356, de novo variants only | Epilepsy | Epi4K Consortium, Nature 501:217-221 |
| ASD Cohorts | 6786 (4266 affected), de novo variants only | Autism spectrum disorder | Iossifov et al., Neuron 74:285-299; Iossifov et al., Nature 498:216-221; O'Roak et al., Nature 485:246-250; Sanders et al., Nature 485:237-241; Sanders et al., Neuron 87:1215-1233; De Rubeis et al., Nature 515:209-215 |
| De Ligt et al. | 100, de novo variants only | Intellectual disability | De Ligt et al., N Engl J Med 367:1921-1929 |
| Homsy et al. | 1213, de novo variants only | Congenital heart disease (HP:0030680) | Homsy et al., Science 350:1262-1266 |
| Lelieveld et al. | 820, de novo variants only | Intellectual disability | Lelieveld et al., Nature Neuroscience 19:1194-1196 |
| Rauch et al. | 51, de novo variants only | Intellectual disability | Rauch et al., Lancet 380:1674-1682 |
| Rare Genomes Project | 315 WES (112 pedigrees) | Various | https://raregenomes.org/ |
| TCGA | ca. 4200 WES, ca. 4000 RNAseq | 12 tumor types | https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga |
| GEO | RNAseq | Autoimmune disorders, incl. asthma, arthritis, SLE, MS, Crohn's disease, psoriasis, Sjögren's syndrome | For GEO/GSE study identifiers, please refer to the in-product list of studies |
| GEO | RNAseq | Kidney diseases | For GEO/GSE study identifiers, please refer to the in-product list of studies |
| GEO | RNAseq | Central nervous system diseases | For GEO/GSE study identifiers, please refer to the in-product list of studies |
| GEO | RNAseq | Parkinson's disease | For GEO/GSE study identifiers, please refer to the in-product list of studies |