Tools and resources list

Tool or resource Description Related pages Registry
1+ Million Genomes (1+MG) The 1+ Million Genomes (1+MG) initiative aims to enable secure access to genomics and the corresponding clinical data across Europe for better research, personalised healthcare and health policy making. Since the Digital Day 2018, 25 EU countries, the UK and Norway have signed the Member States' declaration on stepping up efforts towards creating a European data infrastructure for genomic data and implementing common national rules enabling federated data access. The initiative forms part of the EU's agenda for the Digital Transformation of Health and Care and is aligned with the goals of the European Health Data Space. Training
ACE Cohort Asymptomatic COVID-19 in Education (ACE) Cohort Data sources Training
ADA-M Responsible sharing of biomedical data and biospecimens via the Automatable Discovery and Access Matrix (ADA-M). The Automatable Discovery and Access Matrix (ADA-M) provides a standardized way to unambiguously represent the conditions related to data discovery and access. By adopting ADA-M, data custodians can generally describe what their data are (the Header section), who can access them (the Permissions section), terms related to their use (the Terms section), and special conditions (the Meta-Conditions). By doing so, data custodians can participate in data sharing and collaboration by making meta information about their data computer-readable and hence directly available for digital communication, searching and automation activities. Data sources Tool info
ANNOVAR ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes. Data analysis Tool info
ArrayExpress ArrayExpress is a database of functional genomics experiments that can be queried and the data downloaded. It includes gene expression data from microarray and high throughput sequencing studies. Data is collected to MIAME and MINSEQE standards. Experiments are submitted directly to ArrayExpress or are imported from the NCBI GEO database. Data sources Tool info Standards/Databases Training
Arvados With Arvados, bioinformaticians run and scale compute-intensive workflows, developers create biomedical applications, and IT administrators manage large compute and storage resources. Data analysis
Beacon Beacon v2 is a protocol/specification established by the Global Alliance for Genomics and Health initiative (GA4GH) that defines an open standard for federated discovery of genomic data and associated information in biomedical research and clinical applications. The Beacon v2 standard - a major update from the original GA4GH Beacon first proposed in 2014 - was accepted by GA4GH in April 2022. Beacon v2 consists of two components, the Framework and the Models. The Framework contains the format for the requests and responses, whereas the Models define the structure of the biological data response. Data sources Tool info Standards/Databases Training
Beyond 1 Million Genomes (B1MG) The Beyond 1 Million Genomes (B1MG) project is helping to create a network of genetic and clinical data across Europe. The project provides coordination and support to the 1+ Million Genomes Initiative (1+MG). This initiative is a commitment of 24 EU countries, the UK and Norway to give cross-border access to one million sequenced genomes by 2022.
BioGRID BioGRID is a comprehensive biomedical repository for curated protein, genetic and chemical interactions Data analysis Tool info Standards/Databases
BioSamples BioSamples stores and supplies descriptions and metadata about biological samples used in research and development by academia and industry. Samples are either 'reference' samples (e.g. from 1000 Genomes, HipSci, FAANG) or have been used in an assay database such as the European Nucleotide Archive (ENA) or ArrayExpress. It provides links to assays and specific samples, and accepts direct submissions of sample information. Data sources Tool info Standards/Databases Training
Bismark Bismark is a program to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step. Data analysis Tool info Training
Bitbucket Git based code hosting and collaboration tool, built for teams. Data analysis Standards/Databases
Bowtie2 Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. Data analysis Tool info Training
BWA BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome. Data analysis Tool info Training
Canu Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing. Data analysis Tool info
CESSDA Data Catalogue (CDC) CDC is a one-stop shop for searching and finding European social science data. Data sources Tool info Standards/Databases
CESSDA Vocabulary Service CESSDA Vocabulary Service enables users to discover, browse, and download controlled vocabularies in a variety of languages. The service is provided by the Consortium of European Social Science Data Archives (CESSDA). The majority of the source (English) vocabularies included in the service have been created by the DDI Alliance. The Data Documentation Initiative (DDI) is an international standard for describing data produced by surveys and other observational methods in the social, behavioural, economic, and health sciences. Standards/Databases
ClustalW ClustalW is a progressive multiple sequence alignment tool to align a set of sequences by repeatedly aligning pairs of sequences and previously generated alignments. Data analysis Tool info Training
Common Workflow Language (CWL) An open standard for describing workflows that are built from command line tools. Provenance Standards/Databases Training
COMPSs COMP Superscalar (COMPSs) is a task-based programming model which aims to ease the development of applications for distributed infrastructures, such as large High-Performance clusters (HPC), clouds and container managed clusters. Provenance Tool info
Covid-19 data portal The COVID-19 Data Portal enables researchers to upload, access and analyse COVID-19 related reference data and specialist datasets. The aim of the COVID-19 Data Portal is to facilitate data sharing and analysis, and to accelerate coronavirus research. The portal includes relevant datasets submitted to EMBL-EBI as well as other major centres for biomedical data. The COVID-19 Data Portal is the primary entry point into the functions of a wider project, the European COVID-19 Data Platform. Data sources Data sources Tool info Standards/Databases Training
CRG COVID-19 Viral Beacon A platform allowing for browsing SARS-CoV-2 variability at the genome, amino acid, structural, and motif levels An automated SARS-CoV-...
Cromwell Cromwell is a Workflow Management System geared towards scientific workflows. Data analysis
cwltool Reference implementation to provide comprehensive validation of CWL files as well as provide other tools related to working with CWL. Data analysis
Cytoscape Cytoscape provides a solid platform for network visualization and analysis Data analysis Tool info Training
DAVID The Database for Annotation, Visualization and Integrated Discovery (DAVID) provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes. Data analysis Tool info Training
dbGaP The Database of Genotypes and Phenotypes (dbGaP) archives and distributes the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. Data sources Tool info Standards/Databases Training
dbNSFP A comprehensive database of transcript-specific functional predictions and annotations for human non-synonymous and splice-site SNVs Data analysis Tool info
DCAT An RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. By using DCAT to describe datasets in data catalogs, publishers increase discoverability and enable applications easily to consume metadata from multiple catalogs. Data sources Standards/Databases
DeepVariant DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file. Data analysis Tool info
Delly Delly is an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read and long-read massively parallel sequencing data. Data analysis Tool info
DESeq2 Differential gene expression analysis based on the negative binomial distribution Data analysis Tool info Training
DMP Online DMP online is an online planning tool to help you write an effective DMP based on an institutional or funder template. Training
Docker Docker is software for the execution of applications in virtualized environments called containers. It is linked to DockerHub, a library for sharing container images. Data analysis Standards/Databases Standards/Databases Training
Dragen-GATK DRAGEN-GATK Best Practices contains open-source workflows that are compatible between Illumina's platforms and mainstream infrastructure. Data analysis
Dryad Dryad is an open-source, community-led data curation, publishing, and preservation platform for CC0 publicly available research data. Dryad has a long-term data preservation strategy, and is a Core Trust Seal Certified Merritt repository with storage in US and EU at the San Diego Supercomputing Center, DANS, and Zenodo. While data is undergoing peer review, it is embargoed if the related journal requires / allows this. Dryad is an independent non-profit that works directly with: researchers to publish datasets utilising best practices for discovery and reuse; publishers to support the integration of data availability statements and data citations into their workflows; and institutions to enable scalable campus support for research data management best practices at low cost. Costs are covered by institutional, publisher, and funder members, otherwise a one-time fee of $120 for authors to cover cost of curation and preservation. Dryad also receives direct funder support through grants. Data sources Standards/Databases
Dutch COVID-19 Data Portal The Dutch COVID-19 Data Portal provides researchers with a clear overview of what is available, allows searching for specific data and makes access to such data easier when the necessary ethical and legal conditions have been met. Data sources
EBI The European Bioinformatics Institute is a bioinformatics research center that is part of the European Molecular Biology Laboratory and is located in Hinxton, England. The institution combines intense research activity with the development and maintenance of a set of bioinformatics lines, services and databases. Data sources Training
EdgeR Empirical Analysis of Digital Gene Expression Data in R Data analysis Tool info Training
ENA upload tool Submits experimental data and respective metadata to the European Nucleotide Archive (ENA). SARS-CoV-2 sequencing ...
Enrichr Functional Enrichment Analysis and Network Construction Tool info
Estonian Biobank The Estonian Biobank has established a population-based biobank of Estonia with a current cohort size of more than 200,000 individuals (genotyped with genome-wide arrays), reflecting the age, sex and geographical distribution of the adult Estonian population. Considering the fact that about 20% of Estonia's adult population has joined the programme, it is indeed a database that is very important for the development of medical science both domestically and internationally. Data sources
Estonian COVID-19 Data Portal Estonian instance of the COVID-19 Data Portal. Among other information, served Estonian SARS-CoV-2 sequencing dashboards. SARS-CoV-2 sequencing ...
EUI COVID-19 SSH Data Portal The COVID-19 SSH Data Portal provides integrated search, discovery, and linking to datasets published on the web relevant for COVID-19-related research in the Social Sciences and Humanities. Data sources Standards/Databases
European Genome-phenome Archive (EGA) The European Genome-phenome Archive (EGA) is a service for permanent archiving and sharing of personally identifiable genetic, phenotypic, and clinical data generated for the purposes of biomedical research projects or in the context of research-focused healthcare systems. Access to data must be approved by the specified Data Access Committee (DAC). Data sources Tool info Standards/Databases Training
European Language Social Science Thesaurus (ELSST) The European Language Social Science Thesaurus (ELSST) is a broad-based, multilingual thesaurus for the social sciences. It is owned and published by the Consortium of European Social Science Data Archives (CESSDA) and its national Service Providers. The thesaurus consists of over 3,300 concepts and covers the core social science disciplines: politics, sociology, economics, education, law, crime, demography, health, employment, information, communication technology, and environmental science. ELSST is used for data discovery within CESSDA and facilitates access to data resources across Europe, independent of domain, resource, language, or vocabulary. ELSST is currently available in 16 languages: Danish, Dutch, Czech, English, Finnish, French, German, Greek, Hungarian, Icelandic, Lithuanian, Norwegian, Romanian, Slovenian, Spanish, and Swedish Standards/Databases
European Nucleotide Archive (ENA) Provides a record of the nucleotide sequencing information. It includes raw sequencing data, sequence assembly information and functional annotation. Data sources Data description An automated SARS-CoV-... SARS-CoV-2 sequencing ... Tool info Standards/Databases Training
FAIRsharing FAIRsharing is a FAIR-supporting resource that provides an informative and educational registry on data standards, databases, repositories and policy, alongside search and visualization tools and services that interoperate with other FAIR-enabling resources. FAIRsharing guides consumers to discover, select and use standards, databases, repositories and policy with confidence, and producers to make their resources more discoverable, more widely adopted and cited. Each record in FAIRsharing is curated in collaboration with the maintainers of the resource themselves, ensuring that the metadata in the FAIRsharing registry is accurate and timely. Every record is manually reviewed at least once a year. Records can be collated into collections, based on a project, society or organisation, or Recommendations, where they are collated around a policy, such as a journal or funder data policy. Data sources Data description Standards/Databases Training
FASTQC A quality control tool for high throughput sequence data. Quality control
Federated EGA The Federated EGA is an infrastructure built upon the European Genome-phenome Archive (EGA), an EMBL-EBI and CRG data resource for secure archiving and sharing of human sensitive biomolecular and phenotypic data resulting from biomedical research projects. Data sources Training
Figshare Figshare is a generalist, subject-agnostic repository for many different types of digital objects that can be used without cost to researchers. Data can be submitted to the central figshare repository (described here), or institutional repositories using the figshare software can be installed locally, e.g. by universities and publishers. Metadata in figshare is licenced under CC0. figshare has also partnered with DuraSpace and Chronopolis to offer further assurances that public data will be archived under the stewardship of Chronopolis. figshare is supported through Institutional, Funder, and Governmental service subscriptions. Data sources Standards/Databases Training
Flye Flye is a de novo assembler for single-molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. Data analysis Tool info Training
FreeBayes FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment. Data analysis Tool info Training
GA4GH The metadata model for GA4GH, an international coalition of both public and private interested parties, formed to enable the sharing of genomic and clinical data. Tool info Standards/Databases Training
Galaxy Open, web-based platform for data intensive biomedical research. Whether on the free public server or your own instance, you can perform, reproduce, and share complete analyses. Data analysis Provenance Tool info Training
Galaxy Europe The European Galaxy server. Provides access to thousands of tools for scalable and reproducible analysis. An automated SARS-CoV-...
Galaxy University of Tartu The University of Tartu Galaxy instance. Enables local university users to run their analyses in the Galaxy environment. Was heavily used during the KoroGenoEST sequencing studies. SARS-CoV-2 sequencing ...
GenBank GenBank is the NIH genetic sequence database of annotated collections of all publicly available DNA sequences. Data sources Tool info Standards/Databases Training
GeneMANIA GeneMANIA helps you predict the function of your favourite genes and gene sets. Data analysis Tool info Training
Genomic Data Infrastructure (GDI) The Genomic Data Infrastructure (GDI) project is enabling access to genomic and related phenotypic and clinical data across Europe. It is doing this by establishing a federated, sustainable and secure infrastructure to access the data. It builds on the outputs of the Beyond 1 Million Genomes (B1MG) project and is realising the ambition of the 1+Million Genomes (1+MG) initiative.
GEO The Gene Expression Omnibus (GEO) is a public repository that archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomic data submitted by the scientific community. It accepts next-generation sequence data that examine quantitative gene expression, gene regulation, epigenomics or other aspects of functional genomics using methods such as RNA-seq, miRNA-seq, ChIP-seq, RIP-seq, HiC-seq, methyl-seq, etc. GEO will process all components of your study, including the samples, project description and processed data files, and will submit the raw data files to the Sequence Read Archive (SRA) on the researcher's behalf. In addition to data storage, a collection of web-based interfaces and applications is available to help users query and download the studies and gene expression patterns stored in GEO. Data sources Standards/Databases Training
ggplot2 ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. Data analysis Tool info Training
GitHub GitHub is a versioning system, used for sharing code, as well as for sharing of small data. Data analysis Standards/Databases Standards/Databases Training
GitLab GitLab is an open source end-to-end software development platform with built-in version control, issue tracking, code review, CI/CD, and more. Self-host GitLab on your own servers, in a container, or on a cloud provider. Data analysis Standards/Databases Training
Global Initiative on Sharing All Influenza Data (GISAID) A web-based platform for sharing viral sequence data, initially for influenza data, and now for other pathogens (including SARS-CoV-2). Data description Standards/Databases
GO The Gene Ontology (GO) is used to perform enrichment analysis on gene sets. Data analysis Tool info Training
GRIDSS GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements. Data analysis Tool info
GSEA Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states Data analysis Tool info Training
GTEx The Genotype-Tissue Expression (GTEx) project is an ongoing effort to build a comprehensive public resource to study tissue-specific gene expression and regulation. Samples were collected from 53 non-diseased tissue sites across nearly 1000 individuals, primarily for molecular assays including WGS, WES, and RNA-Seq. Remaining samples are available from the GTEx Biobank. The GTEx Portal provides open access to data including gene expression, QTLs, and histology images. Data sources Tool info Standards/Databases Training
HISAT2 HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (whole-genome, transcriptome, and exome sequencing data) to a population of human genomes (as well as to a single reference genome). Data analysis Tool info Training
IGV The Integrative Genomics Viewer (IGV) is a high-performance, easy-to-use, interactive tool for the visual exploration of genomic data. Data analysis Tool info Training
IntAct IntAct (Molecular Interaction Database) Website Data analysis Tool info Standards/Databases Training
KEGG A set of annotation maps for Kyoto encyclopedia of genes and genomes (KEGG) Data analysis Tool info Training
LimeSurvey LimeSurvey is a free and open source advanced online survey system to create online surveys.
Lumpy A probabilistic framework for structural variant discovery. Data analysis Tool info
MACS Model-based Analysis of ChIP-Seq (MACS), for identifying transcription factor binding sites. Data analysis Tool info Training
MAFFT MAFFT is a multiple sequence alignment program Data analysis Tool info
Manta Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. Data analysis Tool info
matplotlib Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Data analysis Tool info
MetaboAnalyst MetaboAnalyst is a comprehensive platform dedicated to metabolomics data analysis via a user-friendly, web-based interface. Data analysis Tool info Training
Metagen-FastQC Cleans metagenomic reads to remove adapters, low-quality bases and host (e.g. human) contamination. SARS-CoV-2 sequencing ...
MethylKit methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. Data analysis Tool info
methylPipe Base resolution DNA methylation data analysis Data analysis Tool info
MetSign A computational platform for high-resolution mass spectrometry-based metabolomics Data analysis
MIABIS MIABIS represents the minimum information required to initiate collaborations between biobanks and to enable the exchange of biological samples and data. The aim is to facilitate the reuse of bio-resources and associated data by harmonizing biobanking and biomedical research. Data sources Standards/Databases
MultiQC MultiQC searches a given directory for analysis logs and compiles an HTML report. Quality control
MUSCLE MUSCLE is widely-used software for making multiple alignments of biological sequences. Data analysis Tool info Training
Mzmine MZmine 3 is an open-source software for mass-spectrometry data processing, with the main focus on LC-MS data. Data analysis Tool info
NCBI The National Center for Biotechnology Information advances science and health by providing access to biomedical and genomic information. Training
Nextflow Nextflow is a framework for data analysis workflow execution Data analysis Tool info Training
Nextstrain Auspice Estonian local instance of the Nextstrain Auspice application that serves SARS-CoV-2 phylogenetic data. SARS-CoV-2 sequencing ...
ODISSEI Secure ANalysis Environment (SANE) SANE is a virtual container in which the researcher can analyse sensitive data, while the data owner retains full control. Data sources
Omicsgenerator Omics Integrator is a package designed to integrate proteomic data, gene expression data and/or epigenetic data using a protein-protein interaction network. Data analysis
OpenMS OpenMS is an open-source software C++ library for LC-MS data management and analyses. Data analysis Tool info Training
Panther The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence. Data sources Tool info Standards/Databases Training
PhyML PhyML is a software package that uses modern statistical approaches to analyse alignments of nucleotide or amino acid sequences in a phylogenetic framework. Data analysis Tool info
R Markdown R Markdown can help to turn your analyses into high quality documents, reports, presentations and dashboards. Training
R Shiny Shiny is an R package that makes it easy to build interactive web apps straight from R. Training
Research Object Crate (RO-Crate) RO-Crate is a lightweight approach to packaging research data with their metadata. An RO-Crate is a structured archive of all the items that contributed to the research outcome, including their identifiers, provenance, relations and annotations. Provenance Standards/Databases
SAMtools Samtools is a suite of programs for interacting with high-throughput sequencing data. Quality control
Sapporo WES Implementation of Workflow Execution Service (WES) or so-called Workflow-as-a-Service. Provenance Tool info
SARS-CoV-2 Data Hubs Using technology that builds upon existing EMBL-EBI infrastructure, we provide SARS-CoV-2 Data Hubs to those public health agencies and other scientific groups responsible for generating viral sequence data from the outbreak at national or regional levels. Data sources
Schema.org Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond. Provenance Standards/Databases Training
SICER2 Redesigned and improved ChIP-seq broad peak calling tool SICER Data analysis
Singularity Singularity is a widely-adopted container runtime that implements a unique security model to mitigate privilege escalation risks and provides a platform to capture a complete application environment into a single file (SIF) Data analysis Training
Snakemake Snakemake is a framework for data analysis workflow execution Data analysis Provenance Tool info Training
SnpEff Genetic variant annotation and functional effect prediction toolbox. It annotates and predicts the effects of genetic variants on genes and proteins. Data analysis Tool info Training
SPAdes SPAdes is an assembly toolkit containing various assembly pipelines. Data analysis Tool info Training
SRA The SRA is NIH's primary archive of high-throughput sequencing data and is part of the International Nucleotide Sequence Database Collaboration (INSDC) that includes the NCBI Sequence Read Archive (SRA), the European Bioinformatics Institute (EBI), and the DNA Database of Japan (DDBJ). Data submitted to any of the three organizations are shared among them. SRA accepts data from all kinds of sequencing projects including clinically important studies that involve human subjects or their metagenomes, which may contain human sequences. These data often have controlled access via dbGaP (the Database of Genotypes and Phenotypes). Data sources Tool info Standards/Databases Training
STAR Spliced Transcripts Alignment to a Reference Data analysis Tool info Training
StreamFlow Container-native workflow manager for hybrid infrastructures Provenance
TCGA The Cancer Genome Atlas (TCGA) is a comprehensive, collaborative effort led by the National Institutes of Health (NIH) to map the genomic changes associated with specific types of tumors to improve the prevention, diagnosis and treatment of cancer. Its mission is to accelerate the understanding of the molecular basis of cancer through the application of genome analysis and characterization technologies. Data sources Standards/Databases Training
The Data Use Ontology (DUO) The Data Use Ontology (DUO) describes data use requirements and limitations. DUO allows datasets to be semantically tagged with restrictions on their usage, making them automatically discoverable based on the authorization level of users or the intended usage. This resource is based on the OBO Foundry principles, and developed using the W3C Web Ontology Language. It is used in production by the European Genome-phenome Archive (EGA) at EMBL-EBI and CRG as well as the Broad Institute for the Data Use Oversight System (DUOS). Data sources Standards/Databases
toil-cwl-runner The toil-cwl-runner command provides cwl-parsing functionality using cwltool, and leverages the job-scheduling and batch system support of Toil. Data analysis
UCSC Genome Browser An online tool for analyzing and visualizing genomic data. It allows users to add and share annotations. Data analysis An automated SARS-CoV-... Tool info Standards/Databases
VarScan Variant calling and somatic mutation/CNV detection for next-generation sequencing data Data analysis Tool info
VCFtools VCFtools is a program package designed for working with VCF files. Quality control
VEP VEP (Variant Effect Predictor) predicts the functional effects of genomic variants. Data analysis Tool info Training
WfExS Workflow Execution Service Backend (WfExS-backend) is a high-level orchestrator to run scientific workflows reproducibly. Provenance
WorkflowHub A registry for describing, sharing and publishing scientific computational workflows. An automated SARS-CoV-... Tool info Standards/Databases Training
wtdbg2 Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT). Data analysis Tool info
XCMS Metabolomic and lipidomic platform Data analysis Tool info Training
Zenodo Zenodo is a generalist research data repository built and developed by OpenAIRE and CERN. It was developed to aid Open Science and is built on open source code. Zenodo helps researchers receive credit by making the research results citable and through OpenAIRE integrates them into existing reporting lines to funding agencies like the European Commission. Citation information is also passed to DataCite and onto the scholarly aggregators. Content is available publicly under any one of 400 open licences. Restricted and Closed content is also supported. Free for researchers below 50 GB/dataset. Content is both online on disk and offline on tape as part of a long-term preservation policy. Zenodo supports managed access (with an access request workflow) as well as embargoing generally and during peer review. The base infrastructure of Zenodo is provided by CERN, a non-profit IGO. Projects are funded through grants. Data sources Standards/Databases Training
COVID Epistat

Dashboard for COVID-19 epidemiological data (vaccination, laboratory testing, wastewater, variants, hospitalised patients, mortality, nursing home patients, mental health indicators, seroprevalence).

Tool that provides templates for data management plans.

DMP Online
Epidemiology of Infectious Diseases (Epistat)

A web based application for visualising and exploring data on infectious diseases monitored by Sciensano.

Figures on notifiable infectious diseases

Dashboard for notifiable infectious diseases in Flanders.

Galaxy Belgium

Galaxy Belgium is a Galaxy instance managed by the Belgian ELIXIR node, funded by the Flemish government, which utilises infrastructure provided by the Flemish Supercomputer Center (VSC).

R Markdown

Tool at Sciensano for generating weekly and daily reports.

Sciensano LimeSurvey

Tool for online surveys.

Sciensano R Shiny Apps

Shiny is an R package that makes it easy to build interactive web apps straight from R. At Sciensano, several Shiny Apps have been developed to process, analyse and visualise data during the COVID-19 crisis. These include the Surge App (monitoring and data quality of COVID-19 hospitalisations), Indicator App (COVID-19 indicators based on test positivity rates per province), Coverage App (coverage of clinical database on hospitalised patients), Hospital Indicators (forecasting and profile of hospitalised patients), and Quality of reporting (quality indicators of reporting by individual hospitals).

R Shiny
Clinical-Epidemiological (CE) dataset from an Erasmus MC COVID-19 cohort

Clinical-Epidemiological (CE) data from the Erasmus MC cohort includes 151 PCR-confirmed COVID-19 individuals who were admitted to the hospital with a respiratory infection or respiratory failure in 2020-2021.


Over a period of approximately 2 years (from May 2020 until February 2022), a total of 280,000 Dutch inhabitants filled in a short questionnaire about Corona-related symptoms, behaviour, vaccination status and test results.

COVID-NL clinical data dashboard

The Dutch national COVID-19 clinical data dashboard allows exploration and reuse of clinical data from Dutch university medical centers (UMCs). The dashboard provides researchers with a clear overview of what is available, allows searching for specific data and makes access to such data easier when the necessary ethical and legal conditions have been met. The policy for access to and sharing of clinical COVID-19 data is described in the HRI COVID policy document.

COVID-NL metadata portal

The Dutch national COVID-19 metadata portal describes the content of the collections and the type of data. The underlying data remains at the source, but where possible a link to the data or the data request procedure is provided on the portal. The first health care data sets in the portal come from observational studies funded by ZonMw, NFU COVID-19 clinical research data, collaborating top clinical hospitals (STZ), as well as other regional hospitals. However, the portal is open to any health care provider wishing to make their COVID-19 data available for research.


The Data Archiving and Networked Services (DANS) is the Dutch national centre of expertise and repository for research data.

Erasmus MC COVID-19 cohort-associated connected datasets study

As part of the multidisciplinary ReCoDID consortium, the study aimed at connecting clinical-epidemiological (CE) data with further datasets from research on many other aspects of SARS-CoV-2.


The Open Data Infrastructure for Social Science and Economic Innovations (ODISSEI) is the national research infrastructure for the social sciences in the Netherlands.

FEGA Norway

Federated European Genome-phenome Archive (EGA) node

European Genome-phenome Archive (EGA)
Folkehelseinstituttet (FHI)

Norwegian Institute of Public Health (NIPH) portal for infectious disease information

SARS-CoV-2 Database

Norwegian SARS-CoV-2 database

Swedish Pathogens Portal

The Swedish Pathogens Portal is a hub for data, tools, services, and other resources centred around pathogens, such as SARS-CoV-2, and pandemic preparedness in Sweden.