Tools and resources list

Tool or resource Description Related pages Registry
1+ Million Genomes (1+MG) The 1+ Million Genomes (1+MG) initiative aims to enable secure access to genomics and the corresponding clinical data across Europe for better research, personalised healthcare and health policy making. Since Digital Day 2018, 25 EU countries, the UK and Norway have signed the Member States' declaration on stepping up efforts towards creating a European data infrastructure for genomic data and implementing common national rules enabling federated data access. The initiative forms part of the EU's agenda for the Digital Transformation of Health and Care and is aligned with the goals of the European Health Data Space. Training
ACE Cohort Asymptomatic COVID-19 in Education (ACE) Cohort Human biomolecular data Training
ADA-M Responsible sharing of biomedical data and biospecimens via the Automatable Discovery and Access Matrix (ADA-M). The Automatable Discovery and Access Matrix (ADA-M) provides a standardized way to unambiguously represent the conditions related to data discovery and access. By adopting ADA-M, data custodians can generally describe what their data are (the Header section), who can access them (the Permissions section), terms related to their use (the Terms section), and special conditions (the Meta-Conditions). By doing so, data custodians can participate in data sharing and collaboration by making meta information about their data computer-readable and hence directly available for digital communication, searching and automation activities. Human biomolecular data Tool info
ANNOVAR ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes. Human biomolecular data Tool info
ArrayExpress ArrayExpress is a database of functional genomics experiments that can be queried and the data downloaded. It includes gene expression data from microarray and high throughput sequencing studies. Data is collected to MIAME and MINSEQE standards. Experiments are submitted directly to ArrayExpress or are imported from the NCBI GEO database. Human biomolecular data Linked pathogen and ho... Tool info Standards/Databases Training
Arvados With Arvados, bioinformaticians run and scale compute-intensive workflows, developers create biomedical applications, and IT administrators manage large compute and storage resources. Human biomolecular data
Bcftools Bcftools is a set of tools for working with variant calls in the VCF format. Human biomolecular data
Beacon v2 Beacon v2 is a protocol/specification established by the Global Alliance for Genomics and Health initiative (GA4GH) that defines an open standard for federated discovery of genomic data and associated information in biomedical research and clinical applications. Human biomolecular data Human clinical and hea... Tool info Standards/Databases Training
Beyond 1 Million Genomes (B1MG) The Beyond 1 Million Genomes (B1MG) project is helping to create a network of genetic and clinical data across Europe. The project provides coordination and support to the 1+ Million Genomes Initiative (1+MG). This initiative is a commitment of 24 EU countries, the UK and Norway to give cross-border access to one million sequenced genomes by 2022.
BioGRID BioGRID is a comprehensive biomedical repository for curated protein, genetic and chemical interactions Human biomolecular data Tool info Standards/Databases
BioSamples BioSamples stores and supplies descriptions and metadata about biological samples used in research and development by academia and industry. Samples are either 'reference' samples (e.g. from 1000 Genomes, HipSci, FAANG) or have been used in an assay database such as the European Nucleotide Archive (ENA) or ArrayExpress. It provides links to assays and specific samples, and accepts direct submissions of sample information. Human biomolecular data Linked pathogen and ho... Tool info Standards/Databases Training
BioStudies The BioStudies database holds descriptions of biological studies, links to data from these studies in other databases at EMBL-EBI or outside, as well as data that do not fit in the structured archives at EMBL-EBI. The database can accept a wide range of types of studies described via a simple format. It also enables manuscript authors to submit supplementary information and link to it from the publication. Linked pathogen and ho... Tool info Standards/Databases Training
Bismark Bismark is a program to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step. Human biomolecular data Tool info Training
Bitbucket Git based code hosting and collaboration tool, built for teams. Human biomolecular data Standards/Databases
Bowtie2 Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. Human biomolecular data Tool info Training
BWA BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome. Human biomolecular data Tool info Training
Canu Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing. Human biomolecular data Tool info
CESSDA Data Catalogue (CDC) CDC is a one-stop shop for searching and finding European social science data. Socioeconomic data Tool info Standards/Databases
CESSDA Vocabulary Service CESSDA Vocabulary Service enables users to discover, browse, and download controlled vocabularies in a variety of languages. The service is provided by the Consortium of European Social Science Data Archives (CESSDA). The majority of the source (English) vocabularies included in the service have been created by the DDI Alliance. The Data Documentation Initiative (DDI) is an international standard for describing data produced by surveys and other observational methods in the social, behavioural, economic, and health sciences. Standards/Databases
ClustalW ClustalW is a progressive multiple sequence alignment tool to align a set of sequences by repeatedly aligning pairs of sequences and previously generated alignments. Human biomolecular data Tool info Training
Common Workflow Language (CWL) An open standard for describing workflows that are built from command-line tools. Human clinical and hea... Standards/Databases Training
COMPSs COMP Superscalar (COMPSs) is a task-based programming model which aims to ease the development of applications for distributed infrastructures, such as large High-Performance clusters (HPC), clouds and container managed clusters. Human clinical and hea... Tool info
Covid-19 data portal The COVID-19 Data Portal enables researchers to upload, access and analyse COVID-19 related reference data and specialist datasets. The aim of the COVID-19 Data Portal is to facilitate data sharing and analysis, and to accelerate coronavirus research. The portal includes relevant datasets submitted to EMBL-EBI as well as other major centres for biomedical data. The COVID-19 Data Portal is the primary entry point into the functions of a wider project, the European COVID-19 Data Platform. Human biomolecular data Human clinical and hea... Socioeconomic data The Swedish Pathogens ... Tool info Standards/Databases Training
CRG COVID-19 Viral Beacon A platform allowing for browsing SARS-CoV-2 variability at the genome, amino acid, structural, and motif levels An automated SARS-CoV-...
Cromwell Cromwell is a Workflow Management System geared towards scientific workflows. Human biomolecular data
cwltool Reference implementation to provide comprehensive validation of CWL files as well as provide other tools related to working with CWL. Human biomolecular data
Cytoscape Cytoscape provides a solid platform for network visualization and analysis Human biomolecular data Tool info Training
DAGitty DAGitty is a browser-based environment for creating, editing, and analyzing causal diagrams (also known as directed acyclic graphs or causal Bayesian networks). Prototyping federated ... Tool info Standards/Databases
Danish Research Health Data Gateway Tool that provides the user with an overview of available health data in Denmark and the entire application process, from initial idea to final application. Human clinical and hea...
DAVID The Database for Annotation, Visualization and Integrated Discovery (DAVID) provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes. Human biomolecular data Tool info Training
dbGaP The Database of Genotypes and Phenotypes (dbGaP) archives and distributes the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. Human biomolecular data Tool info Standards/Databases Training
dbNSFP A comprehensive database of transcript-specific functional predictions and annotations for human non-synonymous and splice-site SNVs Human biomolecular data Tool info
DCAT An RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. By using DCAT to describe datasets in data catalogs, publishers increase discoverability and enable applications easily to consume metadata from multiple catalogs. Human biomolecular data Standards/Databases
DeepVariant DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file. Human biomolecular data Tool info
Delly Delly is an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read and long-read massively parallel sequencing data. Human biomolecular data Tool info
DESeq2 Differential gene expression analysis based on the negative binomial distribution Human biomolecular data Tool info Training
DMP Online DMP online is an online planning tool to help you write an effective DMP based on an institutional or funder template. Training
Docker Docker is software for running applications in virtualized environments called containers. It is linked to Docker Hub, a library for sharing container images. Human biomolecular data Standards/Databases Standards/Databases Training
Dragen-GATK DRAGEN-GATK Best Practices contains open-source workflows that are compatible between Illumina's platforms and mainstream infrastructure. Human biomolecular data
Dryad Dryad is an open-source, community-led data curation, publishing, and preservation platform for CC0 publicly available research data. Dryad has a long-term data preservation strategy, and is a Core Trust Seal Certified Merritt repository with storage in US and EU at the San Diego Supercomputing Center, DANS, and Zenodo. While data is undergoing peer review, it is embargoed if the related journal requires / allows this. Dryad is an independent non-profit that works directly with: researchers to publish datasets utilising best practices for discovery and reuse; publishers to support the integration of data availability statements and data citations into their workflows; and institutions to enable scalable campus support for research data management best practices at low cost. Costs are covered by institutional, publisher, and funder members, otherwise a one-time fee of $120 for authors to cover cost of curation and preservation. Dryad also receives direct funder support through grants. Human biomolecular data Standards/Databases
Dutch COVID-19 Data Portal The Dutch COVID-19 Data Portal provides researchers with a clear overview of what is available, allows searching for specific data, and makes access to such data easier when the necessary ethical and legal conditions have been met. Human biomolecular data
EBI The European Bioinformatics Institute is a bioinformatics research center that is part of the European Molecular Biology Laboratory and is located in Hinxton, England. The institution combines intense research activity with the development and maintenance of a set of bioinformatics lines, services and databases. Human biomolecular data Training
EdgeR Empirical Analysis of Digital Gene Expression Data in R Human biomolecular data Tool info Training
ENA upload tool Submits experimental data and respective metadata to the European Nucleotide Archive (ENA). SARS-CoV-2 sequencing ...
Enrichr Functional Enrichment Analysis and Network Construction Tool info
Estonian Biobank The Estonian Biobank has established a population-based biobank of Estonia with a current cohort size of more than 200,000 individuals (genotyped with genome-wide arrays), reflecting the age, sex and geographical distribution of the adult Estonian population. Considering the fact that about 20% of Estonia's adult population has joined the programme, it is indeed a database that is very important for the development of medical science both domestically and internationally. Human biomolecular data
Estonian COVID-19 Data Portal Estonian instance of the COVID-19 Data Portal. Among other information, served Estonian SARS-CoV-2 sequencing dashboards. SARS-CoV-2 sequencing ... The Swedish Pathogens ...
EUI COVID-19 SSH Data Portal The COVID-19 SSH Data Portal provides integrated search, discovery, and linking to datasets published on the web relevant for COVID-19-related research in the Social Sciences and Humanities. Socioeconomic data Standards/Databases
European Centre for Disease Prevention and Control (ECDC) It is an EU agency aimed at strengthening Europe's defences against infectious diseases. Their mission is to identify, assess and communicate current and emerging threats to human health posed by infectious diseases. Human clinical and hea...
European Clinical Research Infrastructure Network (ECRIN) tools ECRIN develops, contributes to, and maintains freely accessible tools that facilitate the identification of clinical trial objects, data sharing, access to regulatory and methodological designs and much more to support researchers looking to conduct multinational clinical research. Human clinical and hea...
European Genome-phenome Archive (EGA) The European Genome-phenome Archive (EGA) is a service for permanent archiving and sharing of personally identifiable genetic, phenotypic, and clinical data generated for the purposes of biomedical research projects or in the context of research-focused healthcare systems. Access to data must be approved by the specified Data Access Committee (DAC). Human biomolecular data Human clinical and hea... Linked pathogen and ho... Tool info Standards/Databases Training
European Health Data Space (EHDS) The European Health Data Space is a health specific ecosystem comprised of rules, common standards and practices, infrastructures and a governance framework that aims at empowering individuals through increased digital access to and control of their electronic personal health data, at national level and EU-wide. Human clinical and hea...
European Health Information Portal The Health Information Portal provides access to population health and healthcare data across Europe. Human clinical and hea... Standards/Databases
European Language Social Science Thesaurus (ELSST) The European Language Social Science Thesaurus (ELSST) is a broad-based, multilingual thesaurus for the social sciences. It is owned and published by the Consortium of European Social Science Data Archives (CESSDA) and its national Service Providers. The thesaurus consists of over 3,300 concepts and covers the core social science disciplines: politics, sociology, economics, education, law, crime, demography, health, employment, information, communication technology, and environmental science. ELSST is used for data discovery within CESSDA and facilitates access to data resources across Europe, independent of domain, resource, language, or vocabulary. ELSST is currently available in 16 languages: Danish, Dutch, Czech, English, Finnish, French, German, Greek, Hungarian, Icelandic, Lithuanian, Norwegian, Romanian, Slovenian, Spanish, and Swedish Standards/Databases
European Medicines Agency (EMA) The European Medicines Agency (EMA) is a decentralised agency of the European Union (EU). It is responsible for the scientific evaluation, supervision and safety monitoring of medicines. Human clinical and hea...
European Nucleotide Archive (ENA) Provides a record of the nucleotide sequencing information. It includes raw sequencing data, sequence assembly information and functional annotation. Pathogen characterisation Human biomolecular data An automated SARS-CoV-... SARS-CoV-2 sequencing ... Linked pathogen and ho... Tool info Standards/Databases Training
FAIRsharing FAIRsharing is a FAIR-supporting resource that provides an informative and educational registry on data standards, databases, repositories and policy, alongside search and visualization tools and services that interoperate with other FAIR-enabling resources. FAIRsharing guides consumers to discover, select and use standards, databases, repositories and policy with confidence, and producers to make their resources more discoverable, more widely adopted and cited. Each record in FAIRsharing is curated in collaboration with the maintainers of the resource themselves, ensuring that the metadata in the FAIRsharing registry is accurate and timely. Every record is manually reviewed at least once a year. Records can be collated into Collections, based on a project, society or organisation, or Recommendations, where they are collated around a policy, such as a journal or funder data policy. Pathogen characterisation Human biomolecular data Standards/Databases Training
FASTQC A quality control tool for high throughput sequence data. Human biomolecular data Pathogen characterisation Tool info Training
fastQC Screen FastQScreen is a quality control tool used to detect contamination in sequencing data (FASTQ files). Human biomolecular data
Federated EGA The Federated EGA is an infrastructure built upon the European Genome-phenome Archive (EGA), an EMBL-EBI and CRG data resource for secure archiving and sharing of human sensitive biomolecular and phenotypic data resulting from biomedical research projects. Human biomolecular data Human clinical and hea... Training
Figshare Figshare is a generalist, subject-agnostic repository for many different types of digital objects that can be used without cost to researchers. Data can be submitted to the central figshare repository (described here), or institutional repositories using the figshare software can be installed locally, e.g. by universities and publishers. Metadata in figshare is licensed under CC0. figshare has also partnered with DuraSpace and Chronopolis to offer further assurances that public data will be archived under the stewardship of Chronopolis. figshare is supported through Institutional, Funder, and Governmental service subscriptions. Human biomolecular data Standards/Databases Training
Findata Findata is the data permit authority for the social and health care sector in Finland. Human clinical and hea...
Flye Flye is a de novo assembler for single-molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. Human biomolecular data Tool info Training
FreeBayes FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment. Human biomolecular data Tool info Training
French Health Data Hub French Health Data Hub that guarantees easy and unified, transparent and secure access to health data to improve the quality of care and patient support. Human clinical and hea...
GA4GH The metadata model for GA4GH, an international coalition of both public and private interested parties, formed to enable the sharing of genomic and clinical data. Tool info Standards/Databases Training
Galaxy Open, web-based platform for data intensive biomedical research. Whether on the free public server or your own instance, you can perform, reproduce, and share complete analyses. Human biomolecular data Human clinical and hea... Tool info Training
Galaxy Europe The European Galaxy server. Provides access to thousands of tools for scalable and reproducible analysis. An automated SARS-CoV-...
Galaxy University of Tartu The University of Tartu Galaxy instance. Enables local university users to run their analyses in the Galaxy environment. Was heavily used during the KoroGenoEST sequencing studies. SARS-CoV-2 sequencing ...
GenBank GenBank is the NIH genetic sequence database of annotated collections of all publicly available DNA sequences. Human biomolecular data Tool info Standards/Databases Training
GeneMANIA GeneMANIA helps you predict the function of your favourite genes and gene sets. Human biomolecular data Tool info Training
Genome Analysis Toolkit (GATK) GATK is a widely used tool for variant calling and genotyping from NGS data. Human biomolecular data
Genomic Data Infrastructure (GDI) The Genomic Data Infrastructure (GDI) project is enabling access to genomic and related phenotypic and clinical data across Europe. It is doing this by establishing a federated, sustainable and secure infrastructure to access the data. It builds on the outputs of the Beyond 1 Million Genomes (B1MG) project and is realising the ambition of the 1+Million Genomes (1+MG) initiative.
GEO The Gene Expression Omnibus (GEO) is a public repository that archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomic data submitted by the scientific community. It accepts next-generation sequence data that examine quantitative gene expression, gene regulation, epigenomics or other aspects of functional genomics using methods such as RNA-seq, miRNA-seq, ChIP-seq, RIP-seq, HiC-seq, methyl-seq, etc. GEO will process all components of your study, including the samples, project description, and processed data files, and will submit the raw data files to the Sequence Read Archive (SRA) on the researcher's behalf. In addition to data storage, a collection of web-based interfaces and applications is available to help users query and download the studies and gene expression patterns stored in GEO. Human biomolecular data Standards/Databases Training
ggplot2 ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. Human biomolecular data Tool info Training
GitHub GitHub is a versioning system, used for sharing code, as well as for sharing of small data. Human biomolecular data Standards/Databases Standards/Databases Training
GitLab GitLab is an open source end-to-end software development platform with built-in version control, issue tracking, code review, CI/CD, and more. Self-host GitLab on your own servers, in a container, or on a cloud provider. Human biomolecular data Standards/Databases Training
Global Initiative on Sharing All Influenza Data (GISAID) A web-based platform for sharing viral sequence data, initially for influenza data, and now for other pathogens (including SARS-CoV-2). Pathogen characterisation Standards/Databases
GO GO is used to perform enrichment analysis on gene sets. Human biomolecular data Tool info Training
GRAF pop GRAF pop is a software tool that infers the subject ancestry. Human biomolecular data
GRAF sex Tool that determines subject sexes using the genotypes. Human biomolecular data
GRIDSS GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements. Human biomolecular data Tool info
GSEA Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states Human biomolecular data Tool info Training
GTEx The Genotype-Tissue Expression (GTEx) project is an ongoing effort to build a comprehensive public resource to study tissue-specific gene expression and regulation. Samples were collected from 53 non-diseased tissue sites across nearly 1000 individuals, primarily for molecular assays including WGS, WES, and RNA-Seq. Remaining samples are available from the GTEx Biobank. The GTEx Portal provides open access to data including gene expression, QTLs, and histology images. Human biomolecular data Tool info Standards/Databases Training
Health Data Research UK HDR UK is a national institute that aims to unite the UK’s health and care data to enable discoveries that improve people’s lives. We do this by uniting, improving and using health and care data as one national institute. Human clinical and hea...
HISAT2 HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (whole-genome, transcriptome, and exome sequencing data) to a population of human genomes (as well as to a single reference genome). Human biomolecular data Tool info Training
IGV The Integrative Genomics Viewer (IGV) is a high-performance, easy-to-use, interactive tool for the visual exploration of genomic data. Human biomolecular data Tool info Training
IntAct IntAct (Molecular Interaction Database) Website Human biomolecular data Tool info Standards/Databases Training
ISARIC COVID-19 Case Report Form The ISARIC-WHO Case Report Forms (CRFs) should be used to collect data on individuals presenting with suspected or confirmed COVID-19, with the aim to standardise clinical data to improve patient care and inform the public health response. Linked pathogen and ho... Standards/Databases
KEGG A set of annotation maps for Kyoto encyclopedia of genes and genomes (KEGG) Human biomolecular data Tool info Training
LimeSurvey LimeSurvey is a free and open source advanced online survey system to create online surveys.
Lumpy A probabilistic framework for structural variant discovery. Human biomolecular data Tool info
MACS Model-based Analysis of ChIP-Seq (MACS), for identifying transcription factor binding sites. Human biomolecular data Tool info Training
MAFFT MAFFT is a multiple sequence alignment program Human biomolecular data Tool info
Manta Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. Human biomolecular data Tool info
matplotlib Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Human biomolecular data Tool info
MetaboAnalyst MetaboAnalyst is a comprehensive platform dedicated for metabolomics data analysis via user-friendly, web-based interface. Human biomolecular data Tool info Training
Metagen-FastQC Cleans metagenomic reads to remove adapters, low-quality bases and host (e.g. human) contamination. SARS-CoV-2 sequencing ...
MethylKit methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. Human biomolecular data Tool info
methylPipe Base resolution DNA methylation data analysis Human biomolecular data Tool info
MetSign A computational platform for high-resolution mass spectrometry-based metabolomics Human biomolecular data
MIABIS MIABIS represents the minimum information required to initiate collaborations between biobanks and to enable the exchange of biological samples and data. The aim is to facilitate the reuse of bio-resources and associated data by harmonizing biobanking and biomedical research. Human biomolecular data Standards/Databases
MultiQC MultiQC searches a given directory for analysis logs and compiles an HTML report. Pathogen characterisation Tool info Training
MUSCLE MUSCLE is widely-used software for making multiple alignments of biological sequences. Human biomolecular data Tool info Training
Mzmine MZmine 3 is an open-source software for mass-spectrometry data processing, with the main focus on LC-MS data. Human biomolecular data Tool info
NCBI The National Center for Biotechnology Information advances science and health by providing access to biomedical and genomic information. Training
Nextflow Nextflow is a framework for data analysis workflow execution Human biomolecular data Tool info Training
Nextstrain Auspice Estonian local instance of the Nextstrain Auspice application that serves SARS-CoV-2 phylogenetic data SARS-CoV-2 sequencing ...
ODISSEI Secure ANalysis Environment (SANE) SANE is a virtual container in which the researcher can analyse sensitive data, while the data owner retains full control. Socioeconomic data
Omicsgenerator Omics Integrator is a package designed to integrate proteomic data, gene expression data and/or epigenetic data using a protein-protein interaction network. Human biomolecular data
OpenMS OpenMS is an open-source software C++ library for LC-MS data management and analyses. Human biomolecular data Tool info Training
Panther The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence. Human biomolecular data Tool info Standards/Databases Training
Pathogens Portal The Pathogens Portal, launched in July 2023, is an invaluable resource for researchers, clinicians, and policymakers who need access to the latest and most comprehensive datasets on pathogens. The portal is a collaborative effort between the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) and partners. Linked pathogen and ho... The Swedish Pathogens ... Standards/Databases
Pathogens Portal Cohort Browser The Pathogens Portal Cohort Browser presents discovery metadata of infectious disease cohort datasets and provides links to the associated datasets within ELIXIR Core Data Resources; search and filtering functionalities enable users to identify cohort studies of interest in a convenient manner. Linked pathogen and ho...
PhyML PhyML is a software package that uses modern statistical approaches to analyse alignments of nucleotide or amino acid sequences in a phylogenetic framework. Human biomolecular data Tool info
Picard Picard is a suite of tools that provides quality control and processing of NGS data, including duplicate read removal, format conversion, and alignment. Human biomolecular data
Population Health Information Research Infrastructure (PHIRI) PHIRI is the roll-out of the research infrastructure on population health information that aims to facilitate and generate the best available evidence for research on health and well-being of populations as impacted by COVID-19. Human clinical and hea...
Qualimap Qualimap is a quality control tool that assesses the quality of the sequencing data at different stages of the analysis pipeline, including read mapping, coverage, and expression analysis. Human biomolecular data
Quarto Quarto is an open-source scientific and technical publishing system that enables the creation of dynamic and reproducible content. Prototyping federated ... Training
R Markdown R Markdown can help to turn your analyses into high quality documents, reports, presentations and dashboards. Training
R Shiny Shiny is an R package that makes it easy to build interactive web apps straight from R. Training
Research Data Centre at the BfArM Research at the BfArM concentrates on important and contemporary research focal points with regard to the marketing authorisation of medicinal products and improving the safety thereof as well as concerning the recording and assessment of risks in connection with medical devices. Human clinical and hea...
Research Object Crate (RO-Crate) RO-Crate is a lightweight approach to packaging research data with their metadata, using schema.org. An RO-Crate is a structured archive of all the items that contributed to the research outcome, including their identifiers, provenance, relations and annotations. Human clinical and hea... Standards/Databases
SAMtools Samtools is a suite of programs for interacting with high-throughput sequencing data. Human biomolecular data Pathogen characterisation Tool info Training
Sapporo WES Implementation of Workflow Execution Service (WES) or so-called Workflow-as-a-Service. Human clinical and hea... Tool info
SARS-CoV-2 Data Hubs Using technology that builds upon existing EMBL-EBI infrastructure, we provide SARS-CoV-2 Data Hubs to those public health agencies and other scientific groups responsible for generating viral sequence data from the outbreak at national or regional levels. Human biomolecular data
Schema.org Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond. Human clinical and hea... Standards/Databases Training
SICER2 SICER2 is a redesigned and improved version of the ChIP-seq broad peak calling tool SICER. Human biomolecular data
Singularity Singularity is a widely-adopted container runtime that implements a unique security model to mitigate privilege escalation risks and provides a platform to capture a complete application environment into a single file (SIF) Human biomolecular data Training
Snakemake Snakemake is a framework for data analysis workflow execution Human biomolecular data Human clinical and hea... Tool info Training
SnpEff Genetic variant annotation and functional effect prediction toolbox. It annotates and predicts the effects of genetic variants on genes and proteins. Human biomolecular data Tool info Training
SPAdes SPAdes is an assembly toolkit containing various assembly pipelines. Human biomolecular data Tool info Training
SRA The SRA is NIH's primary archive of high-throughput sequencing data and is part of the International Nucleotide Sequence Database Collaboration (INSDC), which includes the NCBI Sequence Read Archive (SRA), the European Bioinformatics Institute (EBI), and the DNA Database of Japan (DDBJ). Data submitted to any of the three organizations are shared among them. SRA accepts data from all kinds of sequencing projects, including clinically important studies that involve human subjects or their metagenomes, which may contain human sequences. These data are often under controlled access via dbGaP (the database of Genotypes and Phenotypes). Human biomolecular data Tool info Standards/Databases Training
STAR Spliced Transcripts Alignment to a Reference Human biomolecular data Tool info Training
StreamFlow Container-native workflow manager for hybrid infrastructures Human clinical and hea...
Swedish Pathogens Portal The Swedish Pathogens Portal was previously known as the Swedish COVID-19 Data Portal. It is the Swedish national node of the Pathogens Portal, aimed at facilitating the sharing of data related to pathogens and pandemic preparedness. The Swedish Pathogens ... Standards/Databases
TCGA The Cancer Genome Atlas (TCGA) is a comprehensive, collaborative effort led by the National Institutes of Health (NIH) to map the genomic changes associated with specific types of tumors to improve the prevention, diagnosis and treatment of cancer. Its mission is to accelerate the understanding of the molecular basis of cancer through the application of genome analysis and characterization technologies. Human biomolecular data Standards/Databases Training
The Data Use Ontology (DUO) The Data Use Ontology (DUO) describes data use requirements and limitations. DUO allows datasets to be semantically tagged with restrictions on their usage, making them automatically discoverable based on the authorization level of users or on the intended usage. This resource is based on the OBO Foundry principles and is developed using the W3C Web Ontology Language. It is used in production by the European Genome-phenome Archive (EGA) at EMBL-EBI and CRG, as well as by the Broad Institute for the Data Use Oversight System (DUOS). Human biomolecular data Standards/Databases
The European Health Information Gateway The European Health Information Gateway is a platform that provides access to various health information resources and datasets from across Europe, including data on health systems, health determinants, and health outcomes. Human clinical and hea...
The National health service metadata catalogue Tool to find health data in the metadata catalogue in Portugal. Human clinical and hea...
The Public Service Data Catalogue Tool to discover the data held by the Irish Public Service. Human clinical and hea...
toil-cwl-runner The toil-cwl-runner command provides cwl-parsing functionality using cwltool, and leverages the job-scheduling and batch system support of Toil. Human biomolecular data
Trimmomatic Trimmomatic is a tool used for the removal of adapter sequences, low-quality reads, and sequences with ambiguous bases from NGS data. Human biomolecular data
UCSC Genome Browser An online tool for analyzing and visualizing genomic data. It allows users to add and share annotations. Human biomolecular data An automated SARS-CoV-... Tool info Standards/Databases
VarScan Variant calling and somatic mutation/CNV detection for next-generation sequencing data Human biomolecular data Tool info
VCFtools VCFtools is a program package designed for working with VCF files. Pathogen characterisation Tool info
VEP VEP (Variant Effect Predictor) predicts the functional effects of genomic variants. Human biomolecular data Tool info Training
WfExS Workflow Execution Service Backend (WfExS-backend) is a high-level orchestrator to run scientific workflows reproducibly. Human clinical and hea...
WorkflowHub A registry for describing, sharing and publishing scientific computational workflows. An automated SARS-CoV-... Tool info Standards/Databases Training
wtdbg2 Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT). Human biomolecular data Tool info
XCMS Metabolomic and lipidomic platform Human biomolecular data Tool info Training
Zenodo Zenodo is a generalist research data repository built and developed by OpenAIRE and CERN. It was developed to aid Open Science and is built on open source code. Zenodo helps researchers receive credit by making research results citable and, through OpenAIRE, integrates them into existing reporting lines to funding agencies such as the European Commission. Citation information is also passed to DataCite and on to the scholarly aggregators. Content is available publicly under any one of 400 open licences. Restricted and Closed content is also supported. Free for researchers below 50 GB/dataset. Content is both online on disk and offline on tape as part of a long-term preservation policy. Zenodo supports managed access (with an access request workflow) as well as embargoing, both generally and during peer review. The base infrastructure of Zenodo is provided by CERN, a non-profit IGO. Projects are funded through grants. Human biomolecular data Standards/Databases Training
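
Example: the Data Use Ontology (DUO) entry in the list above describes tagging datasets with usage restrictions so that they can be discovered automatically according to intended usage. The sketch below is one minimal way to picture that idea in Python. It is an illustration only: the `PERMITS` rule table, the intended-use labels, and the dataset names are hypothetical, and real DUO is an OWL ontology with a term hierarchy and modifiers that this flat lookup does not model. The three term IDs should be checked against the published ontology before use.

```python
from dataclasses import dataclass

# DUO permission terms (labels abbreviated; verify the IDs against the ontology).
NO_RESTRICTION = "DUO:0000004"
GENERAL_RESEARCH = "DUO:0000042"
HEALTH_MEDICAL_BIOMEDICAL = "DUO:0000006"

# Hypothetical, simplified rule table: which intended uses each tag admits.
# The use labels ("general", "biomedical", "other") are made up for this sketch.
PERMITS = {
    NO_RESTRICTION: {"general", "biomedical", "other"},
    GENERAL_RESEARCH: {"general", "biomedical"},
    HEALTH_MEDICAL_BIOMEDICAL: {"biomedical"},
}


@dataclass(frozen=True)
class Dataset:
    name: str
    duo_code: str


def discoverable(datasets, intended_use):
    """Return the names of datasets whose DUO tag admits the intended use.

    Datasets carrying a tag that is not in PERMITS are never returned.
    """
    return [
        d.name
        for d in datasets
        if intended_use in PERMITS.get(d.duo_code, set())
    ]


# Hypothetical catalogue used only to exercise the function.
CATALOGUE = [
    Dataset("cohort-a", NO_RESTRICTION),
    Dataset("cohort-b", GENERAL_RESEARCH),
    Dataset("cohort-c", HEALTH_MEDICAL_BIOMEDICAL),
]

if __name__ == "__main__":
    print(discoverable(CATALOGUE, "general"))
```

Unknown tags are treated as not discoverable, which is the cautious default for access-restricted data.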