
Amazon Omics: A Service to Store and Process Biological Data at Scale


In November 2022, at its annual re:Invent conference, Amazon Web Services announced the general availability of a new platform, Amazon Omics. This purpose-built service aims to help clinicians, bioinformaticians, and scientists process huge sets of omics data, run large-scale analyses, and generate actionable insights much faster.

Omics is an umbrella term for fields of biological study whose names end in "-omics": genomics (the study of genes and genomes), proteomics (the study of proteins), metabolomics (the study of small-molecule metabolites), metagenomics (the study of genomes in an environmental sample), phenomics (the study of phenotypes), and transcriptomics (the study of RNA transcripts).

Figure: Omics data types. Source: Altexsoft

From predictive biomarker identification to personalized treatment strategies to accelerated drug development, multi-omics datasets hold great potential for powering discovery across multiple areas. However, extracting insights from the growing volume of multi-omics data, which can be on the order of petabytes, is difficult and requires immense amounts of computing power. Ensuring the privacy and security of this data can also be an uphill battle. That’s where Amazon Omics can help.

Building blocks of the Amazon Omics service

Figure: Amazon Omics structure. Source: AWS

Data storage

The first component of Amazon Omics is cost-effective, omics-optimized storage for keeping and sharing reference and raw sequence data in several formats (FASTQ, BAM, and CRAM). There are two storage classes: active, for readily available data, and archive, for low-cost, long-term storage. With auto-archival, Amazon Omics moves data that has not been accessed for over 30 days to the cheaper archive class, which lowers storage costs and the total cost of ownership (TCO) for clients.
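The auto-archival rule can be illustrated with a small sketch. This is not the service's implementation: the `ReadSet` record, the `storage_class` function, and the sample data are hypothetical, and the only detail taken from the text is the 30-day idle threshold.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: the 30-day idle period described in the article. The managed
# service applies its own rule internally; this only illustrates the idea.
ARCHIVE_AFTER = timedelta(days=30)


@dataclass
class ReadSet:
    """Hypothetical record for one stored sequence file."""
    name: str
    file_format: str      # e.g. "FASTQ", "BAM", "CRAM"
    last_accessed: date


def storage_class(read_set: ReadSet, today: date) -> str:
    """Return "archive" if the data has been idle for over 30 days, else "active"."""
    idle = today - read_set.last_accessed
    return "archive" if idle > ARCHIVE_AFTER else "active"


today = date(2023, 3, 1)
read_sets = [
    ReadSet("tumor_sample", "FASTQ", date(2023, 2, 20)),   # idle for 9 days
    ReadSet("normal_sample", "BAM", date(2023, 1, 5)),     # idle for 55 days
]
classes = {rs.name: storage_class(rs, today) for rs in read_sets}
print(classes)
```

In this sketch only `normal_sample` would be moved to the archive class, since it is the only item idle for more than 30 days.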

Bioinformatics workflows

By providing managed compute resources, Amazon Omics eliminates the heavy lifting of running large-scale bioinformatics workflows such as gene expression analysis or variant calling. Once the client specifies the workflow definition, the tools to use, and the data to analyze, Amazon Omics provisions the necessary infrastructure. The workflow engine supports two domain-specific workflow languages, Nextflow and Workflow Description Language (WDL). Clients can also control who has access to specific workflows and prioritize workflow runs through run groups.
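As a rough sketch of what starting a run could look like from Python, the helper below only assembles the request and does not call AWS. The keyword names are assumed to follow the `start_run` operation of boto3's `omics` client and should be checked against the current SDK documentation. The helper name, all IDs, the role ARN, the S3 paths, and the workflow parameter name are hypothetical placeholders.

```python
from typing import Any, Dict, Optional


def build_start_run_request(
    workflow_id: str,
    role_arn: str,
    output_uri: str,
    parameters: Dict[str, Any],
    run_group_id: Optional[str] = None,
    priority: Optional[int] = None,
    name: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a workflow run request (hypothetical helper)."""
    request: Dict[str, Any] = {
        "workflowId": workflow_id,   # the registered Nextflow or WDL workflow
        "roleArn": role_arn,         # IAM role the run executes under
        "outputUri": output_uri,     # where results are written
        "parameters": parameters,    # inputs defined by the workflow itself
    }
    # Only include optional fields that were actually provided.
    optional = {"runGroupId": run_group_id, "priority": priority, "name": name}
    request.update({key: value for key, value in optional.items() if value is not None})
    return request


# All values below are placeholders, not real resources.
request = build_start_run_request(
    workflow_id="1234567",
    role_arn="arn:aws:iam::111122223333:role/ExampleOmicsRunRole",
    output_uri="s3://example-bucket/results/",
    parameters={"sample_fastq": "s3://example-bucket/reads/sample_R1.fastq.gz"},
    run_group_id="7654321",
    priority=10,
)
print(sorted(request))

# With credentials configured, the request could then be submitted, e.g.:
#   import boto3
#   boto3.client("omics").start_run(**request)
```

Keeping request assembly separate from the network call makes the run-group and priority settings easy to review before anything is submitted.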


Analytics

In addition to secure, low-cost storage and support for bioinformatics workflows at scale, Amazon Omics lets research teams and scientists ingest and transform genomic data into query-ready schemas for use with AWS analytics services such as Amazon Athena and Amazon SageMaker. Amazon Omics analytics stores can import (g)VCF-formatted files for variant data (data from an individual sample) and VCF, GFF, and TSV/CSV files for genomic annotations (known information about positions in the genome).
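The format rules in the paragraph above can be captured in a small lookup, for example to sort files before import. This is a sketch under assumptions: the function names are hypothetical, and the file-extension conventions (`.g.vcf`, `.gvcf`, `.gff3`, a trailing `.gz`) are common practice rather than something the service is confirmed to require.

```python
from pathlib import Path
from typing import Set

# Mapping taken from the article: (g)VCF for variant data;
# VCF, GFF, and TSV/CSV for genomic annotations.
FORMAT_TO_STORES = {
    "gvcf": {"variant"},
    "vcf": {"variant", "annotation"},
    "gff": {"annotation"},
    "tsv": {"annotation"},
    "csv": {"annotation"},
}


def detect_format(filename: str) -> str:
    """Guess the file format from its extension (assumed naming conventions)."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    if suffixes and suffixes[-1] == ".gz":      # ignore gzip compression
        suffixes = suffixes[:-1]
    if not suffixes:
        raise ValueError(f"cannot determine format of {filename!r}")
    if suffixes[-2:] == [".g", ".vcf"] or suffixes[-1] == ".gvcf":
        return "gvcf"
    ext = suffixes[-1].lstrip(".")
    if ext == "gff3":
        ext = "gff"
    if ext not in FORMAT_TO_STORES:
        raise ValueError(f"unsupported format for {filename!r}")
    return ext


def candidate_stores(filename: str) -> Set[str]:
    """Return the kinds of analytics store that accept this file's format."""
    return FORMAT_TO_STORES[detect_format(filename)]


for name in ["sample1.g.vcf.gz", "cohort.vcf", "genes.gff3", "clinvar.tsv"]:
    print(name, sorted(candidate_stores(name)))
```

A plain VCF maps to both store kinds because the text lists it for variants and for annotations, so the caller still has to decide which one applies.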

Key benefits

The market for bioinformatics analysis platforms is becoming crowded, with Google’s Cloud Life Sciences and platforms such as Galaxy, DNAnexus, and BaseSpace. Amazon Omics stands out through its managed-service approach, which takes the complexity of provisioning, managing, and scaling infrastructure out of the equation and lets scientists focus on what they do best. Other benefits include:

  • Built-in security, privacy and compliance. For AWS, security is the highest priority. The new service is built with robust safeguards in mind like encryption by default, AWS-owned and customer-managed KMS keys, identity and access management, and more. In terms of compliance, Amazon Omics is GDPR compliant and HIPAA eligible.
  • Population-level scale. Amazon Omics is designed to support large-scale population data analyses and collaborative research while offering simplified billing models.
  • Multiomics and multimodal analysis. The new service allows researchers to prepare omics data for analysis and bring together multimodal data like electronic health records and imaging to generate deeper insights.

Amazon Omics: Top use cases

Although Amazon Omics launched only a few months ago, its merits have already been recognized by many leading healthcare and life sciences organizations, such as The Children’s Hospital of Philadelphia, G42 Healthcare, Ovation, Element Biosciences, Sentieon, C2i Genomics, and others. Amazon Omics’ clients use the platform to boost productivity and make workflows more efficient, whether they are building customizable genomic pipelines or processing sequencing data at scale.

Accelerating clinical trials and drug discovery

The Human Genome Project, completed in 2003, transformed the drug development process and spurred significant growth in genomics-based drug research. By bringing genomic data into clinical trials, scientists can better understand the efficacy of new drugs, account for variability in drug response, and optimize treatment strategies. Amazon Omics makes integrating multiomics data easier and more cost-effective while also supporting the data provenance that regulatory authorities require.

Delivering personalized care

To deliver the best care possible, it is critical to get a comprehensive, 360-degree view of a patient. Amazon Omics helps healthcare and life science professionals bring together various omics data, imaging, and clinical records, powering multimodal and multiomics analysis. Armed with these in-depth insights, healthcare providers can discover early signs of an underlying condition, determine a patient’s risk of developing a disease, and come up with a personalized treatment plan.

Enabling collaborative research

Amazon Omics also creates an ecosystem that streamlines collaboration between research groups and clinical teams since they don’t have to set up complex infrastructure from the ground up. And with built-in access controls, researchers can share anonymized omics data in a more secure manner.

Wrapping up

For years, Amazon has been supporting healthcare and life sciences with purpose-built services and solutions that improve operational and clinical efficiency, improve patient outcomes, and decrease the cost of care. Amazon Omics, Amazon’s latest big bet on bioinformatics, is another managed service that aims to take the heavy lifting out of technical processes such as setting up and maintaining infrastructure for bioinformatics workflows and large-scale multiomics analysis, so that clinicians and scientists can focus on research.

For over 30 years, Kanda has been helping digital health startups and established healthcare providers deliver high-quality, HIPAA-compliant solutions. Throughout this time, we have accumulated in-depth expertise around major healthcare challenges, from implementing cloud-based healthcare systems to integrating AI-powered healthcare analytics to ensuring regulatory compliance and interoperability.

Talk to our experts to discuss the details of your healthcare project.
