Data Packages

Transform Data Chaos into Reusable Assets

Flexible, interoperable data packages that compound value over time

Give teams the power to define their own processes, use their own datatypes, and innovate as quickly as they want. Gain organization-wide data findability while building reliable, trustworthy, version-controlled data products.

Example package: ccle-test-3/SRR8788981 @ 2f07364e26

  • SRR8788981.runinfo_ftp.tsv (1.6 kB)
  • SRX5578768_SRR8788981_1.fastq.gz (66.5 kB)
  • SRX5578768_SRR8788981_2.fastq.gz (66.2 kB)

90% faster data lookup (Resilience)

What Are Data Packages?

Data packages are intelligent manifests that combine pointers to data (like objects in Amazon S3) with rich context about that data, including lineage, metadata, and revision history.
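To make the idea of a manifest concrete, here is a minimal sketch in plain Python. It is an illustration only, not Quilt's actual manifest format or SDK: the `Entry` and `Manifest` classes, the field names, the bucket, and the package names are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Entry:
    logical_key: str   # file name as seen inside the package
    physical_key: str  # pointer to where the bytes live, e.g. an S3 URI
    size: int          # size in bytes
    sha256: str        # hex checksum of the file contents


@dataclass
class Manifest:
    name: str
    metadata: dict = field(default_factory=dict)  # context and lineage
    entries: list = field(default_factory=list)   # pointers, not copies

    def to_json(self) -> str:
        # asdict() recurses into the list of Entry dataclasses
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Hypothetical file contents, used only to produce a real checksum
payload = b"@read1\nACGT\n+\nFFFF\n"

manifest = Manifest(
    name="example/rnaseq-run",
    metadata={"assay": "RNA-seq", "derived_from": "example/raw-reads"},
    entries=[
        Entry(
            logical_key="reads_1.fastq",
            physical_key="s3://example-bucket/raw/reads_1.fastq",
            size=len(payload),
            sha256=hashlib.sha256(payload).hexdigest(),
        )
    ],
)

print(manifest.to_json())
```

The point of the sketch is that the manifest holds pointers and context, not the data itself, so the files can stay where they already live.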

If you can't find the data, you can't reproduce the analysis.

Data without context is not reusable. Traditional file storage separates data from its meaning, making collaboration and discovery nearly impossible.

Intelligent Manifests

Combine pointers to data with rich context and lineage

Self-Contained

Everything needed to understand and reproduce the analysis

Versioned

Track every change with cryptographic integrity

Discoverable

Find and access data through metadata and search

Built for AI

Packages are where AI in life sciences starts

A model is only as good as the data it can find and trust. A Quilt package bundles your data with its metadata and a versioned, cryptographic history. That gives models and agents the context they need, and lets you trace any result back to the exact data that produced it.

Packages you can verify

Data, metadata, and an immutable version history in one addressable unit. Every output traces back to an exact dataset, so it stays reproducible and audit-ready for regulated work.
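As a rough illustration of what "verify" can mean in practice, the sketch below recomputes SHA-256 checksums and compares them with the values recorded in a manifest. It uses only the Python standard library. The `checksum` and `verify` helpers and the in-memory "files" are assumptions for the example, not part of the Quilt SDK.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def verify(manifest: dict, files: dict) -> list:
    """Return logical keys whose bytes are missing or no longer match the manifest."""
    problems = []
    for key, expected in manifest.items():
        data = files.get(key)
        if data is None or checksum(data) != expected:
            problems.append(key)
    return problems


# Hypothetical package contents, held in memory for the example
original = {
    "counts.tsv": b"gene\tcount\nTP53\t10\n",
    "README.md": b"# Example\n",
}

# The manifest records one checksum per logical key
manifest = {key: checksum(data) for key, data in original.items()}

# Simulate a silent edit to one file after the manifest was written
tampered = {**original, "counts.tsv": b"gene\tcount\nTP53\t11\n"}

print(verify(manifest, original))  # []
print(verify(manifest, tampered))  # ['counts.tsv']
```

Because the recorded checksums do not change, any later edit to the underlying bytes shows up as a mismatch.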

Qurator: search in plain English

Ask for data the way you'd ask a colleague. Qurator searches your governed catalog and returns the right packages, scoped to what each person is allowed to see. No ontology expertise needed.
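The sketch below illustrates the scoping idea only: results are filtered by what a user may see before any matching happens. Simple keyword matching stands in for natural-language search, and nothing here reflects how Qurator is implemented. The `PackageRecord` type, the `search` function, and the bucket and package names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PackageRecord:
    name: str
    bucket: str  # where the package lives; used here as the permission boundary
    metadata: dict = field(default_factory=dict)


def search(catalog, allowed_buckets, terms):
    """Return names of visible packages whose metadata mentions every term."""
    hits = []
    for pkg in catalog:
        if pkg.bucket not in allowed_buckets:
            continue  # permission filter runs before any matching
        haystack = " ".join(f"{k} {v}" for k, v in pkg.metadata.items()).lower()
        if all(term.lower() in haystack for term in terms):
            hits.append(pkg.name)
    return hits


catalog = [
    PackageRecord("oncology/rnaseq-2024", "bucket-a", {"assay": "RNA-seq", "tissue": "lung"}),
    PackageRecord("neuro/imaging", "bucket-b", {"assay": "microscopy", "tissue": "brain"}),
    PackageRecord("oncology/rnaseq-restricted", "bucket-c", {"assay": "RNA-seq", "tissue": "lung"}),
]

# This user can read bucket-a and bucket-b, but not bucket-c
print(search(catalog, {"bucket-a", "bucket-b"}, ["rna-seq", "lung"]))
```

Filtering on permissions first means a restricted package never enters the candidate set, regardless of how well it matches the query.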

Bring your own model (MCP)

Connect Claude, ChatGPT, or any MCP-compatible agent to your Quilt data with per-user OAuth. Models can read, visualize, and build on your data, and they only ever see what that user can.

From instrument to model, your data stays in your AWS account: governed, versioned, and ready to use.

What data packages give you

Data packages pair familiar data management with cloud-native storage, so your data stays reliable, easy to find, and ready to reuse.

Data Provenance

Immutable hash-based version history — audit-ready for GxP and 21 CFR Part 11

Open Source

Rich Visualizations

Document previews, dashboards, and in-browser charts anchored to package versions

Platform

Powerful Search

Qurator natural-language search across metadata and file contents

Platform

Team Collaboration

Web catalog, role-scoped permissions, and shareable package URIs

Platform
See it in action

Browse packages with full context

Every package combines your files, README, and metadata in one view — backed by S3 in your AWS account, never copied elsewhere.

Package catalog

Browse your data like objects in S3, with context

Every package is a self-contained unit: data, a README, rich metadata, and previews. Explore the tree, read the docs, and see exactly what's inside before you pull a byte.

  • File tree, README, and key/value metadata in one view
  • In-browser previews for images, tables, and notebooks
  • Backed by your own S3, so the data never moves

Built for Your Infrastructure

Data packages work with your existing cloud infrastructure and analysis platforms, giving your data a vendor-neutral foundation.

Amazon Web Services

Native S3 integration with advanced AWS technology partnership

Integration

The Data Package Lifecycle

From creation to collaboration

Data packages follow a simple workflow. Start with the free Python SDK for basic packaging, or use the full platform for team collaboration.

  1. Create: Bundle data with metadata (SDK) or use the web interface (Platform)
  2. Version: Track changes with SHA-256 checksums
  3. Share: Collaborate across teams and platforms
  4. Discover: Access via SDK commands or rich web search (Platform)
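The four steps above can be sketched with a toy in-memory registry. This is a conceptual illustration using only the Python standard library, not the Quilt SDK or its storage format. The `Registry` class, its `push`, `browse`, and `find` methods, and the package name are hypothetical.

```python
import hashlib
import json


class Registry:
    """Toy registry: an append-only list of revisions per package name."""

    def __init__(self):
        self._revisions = {}  # name -> list of (top_hash, manifest)

    def push(self, name, files, metadata, message):
        # Create: record one SHA-256 checksum per file, plus metadata
        entries = {key: hashlib.sha256(data).hexdigest() for key, data in files.items()}
        content = {"entries": entries, "metadata": metadata}
        # Version: the revision ID is a hash of the manifest content itself
        top_hash = hashlib.sha256(json.dumps(content, sort_keys=True).encode()).hexdigest()
        manifest = {**content, "message": message}
        self._revisions.setdefault(name, []).append((top_hash, manifest))
        return top_hash

    def browse(self, name, top_hash=None):
        # Share: anyone holding name + hash resolves the same revision
        revisions = self._revisions[name]
        if top_hash is None:
            return revisions[-1][1]  # latest revision
        for digest, manifest in revisions:
            if digest.startswith(top_hash):  # full hash or short prefix, first match
                return manifest
        raise KeyError(f"{name}@{top_hash} not found")

    def find(self, key, value):
        # Discover: match on the metadata of each package's latest revision
        return sorted(
            name
            for name, revisions in self._revisions.items()
            if revisions[-1][1]["metadata"].get(key) == value
        )


registry = Registry()
reads = b"@read1\nACGT\n+\nFFFF\n"
h1 = registry.push("example/run-1", {"reads.fastq": reads}, {"assay": "RNA-seq"}, "initial data")
h2 = registry.push(
    "example/run-1",
    {"reads.fastq": reads, "qc.txt": b"pass\n"},
    {"assay": "RNA-seq"},
    "add QC report",
)

old = registry.browse("example/run-1", h1[:10])  # pin to the first revision
latest = registry.browse("example/run-1")
print(sorted(old["entries"]), sorted(latest["entries"]))
print(registry.find("assay", "RNA-seq"))
```

Earlier revisions stay addressable by hash after new ones are pushed, which is what lets an analysis pin the exact data it used.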

Real-World Applications

See how teams across biotech and life sciences use data packages to accelerate discovery and ensure reproducibility.

Genomics Research

Package sequencing data with sample metadata for reproducible analysis pipelines

Key Benefit: Reproducible Analysis
Example: FASTQ, VCF, BAM files + sample metadata

Measured impact

Outcomes teams see with Quilt

  • 90% faster data lookup (Resilience)
  • NGS analysis throughput (Tessera)
  • Weeks → minutes from instrument to AI-ready package
  • 30+ biotech & pharma teams, incl. Allen Institute, Inari

Trusted by life-sciences organizations

Data lookups that used to take our scientists days now take minutes, with a single, governed source of truth the whole team can trust.
Data Platform team, Resilience (90% faster data lookup)

From Expendable Resource to Reusable Asset

AI and machine learning make it possible to answer far broader questions than lab data was originally collected for. To stay competitive in biotech, using data beyond its original scope is no longer a nice-to-have. It's an imperative.

Beyond the lab

Extend the rigor of your data beyond instruments, spreadsheets, and scattered hard drives.

Reusable assets

Versioned packages turn one-off datasets into durable, shareable assets your team can trust.

AI-ready by default

Governed, contextual data that your models and agents can actually use.