Transform Data Chaos into Reusable Assets
Give teams the power to define their own processes, use their own data types, and innovate as quickly as they want. Gain organization-wide data findability while building reliable, trustworthy, version-controlled data products.
INFINITE POTENTIAL: THE DATA REVOLUTION
A PARADIGM SHIFT
DATA: REIMAGINED
If you can't find the data, you can't reproduce the analysis.
What Are Data Packages?
Data packages are intelligent manifests that combine pointers to data (like objects in Amazon S3) with rich context about that data, including lineage, metadata, and revision history.
Data without context is not reusable. Traditional file storage separates data from its meaning, making collaboration and discovery nearly impossible.
Intelligent Manifests
Combine pointers to data with rich context and lineage
Self-Contained
Everything needed to understand and reproduce the analysis
Versioned
Track every change with cryptographic integrity
Discoverable
Find and access data through metadata and search
What Makes Data Packages Powerful
Data packages combine the best of traditional data management with modern cloud-native capabilities, creating a foundation for reliable, discoverable, and reusable data assets.
Data Packaging
Create versioned packages with metadata using the free Python SDK
Rich Visualizations
Document previews, dashboards
Powerful Search
Metadata-driven discovery
Team Collaboration
Web interface, permissions
Built for Your Infrastructure
Data packages integrate seamlessly with your existing cloud infrastructure and analysis platforms, providing a vendor-neutral foundation for your data assets.
Amazon Web Services
Native S3 integration with advanced AWS technology partnership
The Data Package Lifecycle
Data packages follow a simple yet powerful workflow. Start with the free Python SDK for basic packaging, or use the full platform for advanced collaboration.
Create
Bundle data with metadata (SDK) or use web interface (Platform)
Version
Track changes with SHA-256 checksums
Share
Collaborate across teams and platforms
Discover
Access via SDK commands or rich web search (Platform)
Starter
Perfect for individual researchers
Real-World Applications
See how teams across biotech and life sciences use data packages to accelerate discovery and ensure reproducibility.
Genomics Research
Package sequencing data with sample metadata for reproducible analysis pipelines
From Expendable Resource to Reusable Asset
AI and machine learning are creating opportunities to answer much broader questions than lab data was originally collected for. To be competitive in biotech, leveraging data beyond its original scope is no longer just nice to have; it is imperative.
AI/ML Ready
Data packages provide the structured, contextualized data that AI models need to excel

Why build data packages?
Data is more powerful with context
Data without context is not reusable
LINKED DATA IS REUSABLE DATA


What Are Data Packages?
Data packages track not only their own versions but also the versions of the underlying data, so teams can review every version of every document in a package. Teams can pull a specific version of a package and run it through pipelines to test repeatability, or send a colleague a link to a historical package to ensure they are iterating on the same data.
Data Packages Defined

Data with context
Data packages offer a streamlined approach to data management by storing both data and metadata together in object storage. This integrated method contrasts with traditional practices where metadata and versioning information are stored separately in databases, while the actual data resides in object stores. By consolidating data and its contextual information within the same storage unit, data packages enhance the integrity and coherence of data management, ensuring that context and content are always aligned and readily accessible.
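To make "data and metadata together" concrete, here is a minimal sketch of a manifest that pairs pointers to objects with their context. It is an illustration only, not Quilt's actual manifest format: the `Entry` and `Manifest` classes, the field names, the bucket `example-bucket`, and the sample metadata are all assumptions made for this example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Entry:
    """One file in the package: a pointer to the bytes plus its own context."""
    logical_key: str   # name of the file inside the package
    physical_key: str  # where the bytes live (hypothetical S3 URI)
    size: int          # size in bytes
    sha256: str        # checksum of the file content
    meta: dict = field(default_factory=dict)


@dataclass
class Manifest:
    """Package-level context plus the list of entries."""
    name: str
    message: str
    meta: dict
    entries: list

    def to_jsonl(self) -> str:
        """Serialize as JSON lines: one header line, then one line per entry."""
        header = {"name": self.name, "message": self.message, "meta": self.meta}
        lines = [json.dumps(header, sort_keys=True)]
        lines += [json.dumps(asdict(e), sort_keys=True) for e in self.entries]
        return "\n".join(lines)


def make_entry(logical_key: str, content: bytes, meta: dict) -> Entry:
    """Build an entry for some content; the bucket name is a placeholder."""
    return Entry(
        logical_key=logical_key,
        physical_key=f"s3://example-bucket/raw/{logical_key}",
        size=len(content),
        sha256=hashlib.sha256(content).hexdigest(),
        meta=meta,
    )


manifest = Manifest(
    name="genomics/run-42",
    message="Initial sequencing upload",
    meta={"assay": "rna-seq", "instrument": "sequencer-01"},
    entries=[
        make_entry("reads.fastq", b"@r1\nACGT\n+\nFFFF\n", {"sample": "S1"}),
        make_entry("samples.csv", b"id,tissue\nS1,liver\n", {"schema": "v1"}),
    ],
)

# The manifest is a small text document, so it can be written to the same
# object store as the data it describes.
print(manifest.to_jsonl())
```

Because the serialized manifest is plain text, context and content can sit side by side in the same bucket instead of being split between a database and an object store.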


Deeply Versioned
A significant advantage of data packages is deep versioning, backed by a SHA-256 checksum for each revision. Because the checksum is derived from the content itself, it gives every version an effectively unique identifier, letting users track changes, verify that data has not been altered, and revert to a previous version if necessary. This versioning makes data reliable across iterations.
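The idea of a checksum per revision can be sketched with Python's standard `hashlib`. This is a simplified illustration under stated assumptions, not the hashing scheme the SDK actually uses: the `revision_hash` function, the way entries are ordered and encoded, and the sample files are all hypothetical.

```python
import hashlib
import json


def file_sha256(data: bytes) -> str:
    """SHA-256 checksum of one file's content, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def revision_hash(checksums: dict, meta: dict) -> str:
    """Hash the metadata and every (file name, checksum) pair in sorted order.

    Sorting makes the result independent of insertion order, so the same
    content always yields the same revision identifier.
    """
    h = hashlib.sha256()
    h.update(json.dumps(meta, sort_keys=True).encode("utf-8"))
    for key in sorted(checksums):
        h.update(json.dumps([key, checksums[key]]).encode("utf-8"))
    return h.hexdigest()


# Two revisions of the same package: only samples.csv changes.
v1_files = {"reads.fastq": b"ACGT", "samples.csv": b"id,tissue\n1,liver\n"}
v2_files = {**v1_files, "samples.csv": b"id,tissue\n1,liver\n2,kidney\n"}

history = []  # list of (revision hash, per-file checksums)
for files in (v1_files, v2_files):
    checksums = {name: file_sha256(data) for name, data in files.items()}
    history.append((revision_hash(checksums, {"assay": "rna-seq"}), checksums))

for rev, checksums in history:
    print(rev[:12], sorted(checksums))

# Which files differ between the two revisions?
changed = [k for k in history[0][1] if history[0][1][k] != history[1][1][k]]
print("changed:", changed)
```

Storing the per-file checksums alongside the revision hash is what lets a reader see not just that a package changed, but which files changed, and check that a downloaded file still matches the recorded version.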
Flexible Metadata
Data packages excel in accommodating diverse metadata needs through their flexible metadata schema. Unlike rigid systems that enforce a single, uniform metadata schema across all datasets, data packages allow teams to capture and manage metadata in a way that best suits their specific requirements. This flexibility ensures that relevant details are preserved and easily accessible without the need for a one-size-fits-all approach, thus supporting a wide range of data use cases and applications.


Interoperable Data
Interoperability is a crucial feature of data packages: it allows data to be integrated and shared across different platforms. Because data is stored in an open-source, customer-owned storage system, a package can attach data from several platforms (for example, Platform A and Platform B) while remaining compatible with each. This open approach lets data be used across diverse systems and environments, fostering collaboration and data exchange.
Easily Accessed
Data packages are designed to be easily accessible, with features like SQL querying and faceted search to enhance data retrieval. These functionalities allow teams to efficiently locate and extract the data packages they need, regardless of the cloud environment they are using. The ability to perform advanced searches and queries ensures that users can quickly find relevant datasets and integrate them into their workflows, significantly improving data accessibility and usability.
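As a rough illustration of metadata-driven search, the sketch below loads package metadata into an in-memory SQLite table and runs a filter query and a facet count. It does not show the platform's actual query engine or schema: the `package_meta` table, the `find` and `facet` helpers, and the sample packages are assumptions made for this example.

```python
import sqlite3

# Hypothetical packages, each with free-form metadata.
packages = [
    ("genomics/run-41", {"assay": "rna-seq", "tissue": "liver"}),
    ("genomics/run-42", {"assay": "rna-seq", "tissue": "kidney"}),
    ("imaging/plate-7", {"assay": "microscopy", "tissue": "liver"}),
]

# One row per (package, key, value), so packages need not share a schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE package_meta (package TEXT, key TEXT, value TEXT)")
for name, meta in packages:
    conn.executemany(
        "INSERT INTO package_meta VALUES (?, ?, ?)",
        [(name, key, str(value)) for key, value in meta.items()],
    )


def find(**filters):
    """Return packages whose metadata matches every key=value filter."""
    if not filters:
        raise ValueError("at least one filter is required")
    clauses = " OR ".join("(key = ? AND value = ?)" for _ in filters)
    params = [x for key, value in filters.items() for x in (key, str(value))]
    rows = conn.execute(
        f"SELECT package FROM package_meta WHERE {clauses} "
        "GROUP BY package HAVING COUNT(*) = ? ORDER BY package",
        params + [len(filters)],
    )
    return [row[0] for row in rows]


def facet(key):
    """Count packages per value of one metadata key (a search facet)."""
    rows = conn.execute(
        "SELECT value, COUNT(*) FROM package_meta "
        "WHERE key = ? GROUP BY value ORDER BY value",
        (key,),
    )
    return dict(rows.fetchall())


print(find(assay="rna-seq"))
print(find(assay="rna-seq", tissue="liver"))
print(facet("tissue"))
```

The key/value layout is one possible design choice: it keeps the search index flexible enough to hold whatever metadata each team chooses to record, at the cost of storing every value as text.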


Data packages aren't theoretical
Build data packages with Quilt TODAY



