Sharding + Data Availability Sampling: A Scalable Solution for Ethereum 2.0


Introduction

Sharding is one of the most significant innovations in Ethereum 2.0 (eth2), alongside Proof of Stake (PoS). This proposal outlines a focused implementation called "data sharding", designed to store and verify the availability of approximately 250 kB of data per shard. By ensuring data availability, sharding provides a secure and high-throughput foundation for Layer 2 solutions like rollups.

So that no single node has to download all of this data, we combine two techniques:

  1. Randomly sampled committees for attestations.
  2. Data Availability Sampling (DAS) for lightweight verification.

Randomly Sampled Committees Explained

Suppose the chain must handle 16 MB of data per slot (eth2's initial capacity), split into 64 blobs of 256 kB each, and the PoS system has 6,400 validators. How do we verify the data without either of the following?

  1. Requiring every validator to download everything.
  2. Allowing an attacker with only a few validators to compromise the system.

The Committee Approach

The Problem

If validators are assigned to committees in index order, an attacker who controls a run of consecutive validators (e.g., 1971-2070) could dominate a single committee with only ~1.5% of all validators, and that committee could then approve an invalid blob.
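To make the numbers concrete, here is a minimal sketch that compares index-order committees with shuffled ones, using the 6,400 validators and 64 blobs from the example. The function names are hypothetical, and Python's `random.Random` only stands in for whatever shared randomness the protocol actually uses.

```python
import random

NUM_VALIDATORS = 6400
NUM_BLOBS = 64
COMMITTEE_SIZE = NUM_VALIDATORS // NUM_BLOBS  # 100 validators per blob

def sequential_committees():
    """Committee k gets validators k*100 .. k*100+99 (the vulnerable layout)."""
    return [list(range(k * COMMITTEE_SIZE, (k + 1) * COMMITTEE_SIZE))
            for k in range(NUM_BLOBS)]

def shuffled_committees(seed):
    """Shuffle all validator indices first, then cut the list into committees."""
    rng = random.Random(seed)  # assumption: stand-in for the protocol's randomness
    order = list(range(NUM_VALIDATORS))
    rng.shuffle(order)
    return [order[k * COMMITTEE_SIZE:(k + 1) * COMMITTEE_SIZE]
            for k in range(NUM_BLOBS)]

def worst_committee_share(committees, attacker):
    """Largest fraction of any single committee held by the attacker."""
    return max(sum(v in attacker for v in c) for c in committees) / COMMITTEE_SIZE

# 100 consecutive validators, about 1.56% of the 6,400 total.
attacker = set(range(1971, 2071))

print(worst_committee_share(sequential_committees(), attacker))
print(worst_committee_share(shuffled_committees(seed=0), attacker))
```

With index-order assignment, the attacker holds 71 of the 100 seats in one committee. After shuffling, the same 100 validators are scattered across all 64 committees, so a majority in any one of them becomes extremely unlikely.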

Solution: Random Sampling


Data Availability Sampling (DAS) Demystified

DAS flips the committee model: clients sample data within blobs instead of across blobs.
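The sampling idea can be sketched as follows. This is an illustration under stated assumptions, not the protocol's client logic: a blob is modelled as a list of booleans (chunk available or not), the client draws chunk indices uniformly at random, and `sample_blob` and `false_accept_bound` are hypothetical names.

```python
import random

def sample_blob(available, num_samples, rng):
    """Accept the blob only if every randomly chosen chunk can be fetched."""
    total = len(available)
    for _ in range(num_samples):
        index = rng.randrange(total)  # uniform random chunk index
        if not available[index]:
            return False              # one missing sample is enough to reject
    return True

def false_accept_bound(available_fraction, num_samples):
    """Chance that all independent samples land on available chunks."""
    return available_fraction ** num_samples

rng = random.Random(42)
full_blob = [True] * 512
print(sample_blob(full_blob, 30, rng))   # a fully published blob always passes
print(false_accept_bound(0.5, 30))       # equals 2**-30, roughly 9.3e-10
```

The point of the second function is that a client does not need many samples: if less than half of the chunks are available, 30 independent samples all succeed with probability below one in a billion.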

How It Works

Why It’s Secure


Erasure Coding: A Safety Net

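To defend against partial data releases, where an attacker publishes only 50–99% of a blob, we use erasure coding: the blob is extended so that any 50% of the extended data is enough to reconstruct all of it.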
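As a minimal sketch of the erasure-coding idea, the snippet below treats the data as evaluations of a polynomial over a prime field, doubles the number of evaluations, and recovers the original from any half of them. The modulus, the function names, and the slow Lagrange interpolation are assumptions chosen for readability; a production design would use a much larger field and FFT-based methods.

```python
P = 2**31 - 1  # prime modulus, chosen for illustration only

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through `points`, working mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def extend(data):
    """Treat data[i] as the value at x=i and append the values at x=n..2n-1."""
    n = len(data)
    points = list(enumerate(data))
    return list(data) + [lagrange_eval(points, x) for x in range(n, 2 * n)]

def reconstruct(samples, n):
    """Recover the original n values from any n (index, value) samples."""
    if len(samples) < n:
        raise ValueError("need at least half of the extended data")
    points = samples[:n]
    return [lagrange_eval(points, x) for x in range(n)]

data = [5, 17, 42, 7]
extended = extend(data)                              # 8 values, first 4 are the data
surviving = [(i, extended[i]) for i in (1, 4, 6, 7)]  # any 4 of the 8
print(reconstruct(surviving, len(data)) == data)
```

Because any half of the extended values determines the polynomial, an attacker who wants to make even one chunk unrecoverable has to withhold more than half of the extended data, which random sampling detects quickly.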

Kate Commitments

Replace Merkle roots with polynomial commitments (e.g., Kate commitments) to prove correct evaluations without complex fraud proofs.
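For reference, the standard Kate (KZG) evaluation proof has the shape below. The notation is the usual one from the KZG literature rather than from this article: p is the polynomial encoding the data, s is the secret from the trusted setup, [x]_1 and [x]_2 are the encodings of x in the two pairing groups, and e is the pairing. The claim being proven is p(z) = y.

```latex
C = [p(s)]_1, \qquad
\pi = \left[ \frac{p(s) - y}{s - z} \right]_1, \qquad
e\big(\pi,\ [s - z]_2\big) = e\big(C - [y]_1,\ [1]_2\big)
```

The quotient (p(X) - y)/(X - z) is a polynomial only when p(z) = y, so a valid proof π can only be produced for a correct evaluation. That is why no separate fraud proof is needed to show that a sample is consistent with the commitment.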


Committee vs. DAS: A Hybrid Approach

Why Committees Aren’t Enough

Why DAS Needs Committees


Data Availability’s Role in Ethereum

Key Reads

Why BitTorrent/IPFS Fall Short

Critical Insight: BitTorrent can’t achieve consensus on data availability, leaving room for attacks.


P2P Layer Mechanics

Subnet Architecture

Blob Broadcast Process

  1. Head: Sent to the global subnet.
  2. Body: Sent to the relevant horizontal subnet.
  3. Sample Distribution: Peers propagate samples to vertical subnets.
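The three broadcast steps can be sketched as a routing plan. This is a hypothetical illustration: the subnet naming scheme, the `broadcast_plan` function, and the rule that sample i goes to vertical subnet i are assumptions, since the article does not spell out the mapping.

```python
def broadcast_plan(shard, samples):
    """Return (subnet, payload) pairs for publishing one blob."""
    plan = [
        ("global", f"head of shard {shard}"),              # step 1: head
        (f"horizontal/{shard}", f"body of shard {shard}"),  # step 2: body
    ]
    # Step 3: each sample goes to the vertical subnet for its index
    # (assumption: one vertical subnet per sample index).
    for index, _ in enumerate(samples):
        plan.append((f"vertical/{index}", f"sample {index} of shard {shard}"))
    return plan

for subnet, payload in broadcast_plan(shard=3, samples=[b"a", b"b", b"c"]):
    print(subnet, "<-", payload)
```

Under this layout, a node subscribed to one horizontal subnet sees whole blobs for a single shard, while a node subscribed to one vertical subnet sees the same sample position across every shard.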

Self-Healing Unpublished Blobs

  1. Reverse Distribution: Vertical → horizontal subnets.
  2. Reconstruction: With ≥50% samples, anyone rebuilds the blob.
  3. Redistribution: Push the reconstructed blob.

Beacon Chain Integration

Low Validator Counts

Below 262,144 validators? Rotate shard assignments to maintain committee sizes (e.g., 50 shards per slot).
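The threshold is consistent with 64 shards × 32 slots per epoch × 128 validators per committee = 262,144. Those three constants are eth2 parameters that this article does not state, and the round-robin rotation below is an illustrative assumption, not the specified rule. Under those assumptions, a sketch of the calculation looks like this:

```python
SLOTS_PER_EPOCH = 32         # assumption: eth2 parameter, not stated in the text
TARGET_COMMITTEE_SIZE = 128  # assumption: eth2 parameter, not stated in the text
MAX_SHARDS = 64

def active_shards_per_slot(num_validators):
    """How many full-size committees fit into one slot, capped at MAX_SHARDS."""
    per_slot = num_validators // (SLOTS_PER_EPOCH * TARGET_COMMITTEE_SIZE)
    return max(1, min(MAX_SHARDS, per_slot))

def shards_for_slot(slot, num_validators):
    """Hypothetical round-robin rotation so every shard gets regular turns."""
    count = active_shards_per_slot(num_validators)
    start = (slot * count) % MAX_SHARDS
    return [(start + i) % MAX_SHARDS for i in range(count)]

print(active_shards_per_slot(262_144))   # 64: every shard, every slot
print(active_shards_per_slot(204_800))   # 50: matches the example above
print(shards_for_slot(1, 204_800)[:3])   # [50, 51, 52]
```

With 204,800 validators, only 50 full-size committees fit into each slot, so the remaining shards wait for the next slot instead of being served by undersized committees.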


Economic Design

Security Assumptions


FAQ

1. Can sharding add execution later?

Yes. This design is forward-compatible (e.g., via fraud proofs or SNARKs).

2. Why combine committees and DAS?

Committee redundancy mitigates risks while DAS scales efficiently.

3. How does erasure coding improve security?

It ensures clients can reconstruct full data if ≥50% is available, preventing partial-data attacks.



Additional Resources

Disclaimer: ECN translations aim to bridge the language gap—always refer to original sources for authoritative content.