Data Availability
The guarantee that transaction data is published and accessible so anyone can verify state transitions and detect fraud.
Key Takeaways
- Data availability is the guarantee that block data has been published so any participant can download it, verify rollup state transitions, and submit fraud proofs if needed.
- Ethereum introduced blob transactions (EIP-4844) in March 2024 to provide cheaper DA for Layer 2 rollups, while dedicated DA layers like Celestia, EigenDA, and Avail offer alternative tradeoffs between cost, throughput, and trust assumptions.
- For Bitcoin L2s, data availability is more constrained: solutions range from embedding data in Taproot witness fields to using external DA layers, each with distinct security and cost tradeoffs.
What Is Data Availability?
Data availability (DA) refers to the confidence that when a block producer proposes a new block, the underlying transaction data has actually been published and is accessible to all network participants. A node should never have to trust a block producer's claim that data exists: it must be able to verify this independently.
On a monolithic blockchain like Bitcoin or Ethereum L1, full nodes achieve this by downloading complete block data and re-executing every transaction. The challenge arises with light clients (which do not download full blocks) and with Layer 2 rollups, which execute transactions off-chain and post only compressed data back to the base layer. If a rollup sequencer withholds that data, the entire security model breaks down.
It is worth distinguishing data availability from data retrievability. DA concerns whether data was published at the time a block was produced: enough to verify correctness and submit challenges. Retrievability concerns long-term historical access. Ethereum's blob data, for example, is available for roughly 18 days before being pruned: sufficient for rollup security, but not permanent archival.
How It Works
The data availability problem is straightforward to state: how can network participants verify that all data in a proposed block was actually published, without requiring every node to download every byte? The solutions depend on where and how the data is stored.
On-Chain Data Availability
The most direct approach is posting transaction data directly on the base layer. On Ethereum, rollups historically used calldata for this purpose, paying roughly 16 gas per non-zero byte. This provided the strongest security guarantee (data inherits Ethereum's full consensus) but was expensive: rollup transaction fees often exceeded $0.20 per transaction.
EIP-4844 (Proto-Danksharding), activated on March 13, 2024 as part of Ethereum's Dencun upgrade, introduced a new transaction type carrying data "blobs." Each blob holds 128 KB of data (4,096 field elements of 32 bytes each). At launch, Ethereum targeted 3 blobs per block with a maximum of 6, providing up to 768 KB of DA per block (6 x 128 KB). The Pectra upgrade (May 2025) raised the target to 6 blobs with a maximum of 9, and later upgrades built around PeerDAS are intended to scale blob capacity further.
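The capacity figures above are simple arithmetic on the blob parameters. The short sketch below reproduces them; the constant and function names are illustrative, and 1 KB is taken to mean 1,024 bytes.

```python
FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32

# One blob: 4,096 field elements of 32 bytes each = 131,072 bytes.
BLOB_BYTES = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT


def max_da_kb(max_blobs: int) -> int:
    """Maximum blob data per block in KB (1 KB = 1,024 bytes in this sketch)."""
    return max_blobs * BLOB_BYTES // 1024


print(BLOB_BYTES // 1024)  # 128  (KB per blob)
print(max_da_kb(6))        # 768  (launch maximum of 6 blobs)
print(max_da_kb(9))        # 1152 (Pectra maximum of 9 blobs)
```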
Blobs differ from calldata in three important ways: their contents are not accessible to the EVM (contracts can read only a commitment hash for each blob), they are temporary (pruned after 4,096 epochs, roughly 18 days), and they have their own independent fee market modeled on EIP-1559. This separation means blob fees do not compete with regular execution gas. After EIP-4844 went live, median fees on rollups like Arbitrum, Optimism, and Base dropped from tens of cents to fractions of a cent: a reduction of roughly 90-100x.
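The retention window of roughly 18 days can be derived from the 4,096-epoch figure. The sketch below assumes Ethereum's usual consensus timing of 32 slots per epoch and 12 seconds per slot, which the text above does not state explicitly.

```python
SECONDS_PER_SLOT = 12    # assumed Ethereum slot time
SLOTS_PER_EPOCH = 32     # assumed slots per epoch
RETENTION_EPOCHS = 4096  # blob retention period from the text


def retention_days(epochs: int = RETENTION_EPOCHS) -> float:
    """Convert a retention period in epochs to days."""
    seconds = epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
    return seconds / 86_400


print(round(retention_days(), 1))  # 18.2
```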
Data Availability Sampling
Data availability sampling (DAS) is the key innovation for scaling DA without proportionally increasing node requirements. Instead of downloading full blocks, nodes sample small random pieces and use statistical guarantees to verify availability.
The process relies on erasure coding (specifically Reed-Solomon encoding). Block data is expanded using a polynomial: the original data is evaluated at additional points, typically doubling its size with redundant parity information. A critical property follows: with a 2x extension, any 50% of the extended dataset is enough to reconstruct the original data. To make even a single transaction unrecoverable, a block producer must therefore withhold more than 50% of the extended dataset, and withholding on that scale is easy to detect through random sampling.
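A toy Reed-Solomon code makes the redundancy concrete. The sketch below is illustrative only: it works over a small prime field with naive Lagrange interpolation, whereas production systems use much larger fields and faster algorithms. The modulus and function names are assumptions chosen for the example.

```python
P = 2**31 - 1  # illustrative prime modulus, not a production parameter


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def extend(data):
    """Treat data[i] as the polynomial's value at x = i, then evaluate at x = 0..2k-1."""
    k = len(data)
    points = list(enumerate(data))
    return [lagrange_eval(points, x) for x in range(2 * k)]


def recover(known, k):
    """Rebuild the k original values from any k surviving {index: value} entries."""
    if len(known) < k:
        raise ValueError("fewer than k chunks survive: data is unrecoverable")
    points = list(known.items())[:k]
    return [lagrange_eval(points, x) for x in range(k)]


data = [5, 17, 42, 99]
extended = extend(data)  # 8 chunks: the 4 originals followed by 4 parity values
survivors = {i: extended[i] for i in (1, 4, 6, 7)}  # half of the chunks are missing
assert recover(survivors, len(data)) == data
```

Any 4 of the 8 extended chunks recover the original data here, which is why an attacker has to hide more than half of the extended data to make anything unrecoverable.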
- Block data is erasure-coded to create an extended dataset with redundancy
- Light nodes download small, randomly selected chunks of the extended data
- Each chunk comes with a cryptographic proof (a Merkle proof or a KZG opening proof) confirming that it belongs to the committed block data
- If all sampled chunks are present and valid, the node concludes with high statistical confidence that the full data is available
If a block producer has withheld half of the extended data, each random sample succeeds with probability at most 1/2. Downloading just 100 random chunks therefore yields a false-positive probability of approximately 10^-30 (that is, (1/2)^100), making it virtually impossible for a withholding attack to go undetected. Adding more light nodes also increases overall network security: the more participants who sample, the more of the extended data gets checked.
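The confidence figure can be reproduced with a few lines of arithmetic. This sketch assumes that samples are independent and that an attacker makes at most half of the extended chunks available; the function names are illustrative.

```python
import math


def undetected_probability(samples: int, available_fraction: float = 0.5) -> float:
    """Chance that every one of `samples` independent random draws hits an available chunk."""
    return available_fraction ** samples


def samples_needed(target: float, available_fraction: float = 0.5) -> int:
    """Smallest sample count that pushes the undetected probability to `target` or below."""
    return math.ceil(math.log(target) / math.log(available_fraction))


print(undetected_probability(100))  # about 7.9e-31
print(samples_needed(1e-9))         # 30
```

Because the probability shrinks exponentially with the number of samples, a light node gains very high confidence from a download that is tiny compared with the full block.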
DAS is not yet live on Ethereum L1 (it is planned for full Danksharding), but Celestia and Avail already implement it in production.
Full Danksharding
Full Danksharding is Ethereum's long-term plan to scale blob capacity dramatically. Named after researcher Dankrad Feist, it would expand to 64 blobs per block (roughly 8 MB of data per block) and require DAS so validators can verify availability by sampling rather than downloading all blobs. It also depends on proposer-builder separation (PBS), where specialized block builders handle the computationally intensive task of assembling blob-heavy blocks. Full Danksharding remains several years away on Ethereum's roadmap.
Off-Chain DA Solutions
Dedicated DA layers offer an alternative to posting data on Ethereum L1. These systems are purpose-built for data publication and verification, trading some trust assumptions for significantly lower costs and higher throughput.
Celestia
Celestia, launched on October 31, 2023, is the first modular blockchain designed exclusively for data availability and consensus. It separates execution from DA, allowing rollups to use Celestia purely for ordering and storing transaction data.
Celestia arranges block data into a square matrix and applies 2D Reed-Solomon erasure coding, extending it into a larger square with parity data. Each rollup gets its own namespace via Namespaced Merkle Trees (NMTs), meaning a rollup only downloads its own data rather than the entire block. Light nodes perform DAS by randomly sampling coordinates from the extended data square: after 7 rounds of sampling, they achieve 99% confidence that no data has been withheld.
Celestia produces blocks every 6 seconds and supports blocks up to 128 MB after its Matcha upgrade, with a roadmap targeting 1 GB blocks.
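The idea behind a namespaced Merkle tree can be sketched as an ordinary Merkle tree whose nodes also carry the range of namespaces beneath them. The code below is a simplified illustration, not Celestia's actual encoding: the 8-byte namespace IDs, the hashing layout, and all names are assumptions, and the leaf count must be a power of two.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    min_ns: int    # smallest namespace ID under this node
    max_ns: int    # largest namespace ID under this node
    digest: bytes


def _ns(n: int) -> bytes:
    return n.to_bytes(8, "big")  # 8-byte IDs are a simplification for this sketch


def leaf(ns: int, data: bytes) -> Node:
    return Node(ns, ns, hashlib.sha256(b"\x00" + _ns(ns) + data).digest())


def parent(left: Node, right: Node) -> Node:
    # The namespace range is hashed into the node, so it cannot be misreported.
    payload = b"\x01" + b"".join(
        _ns(n.min_ns) + _ns(n.max_ns) + n.digest for n in (left, right)
    )
    return Node(min(left.min_ns, right.min_ns),
                max(left.max_ns, right.max_ns),
                hashlib.sha256(payload).digest())


def build(leaves):
    """Build a tree over leaves sorted by namespace; leaf count must be a power of two."""
    level = list(leaves)
    while len(level) > 1:
        level = [parent(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def may_contain(node: Node, ns: int) -> bool:
    """A subtree can be skipped when ns falls outside its committed namespace range."""
    return node.min_ns <= ns <= node.max_ns


leaves = [leaf(1, b"rollup-a tx1"), leaf(1, b"rollup-a tx2"),
          leaf(2, b"rollup-b tx1"), leaf(3, b"rollup-c tx1")]
left, right = build(leaves[:2]), build(leaves[2:])
root = parent(left, right)
print(may_contain(left, 1), may_contain(right, 1))  # True False
```

A rollup in namespace 1 only needs the left subtree here; the range committed in the right subtree shows that it holds none of that rollup's data.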
EigenDA
EigenDA, launched in April 2024, takes a different approach. Rather than running a separate blockchain, it operates as an Actively Validated Service (AVS) on EigenLayer, secured by restaked ETH from Ethereum validators. Operators validate, store, and serve data chunks in exchange for service payments. If they fail to store data properly, their restaked ETH faces slashing.
EigenDA uses erasure coding with KZG polynomial commitments: each operator stores only a fraction of the full blob data, achieving O(1/n) storage per node. EigenDA V2 (July 2025) claims 100 MB/s throughput, but it does not support DAS. End users cannot independently verify data availability: security relies on the economic guarantees of restaked collateral and committee attestation rather than public verifiability.
Avail
Avail, originally developed within Polygon Labs and launched independently on July 23, 2024, combines KZG commitments with data availability sampling. Because the commitments let clients check sampled data directly, verification does not have to wait out a fraud-proof challenge period, and light clients can still verify DA by sampling. Avail light clients can run on phones and resource-constrained devices, adding resilience even during full node outages.
Data Availability for Bitcoin L2s
Bitcoin was not designed with rollup DA in mind. It lacks native blob transactions and has a more constrained scripting system than Ethereum. Bitcoin L2s therefore face unique challenges in ensuring data availability.
Embedding Data on Bitcoin
There are two primary methods for posting data directly to the Bitcoin blockchain:
- OP_RETURN outputs: historically limited to 80 bytes of arbitrary data per output, far too small for meaningful rollup DA. Bitcoin Core 30 (2025) raised the default data carrier size limit to 100,000 bytes, though the practical constraint remains the 100,000 vbyte (400,000 weight unit) standardness limit on transaction size.
- Taproot witness data: the Taproot upgrade enabled embedding arbitrary data in the witness portion of transactions using inscription envelopes. Witness data benefits from the SegWit discount (each witness byte counts as 1 weight unit, versus 4 for each non-witness byte) and can theoretically fill most of a block's 4 million weight unit limit, or close to 4 MB.
Both methods are significantly more constrained than Ethereum blobs. A single Ethereum blob (128 KB) holds more data than most practical Bitcoin on-chain DA transactions, and Ethereum targets multiple blobs per block.
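The SegWit weight accounting behind these limits can be sketched as follows. The 400,000 weight unit figure is the standardness limit equivalent to 100,000 vbytes; the 150-byte non-witness overhead in the example is a hypothetical value, and the sketch ignores inscription envelope overhead such as script opcodes and push-size limits.

```python
import math

MAX_STANDARD_TX_WEIGHT = 400_000  # relay-policy limit, equal to 100,000 vbytes
MAX_BLOCK_WEIGHT = 4_000_000      # consensus block weight limit


def tx_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    """Non-witness bytes count 4 weight units each, witness bytes count 1."""
    return 4 * non_witness_bytes + witness_bytes


def vsize(non_witness_bytes: int, witness_bytes: int) -> int:
    """Virtual size in vbytes: weight divided by 4, rounded up."""
    return math.ceil(tx_weight(non_witness_bytes, witness_bytes) / 4)


def max_witness_payload(non_witness_bytes: int, limit: int = MAX_STANDARD_TX_WEIGHT) -> int:
    """Witness bytes that still fit under `limit` (envelope overhead ignored)."""
    return limit - 4 * non_witness_bytes


print(vsize(200, 1000))          # 450
print(max_witness_payload(150))  # 399400 (hypothetical 150-byte overhead)
```

Even at the standardness limit, a single transaction carries under 400 KB of witness data, which is about three Ethereum blobs.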
Hybrid and External Approaches
Many Bitcoin L2s adopt hybrid strategies: posting only state commitments or data hashes to Bitcoin for finality and anchoring, while storing full transaction data on an external DA layer such as Celestia. This sacrifices some of Bitcoin's security guarantees in exchange for practical throughput. Projects like Spark use their own off-chain data structures to maintain state while anchoring to Bitcoin for settlement security. For a deeper comparison, see the Bitcoin Layer 2 comparison and Bitcoin second-layer scaling landscape research articles.
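A minimal sketch of the anchoring half of a hybrid design: only a fixed-size hash of the batch goes to Bitcoin, while the batch itself lives elsewhere. The payload layout, the `DA01` tag, and the function names are hypothetical and do not describe any specific project's format.

```python
import hashlib

OP_RETURN_LEGACY_LIMIT = 80  # bytes, the historical per-output default


def anchor_payload(batch: bytes, tag: bytes = b"DA01") -> bytes:
    """Hypothetical anchor: a 4-byte tag followed by the SHA-256 digest of the batch."""
    return tag + hashlib.sha256(batch).digest()


def matches_anchor(batch: bytes, payload: bytes, tag: bytes = b"DA01") -> bool:
    """Check that a batch fetched from an external DA layer matches the on-chain anchor."""
    return payload == anchor_payload(batch, tag)


batch = b"compressed rollup transactions ..." * 1000
payload = anchor_payload(batch)
print(len(payload) <= OP_RETURN_LEGACY_LIMIT)  # True
print(matches_anchor(batch, payload))          # True
print(matches_anchor(batch + b"x", payload))   # False
```

The anchor proves integrity, not availability: if the external layer withholds the batch, the hash on Bitcoin cannot bring it back, which is the security tradeoff described above.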
Cost Implications
DA cost is typically the largest component of rollup operating expenses. Where and how data is posted directly determines end-user transaction fees.
| DA Method | Approximate Cost | Security Model |
|---|---|---|
| Ethereum calldata | Highest (16 gas per non-zero byte, competing with execution gas) | Full Ethereum consensus |
| Ethereum blobs (EIP-4844) | 90-100x cheaper than calldata | Ethereum consensus, separate fee market |
| Celestia | Significantly cheaper than Ethereum blobs | Own PoS chain (100 validators) |
| EigenDA | Low (restaking economic model) | Restaked ETH, committee attestation |
| Avail | Low | Own NPoS chain, KZG + DAS |
The tradeoff is consistent: lower DA costs come with weaker security assumptions. Ethereum blobs inherit the full security of 900,000+ validators. A dedicated DA layer like Celestia operates with 100 validators. EigenDA relies on economic slashing guarantees rather than public verifiability. Each rollup must decide where on this spectrum its users' needs fall.
Use Cases
- Optimistic rollups: DA ensures that anyone can reconstruct the rollup state and submit fraud proofs during the challenge period. Without it, a malicious sequencer could publish invalid state roots that no one can challenge.
- ZK-rollups: even with validity proofs guaranteeing correct state transitions, users still need access to the underlying data to determine their balances and construct withdrawal proofs.
- Validiums: rollup variants that post only state commitments on-chain and store data off-chain (using a DA committee or external DA layer), trading security for lower costs.
- Sovereign rollups: chains that use a DA layer like Celestia for data ordering and availability but handle their own settlement and execution, without relying on a separate settlement layer.
Risks and Considerations
Withholding Attacks
The core risk: a block producer or rollup sequencer publishes a valid-looking block header but withholds the underlying transaction data. Without the data, verifiers cannot confirm the block's validity or generate fraud proofs. DAS mitigates this by making withholding statistically detectable, but only on networks that implement it.
Trust Assumptions
Off-chain DA solutions introduce additional trust assumptions beyond the base layer. A rollup using Celestia for DA depends on Celestia's validator set (currently 100 validators) remaining honest. EigenDA depends on restaked economic guarantees and a centralized disperser operated by EigenLabs. These are meaningful tradeoffs that users and rollup developers should evaluate carefully.
Temporary Availability
Modern DA solutions prioritize availability at block production time, not permanent storage. Ethereum blobs are pruned after roughly 18 days. Celestia data is available for a limited window. If historical data is needed beyond this period (for example, to sync a new node), it must come from alternative sources: archive nodes, rollup operators, or decentralized storage networks.
Cost Volatility
Blob fees on Ethereum follow a dynamic pricing model. During periods of high demand, blob fees can spike significantly as rollups compete for limited blob space. The fee increases exponentially when blob usage exceeds the target, meaning costs can be unpredictable for rollup operators and their users.
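The exponential behavior can be sketched with the pricing rule from EIP-4844. The constants below are the values from the original EIP-4844 launch configuration (target of 3 blobs per block); later upgrades changed some of them, so treat the numbers as illustrative rather than current.

```python
MIN_BASE_FEE_PER_BLOB_GAS = 1              # wei
BLOB_BASE_FEE_UPDATE_FRACTION = 3_338_477  # EIP-4844 launch value
GAS_PER_BLOB = 2**17
TARGET_BLOB_GAS_PER_BLOCK = 3 * GAS_PER_BLOB


def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e**(numerator / denominator) via a Taylor series."""
    i = 1
    output = 0
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator


def blob_base_fee(excess_blob_gas: int) -> int:
    """Base fee per unit of blob gas, in wei, for a given running excess."""
    return fake_exponential(MIN_BASE_FEE_PER_BLOB_GAS, excess_blob_gas,
                            BLOB_BASE_FEE_UPDATE_FRACTION)


def next_excess(parent_excess: int, parent_blob_gas_used: int) -> int:
    """Excess accumulates whenever a block uses more blob gas than the target."""
    return max(0, parent_excess + parent_blob_gas_used - TARGET_BLOB_GAS_PER_BLOCK)


excess = 0
for _ in range(100):  # 100 consecutive full blocks of 6 blobs each
    excess = next_excess(excess, 6 * GAS_PER_BLOB)
print(blob_base_fee(0), blob_base_fee(excess))  # starts at 1 wei, ends on the order of 10**5 wei
```

Under these parameters each full block raises the fee by about 12.5%, so a sustained burst of demand compounds quickly, while blocks below target let the excess and the fee decay again.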