Data Availability

Simple Definition of Data Availability
  • Data availability means that a block's transaction data has been published so that nodes can download it.
  • "Data availability" and the "data availability problem" are terms used to refer to a specific problem faced in various blockchain scaling strategies.
  • The data availability problem asks: when a new block is produced, how can nodes be sure that all of the data in that block was actually published to the network?
    • The dilemma is that if a block producer doesn't release all of the data in a block, no one can detect whether a malicious transaction is hidden within that block.
  • For more information about data availability, this post by Celestia Labs co-founder Mustafa Al-Bassam is a good place to start.
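The dilemma described above can be illustrated with a minimal sketch. It assumes a toy block whose header carries a single hash over all transactions; the names commitment and is_available and the sample transactions are hypothetical, and real chains use richer commitments than this flat hash. The point is only that a node holding the header cannot check anything about data it was never given.

```python
import hashlib
from typing import List, Optional


def commitment(transactions: List[bytes]) -> str:
    """Hash every transaction into one digest (a toy stand-in for a block data commitment)."""
    outer = hashlib.sha256()
    for tx in transactions:
        outer.update(hashlib.sha256(tx).digest())
    return outer.hexdigest()


def is_available(header_commitment: str, published: List[Optional[bytes]]) -> bool:
    """Accept a block only if all of its data was published and matches the header."""
    if any(tx is None for tx in published):
        # Withheld data: the node has nothing to hash, so nothing can be verified.
        return False
    return commitment(published) == header_commitment


# Hypothetical transactions, used only for illustration.
txs = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
header = commitment(txs)

full_block_ok = is_available(header, txs)
withheld_block_ok = is_available(header, [txs[0], None, txs[2]])
```

Here full_block_ok is True and withheld_block_ok is False: the second block might hide anything in the missing slot, so the only safe response is to treat it as unavailable.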
Longer Definition of Data Availability
  • Data availability refers to the availability of the transactions in a block that is being appended to the tip of the chain. During consensus, validators download the block to verify its availability. If any of the block's transactions are withheld by the validator that proposed it, the block is unavailable and will be rejected.
  • Put another way, data availability is the condition of whether transaction data was made available for nodes to download when a block was proposed.
    • Verifying data availability is the only way to prevent data withholding, a devastating attack that breaks the fundamental security of any blockchain. If a block is proposed whose underlying data is unavailable, the rest of the network cannot confirm the validity of the transactions in the block and cannot perform a state transition using the update from the proposed block.
  • In traditional blockchains, data availability is verified by requiring full nodes to download all the block data. This approach does not scale, because the download cost for every node grows with the size of the block. Hence the need for specialized schemes such as data availability sampling, which allow nodes to verify data availability without downloading the entire block.
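The intuition behind sampling can be sketched with a small calculation. This is an illustration under stated assumptions, not a description of any particular protocol: the block is assumed to be split into equal chunks, each sample is an independent uniform draw, and the producer is assumed to have to withhold a large fraction of the chunks to hide anything (real schemes arrange this with erasure coding, which this text does not cover). The names miss_probability and sample_block are hypothetical.

```python
import random
from typing import List, Optional


def miss_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that `samples` independent random draws all land on available chunks."""
    return (1.0 - withheld_fraction) ** samples


def sample_block(chunks: List[Optional[bytes]], samples: int, rng: random.Random) -> bool:
    """Request `samples` random chunks; return False as soon as one is missing."""
    for _ in range(samples):
        index = rng.randrange(len(chunks))
        if chunks[index] is None:
            return False  # a withheld chunk was hit: treat the block as unavailable
    return True


# Toy block of 256 chunks with half of them withheld (None marks a withheld chunk).
rng = random.Random(0)
chunks: List[Optional[bytes]] = [f"chunk-{i}".encode() for i in range(256)]
for i in rng.sample(range(256), 128):
    chunks[i] = None

looks_available = sample_block(chunks, 20, rng)
p_miss = miss_probability(0.5, 20)  # 0.5 ** 20, roughly one in a million
```

With half the chunks withheld, 20 samples fail to notice with probability 0.5 ** 20, so a node gains high confidence while downloading only a small part of the block.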
Data Availability vs Data Retrievability
  • Data availability is only concerned with the availability of a block while it is being proposed by a validator. Once the block has completed the consensus process, has been appended to the tip of the chain, and has propagated throughout the network, the ability to download transactions from that block is what we call retrievability.
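    • This distinction is important because the two are different problems: availability is about whether the data was published when the block was proposed, while retrievability is about whether that data can still be downloaded afterwards.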