Merkle DAGs inherit the distributability of CIDs. Using content addressing for DAGs has several interesting consequences for their distribution. The first, of course, is that anybody who has a DAG is capable of acting as a provider for that DAG. The second is that when we’re retrieving data encoded as a DAG, like a directory of files, we can request all of a node’s children in parallel, potentially from a number of different providers, because each child is identified by its own CID rather than by a location. The third is that file servers are not limited to centralized data centers, giving our data greater reach. Finally, because each node in a DAG has its own CID, the DAG it represents can be shared and retrieved independently of any DAG it is itself embedded in.
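To make the parallel, multi-provider retrieval described above more concrete, here is a minimal sketch in Python. It is a toy model and not how IPFS is implemented: `toy_cid` is just a SHA-256 hex digest (a real CID also carries version, codec, and multihash information), providers are plain dictionaries, and every name in it (`toy_cid`, `make_node`, `fetch_block`, `fetch_dag`) is hypothetical and chosen for illustration only.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor


def toy_cid(block: bytes) -> str:
    # Assumption: a bare SHA-256 digest stands in for a real CID.
    return hashlib.sha256(block).hexdigest()


def make_node(data: str, links: list) -> bytes:
    # A node is its payload plus the toy CIDs of its children.
    return json.dumps({"data": data, "links": links}, sort_keys=True).encode()


def fetch_block(cid: str, providers: list) -> bytes:
    # Ask each provider in turn; accept a block only if it hashes to the
    # CID we asked for, so an untrusted provider cannot substitute data.
    for store in providers:
        block = store.get(cid)
        if block is not None and toy_cid(block) == cid:
            return block
    raise KeyError(cid)


def fetch_dag(root_cid: str, providers: list, max_workers: int = 4) -> dict:
    # Walk the DAG level by level, fetching every node of a level in parallel.
    blocks = {}
    seen = {root_cid}
    frontier = [root_cid]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            fetched = list(pool.map(lambda c: fetch_block(c, providers), frontier))
            next_frontier = []
            for cid, block in zip(frontier, fetched):
                blocks[cid] = block
                for child in json.loads(block)["links"]:
                    if child not in seen:
                        seen.add(child)
                        next_frontier.append(child)
            frontier = next_frontier
    return blocks


# Build a tiny "directory" DAG with two "files".
file_a = make_node("contents of a.txt", [])
file_b = make_node("contents of b.txt", [])
a_cid, b_cid = toy_cid(file_a), toy_cid(file_b)
directory = make_node("my-directory", [a_cid, b_cid])
dir_cid = toy_cid(directory)

# Three providers: one serves a tampered block, the others hold parts of the DAG.
tampered = {b_cid: b"not the real block"}
provider_1 = {dir_cid: directory, a_cid: file_a}
provider_2 = {a_cid: file_a, b_cid: file_b}
providers = [tampered, provider_1, provider_2]

whole_dag = fetch_dag(dir_cid, providers)   # the directory and both files
just_file_a = fetch_dag(a_cid, providers)   # a sub-DAG, retrieved on its own
```

In this sketch no single provider holds the whole DAG, and the tampered copy is rejected because it does not hash to the requested identifier. The last line illustrates the final point of the paragraph: a child node can be fetched by its own identifier without any reference to the directory that contains it.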
As an example, consider the distribution of a large, popular, scientific dataset. Today, on the location-addressed web:
Merkle DAGs help us alleviate all of these problems. By distributing the dataset as a content-addressed DAG:
All of this works to promote scalable and redundant access to this important data.