On Using A Blockchain For Archiving Data
One of the main advantages of using a blockchain is the redundancy of storage that comes from distributing the ledger among many nodes. That redundancy has drawbacks, the main one being that storage is expensive. Given the question being asked, we will not discuss that cost; instead we will focus on the impact of this distributed characteristic on the preservation of content, data integrity and access control.
Data added to a blockchain is copied to every full node on the network. That means that if one or more nodes disappear, many other copies of the data still exist. Nodes remain and keep running the blockchain as long as they have incentives to do so. Not all current blockchains will still exist in twenty years. Many of them, even the main ones, are still young and considered experiments. Some of those experiments will succeed, some will fail. But if a blockchain chosen for archiving loses support and stops functioning at some point (or suffers a consensus attack in which the chain is rewritten), this fact will be known, and the archived data can be moved to another blockchain that is still supported. Even if a blockchain is no longer supported, the integrity of the data that was put in it remains as strong as the work (PoW) or stake (PoS) that was dedicated to the chain up to the point the data was included. The integrity of that data will not be in question, and the data can safely be moved to another chain.
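To illustrate why data already included in a chain is hard to alter unnoticed, here is a minimal sketch of a hash-linked list of blocks. It is a toy model only: there is no proof of work, no stake and no network consensus, and the names block_hash, build_chain and first_invalid are hypothetical helpers invented for this illustration. It shows just the linking mechanism, in which changing an old record invalidates that block, and repairing that block's hash in turn invalidates the next one.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block (assumption)

def block_hash(prev_hash: str, payload: bytes) -> str:
    """Hash a block from the previous block's hash and its own payload."""
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()

def build_chain(payloads):
    """Link the payloads into a list of blocks, each pointing at its predecessor."""
    chain, prev = [], GENESIS
    for payload in payloads:
        h = block_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": h})
        prev = h
    return chain

def first_invalid(chain):
    """Return the index of the first inconsistent block, or None if all links hold."""
    prev = GENESIS
    for i, block in enumerate(chain):
        link_ok = block["prev"] == prev
        hash_ok = block_hash(block["prev"], block["payload"]) == block["hash"]
        if not (link_ok and hash_ok):
            return i
        prev = block["hash"]
    return None

chain = build_chain([b"archived record", b"later block 1", b"later block 2"])
ok = first_invalid(chain)

# Tamper with the archived record: block 0 no longer matches its stored hash.
chain[0]["payload"] = b"altered record"
broken_at = first_invalid(chain)

# Recompute block 0's hash to hide the change: the link from block 1 now breaks.
chain[0]["hash"] = block_hash(chain[0]["prev"], chain[0]["payload"])
broken_after_rehash = first_invalid(chain)
```

In a real PoW or PoS chain, repairing every later block would additionally require redoing the work or controlling the stake behind those blocks, which this sketch does not model.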
If the data is confidential and access to it must be restricted, it cannot be included openly in a public blockchain. There are two possible ways to use the blockchain, each with its drawbacks. The first would be to encrypt the data with a secret key and to include the encrypted data in the blockchain. The data would then be stored on every full node, and only users holding the right key(s) could decrypt it. The problem here is that the required keys would be kept by only a limited number of users, who have no financial incentive to keep them. If the required keys were lost, access to the data would be lost forever. So this is not a very durable way to keep the data. The second way would be to keep the data on private servers and include only the hash of the data in the blockchain. That would guarantee the integrity of the data, but would offer no solution for guaranteeing its preservation.
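The second approach can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 is assumed as the hash function, the document content is made up, and the step of actually writing the digest into a transaction is left out because it depends on the chain chosen. The helper names digest and verify are hypothetical.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the archived bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, anchored_digest: str) -> bool:
    """Check a copy retrieved from the private server against the on-chain digest."""
    return digest(data) == anchored_digest

# Hypothetical document kept on a private server.
document = b"Minutes of the board meeting"

# Only this 64-character value would be published on the blockchain.
anchored = digest(document)

# Later: an unchanged copy verifies, a modified copy does not.
intact = verify(document, anchored)
tampered = verify(document + b" (amended)", anchored)
```

The digest proves that a given copy is the original, but it cannot be used to rebuild the document: if the private copy is lost, only the proof remains.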
Although expensive, keeping an unencrypted version of the data in a blockchain is a good way to archive it, guaranteeing both its preservation and its integrity. But those advantages exist because of the public nature of a blockchain, and that same public nature leaves no room for flexible access control.