What metadata does Tarsnap store?
Normal user data is split into chunks which are (on average) ~64 kB before compression. The actual length of chunks is variable so that if a file is modified in the middle, the chunk boundaries will re-align quickly (on average after 1.6 chunks) so that existing chunks can be re-used.
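The re-alignment property comes from choosing chunk boundaries based on the data itself rather than on fixed offsets. The following is a minimal sketch of that general idea (content-defined chunking with a rolling hash), not Tarsnap's actual algorithm. The names `split_chunks`, `WINDOW`, `MASK`, `BASE` and `MOD` are hypothetical, and the parameters are toy-sized (~1 KiB average chunks rather than ~64 kB) to keep the example fast.

```python
import random

# All constants are illustrative assumptions, not Tarsnap's parameters.
WINDOW = 32            # bytes of context hashed at each position
MASK = (1 << 10) - 1   # cut when the low 10 hash bits are zero: ~1 KiB average
BASE = 257
MOD = (1 << 61) - 1
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def split_chunks(data: bytes) -> list[bytes]:
    """Cut data wherever a rolling hash of the last WINDOW bytes matches MASK."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # add the newest byte
        if i + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                 # trailing partial chunk
    return chunks


# Insert ten bytes into the middle of some random data and compare chunks.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(200_000))
edited = original[:100_000] + b"ten bytes!" + original[100_000:]

old_chunks = set(split_chunks(original))
new_chunks = split_chunks(edited)
changed = [c for c in new_chunks if c not in old_chunks]
print(len(new_chunks), "chunks after the edit,", len(changed), "not seen before")
```

Because each boundary depends only on the preceding `WINDOW` bytes, boundaries beyond the edited region fall at the same places in the content, so only the chunks touching the edit are new and the rest can be re-used.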
Tarsnap stores the following metadata:
- A 512-byte tar header for each file. After these headers are passed through the deduplication and compression layers, each header is ~50-100 bytes.
- Lists of ~1600 data blocks (i.e. which blocks belong to which files). These lists are also deduplicated and compressed.
- A list of the lists of blocks, which is not deduplicated.
- The archive name, time, command-line which created the archive, etc. This is ~2 kB before compression.
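As a rough worked example, the figures above can be turned into a back-of-the-envelope estimate. This sketch assumes 1 kB = 1000 bytes, reads "~1600 data blocks" as the size of each list, and ignores deduplication between files and archives. The function name `estimate_metadata` and the example inputs are hypothetical.

```python
AVG_CHUNK_BYTES = 64_000   # "~64 kB", taking 1 kB = 1000 bytes (assumption)
BLOCKS_PER_LIST = 1600     # "~1600 data blocks" per list (assumed reading)
HEADER_BYTES = (50, 100)   # per-file tar header after dedup and compression


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def estimate_metadata(num_files: int, total_bytes: int) -> dict:
    """Rough counts only; real archives deduplicate, so these are upper-ish bounds."""
    chunks = ceil_div(total_bytes, AVG_CHUNK_BYTES)
    return {
        "chunks": chunks,
        "block_lists": ceil_div(chunks, BLOCKS_PER_LIST),
        "header_bytes_low": num_files * HEADER_BYTES[0],
        "header_bytes_high": num_files * HEADER_BYTES[1],
    }


# Hypothetical archive: 10,000 files totalling 10 GB.
estimate = estimate_metadata(num_files=10_000, total_bytes=10 * 10**9)
print(estimate)
```

For this hypothetical archive the estimate is 156,250 chunks grouped into 98 block lists, plus 0.5 to 1 MB of tar headers.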
Privacy note
All of the above data is encrypted with your private keys to prevent anybody from infringing on your privacy; even the names of archives are protected.
Tarsnap maintains a local cache which tracks the blocks it has previously uploaded; querying the server for every block would be much too slow, and would reveal more information to the server about how archives are changing over time. The cache directory is ~0.5% of the size of the data you store.
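To illustrate the role of the cache, here is a minimal sketch of a local "already uploaded" index, together with the ~0.5% size estimate. It does not reflect Tarsnap's real cache format: the class `LocalChunkCache`, the function `estimated_cache_bytes`, and the use of plain SHA-256 as a block identifier are all assumptions made for illustration.

```python
import hashlib


class LocalChunkCache:
    """Hypothetical in-memory index of blocks that were already uploaded."""

    def __init__(self) -> None:
        self._seen: set[bytes] = set()

    def filter_new(self, chunks: list[bytes]) -> list[bytes]:
        """Return only the chunks not seen before, recording them as seen."""
        new = []
        for chunk in chunks:
            digest = hashlib.sha256(chunk).digest()  # illustrative identifier
            if digest not in self._seen:
                self._seen.add(digest)
                new.append(chunk)
        return new


def estimated_cache_bytes(stored_bytes: int) -> int:
    """Apply the ~0.5% rule of thumb for the cache directory size."""
    return stored_bytes * 5 // 1000


cache = LocalChunkCache()
first_upload = cache.filter_new([b"alpha", b"beta", b"alpha"])
second_upload = cache.filter_new([b"beta", b"gamma"])
print(len(first_upload), len(second_upload))
print(estimated_cache_bytes(100 * 10**9))
```

The decision about what to upload is made entirely from local state, so no per-block query reaches the server. By the ~0.5% figure, 100 GB of stored data corresponds to a cache directory of roughly 500 MB.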