Have you considered using a content-addressed scheme, where attachment objects are referenced by a hash of the binary object, e.g. SHA1 (like the git scheme: http://progit.org/book/ch9-0.html)?

In a content-addressed scheme many of the issues of syncing directories go away (including n-level deep directories).

It might look something like this:
- The top level XML document refers to each attachment by the SHA1 hash of its content.
- The .oo3 package directory has a subdirectory called "attachments".
- Each attachment is stored in a file where the file name is the SHA1 hash of the file content.
- If a multi-level deep structure is required, some of the "attachments" could be XML files that refer to other attachments, and so on.
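
A minimal Python sketch of the layout above, just to make it concrete. The function names and the write-to-tmp-then-rename step are my own assumptions for illustration, not anything the .oo3 format defines today:

```python
import hashlib
from pathlib import Path


def store_attachment(package_dir: Path, data: bytes) -> str:
    """Store data as attachments/<sha1 of data> and return the hash (hypothetical helper)."""
    digest = hashlib.sha1(data).hexdigest()
    attachments = package_dir / "attachments"
    attachments.mkdir(parents=True, exist_ok=True)
    target = attachments / digest
    if not target.exists():
        # Write under a temporary name outside "attachments", then rename,
        # so a half-written file never appears under a valid hash name.
        tmp = package_dir / (digest + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(target)
    return digest


def load_attachment(package_dir: Path, digest: str) -> bytes:
    """Read an attachment by name and check that its content matches the name."""
    data = (package_dir / "attachments" / digest).read_bytes()
    if hashlib.sha1(data).hexdigest() != digest:
        raise ValueError("attachment content does not match its name: " + digest)
    return data
```

The top-level XML document would then only need to record the string returned by store_attachment. Storing the same bytes twice is a no-op, because the name is derived from the content.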

The following properties arise:
- It is always ok to blindly sync the contents of the "attachments" subdirectory in either direction. There is no risk of overwriting anything because two files with the same name are guaranteed to have the same content.
- The top level document can be synced by whatever single-file atomic syncing mechanism is available.
- If the top-level document references an attachment object that is not available on the local device, it can simply be requested from the cloud by name (SHA1 hash). The GUI can either show "busy" while the download happens, or can make sure all the attachments are downloaded before activating the newly synced top-level document (e.g. before copying it from a tmp filename to the "proper" file name).
- Every time an attachment is "changed" it gets a new filename.
- Garbage (unreferenced) objects will accumulate, but it is trivial to purge them from time to time.
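
Roughly, the properties above could look like this in Python. This is only a sketch under assumptions: both sides are treated as plain local directories (a real cloud store would sit behind some API), all function names are hypothetical, and the caller is assumed to supply the set of referenced hashes, including any reached through nested XML attachments:

```python
import os
import shutil
from pathlib import Path


def sync_attachments(a: Path, b: Path) -> None:
    """Blindly copy every object that one side lacks from the other side."""
    a.mkdir(parents=True, exist_ok=True)
    b.mkdir(parents=True, exist_ok=True)
    names_a = {p.name for p in a.iterdir() if p.is_file()}
    names_b = {p.name for p in b.iterdir() if p.is_file()}
    # Files present on both sides share a name, hence share content: skip them.
    for name in names_a - names_b:
        shutil.copyfile(a / name, b / name)
    for name in names_b - names_a:
        shutil.copyfile(b / name, a / name)


def activate_when_complete(tmp_doc: Path, final_doc: Path,
                           attachments: Path, referenced: set) -> bool:
    """Rename the newly synced document into place only once all its attachments exist."""
    missing = [h for h in referenced if not (attachments / h).is_file()]
    if missing:
        return False  # caller would fetch the missing objects by name and retry
    os.replace(tmp_doc, final_doc)
    return True


def purge_unreferenced(attachments: Path, referenced: set) -> list:
    """Delete objects that no document refers to; return the removed names, sorted."""
    removed = []
    for path in attachments.iterdir():
        if path.is_file() and path.name not in referenced:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Note that sync_attachments never needs conflict resolution, and purge_unreferenced is the only place where anything is deleted. The purge should only run once the referenced set covers every document that shares the repository.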

A content-addressed scheme has other potential benefits in a cloud-storage system e.g.:
- A single cloud-based object repository could be shared across e.g. all Omni apps for a particular user. If the user has the same files attached to multiple documents, no extra storage space is used and syncing is faster. Imagine a photo that is originally snapped in the OmniFocus iPhone app, then sits in an outline for a while, then finds its way into a graffle document, then the graffle document gets placed in some higher level outline etc. The .JPG would only be stored and synced once.
- Peer-to-peer syncing is trivial. If a device needs an object that it does not have in its local repository, it can retrieve it from any peer, even an untrusted one: if the SHA1 of the received data matches the requested name, the object is good.
...
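
To illustrate the untrusted-peer point, here is a small Python sketch. A "peer" is modelled as any callable that takes a hash and returns bytes or None; that interface and the function name are assumptions made up for this example:

```python
import hashlib
from typing import Callable, Iterable, Optional

# Hypothetical peer interface: given an object name (SHA1 hex), return its bytes or None.
Peer = Callable[[str], Optional[bytes]]


def fetch_verified(digest: str, peers: Iterable[Peer]) -> Optional[bytes]:
    """Ask each peer in turn; accept the first reply whose SHA1 equals the requested name."""
    for peer in peers:
        data = peer(digest)
        if data is not None and hashlib.sha1(data).hexdigest() == digest:
            return data  # content matches its name, so the source does not matter
    return None  # no peer had a valid copy
```

A peer that returns tampered or truncated data is simply skipped. One caveat: SHA1 collisions have since been demonstrated in practice, so if peers may be actively malicious, swapping in hashlib.sha256 would be the safer choice. The rest of the code stays the same.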

Sam