Originally Posted by ueila
I thought perhaps if I changed the file extension to .ofo I could back them up and then rename them on the server. Renaming surprised me by showing the .ofocus data file as a folder containing 336 .xml files.
This is probably a little technical, but for the curious: OmniFocus uses compressed XML transaction files to store its data, with a SQL cache for efficient access. (Each time you update the application, we rebuild the SQL cache to ensure that it's consistent with the latest schema.)

Originally Posted by ueila
I seem to remember another GTD app recently switched to SQL for storing data and the explanation for the change was potential corruption problems with .xml.

This creates several questions in my mind. Is OmniFocus data storage .xml AND do Omni Group feel it is a reliable way to store data AND is this only a temporary system in place for the BETA?
XML is an extensible text file format. The only ways to corrupt XML are to write or read it incorrectly (most commonly by failing to escape values correctly when writing, or by misinterpreting those escaped values when reading) or to have the underlying disk storage go bad (which has nothing to do with the file format itself).
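To make the escaping point concrete, here is a minimal Python sketch. It is not OmniFocus code: the `task` and `title` element names and the helper functions are made up for illustration. It writes a value containing `&` and `<` through a real XML library, reads it back unchanged, and then shows that pasting the same value into a hand-built string produces a file a parser rejects.

```python
import xml.etree.ElementTree as ET

def write_task(title: str) -> str:
    """Serialize a (hypothetical) task; the library escapes &, < and > for us."""
    task = ET.Element("task")
    ET.SubElement(task, "title").text = title
    return ET.tostring(task, encoding="unicode")

def read_title(xml_text: str) -> str:
    """Parse the task back; the library un-escapes the value on the way in."""
    return ET.fromstring(xml_text).findtext("title")

title = "Buy milk & eggs <today>"

# Written correctly: the value survives the round trip untouched.
good = write_task(title)
round_tripped = read_title(good)

# Written incorrectly: the raw value is pasted in without escaping,
# so the result is not well-formed XML and the parser refuses it.
bad = f"<task><title>{title}</title></task>"
try:
    ET.fromstring(bad)
    bad_parsed = True
except ET.ParseError:
    bad_parsed = False
```

The failure in the second case comes from the code that wrote the file, not from XML itself.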

Structured text formats underpin much of the modern Internet (web pages, for example), and we use them as the primary document format for all our applications. Because they're text, technically oriented users can open them up and see what's changing and whether the values being written are correct. (XML can't be corrupt if you're writing and reading it correctly, but the data you're trying to store in it might still be wrong.) These formats are also designed to be very flexible, making them easy to extend as needs change, and they have the very nice property of being easy to read on all systems, so your data is never held hostage by a particular application.
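As a rough illustration of what "easy to extend" means in practice, the Python sketch below reads two versions of the same record. The element names (`title`, `repeat`, `color`) are invented for the example and are not the actual OmniFocus format. The reader asks for the elements it knows about, supplies a default for anything missing, and ignores anything it doesn't recognize, so old and new files both load.

```python
import xml.etree.ElementTree as ET

# A file written by an older version, and one written by a newer version
# that has since added <repeat> and <color> elements (all names hypothetical).
OLD_FILE = "<task><title>Call Bob</title></task>"
NEW_FILE = (
    "<task><title>Call Bob</title>"
    "<repeat>weekly</repeat><color>red</color></task>"
)

def read_task(xml_text: str) -> dict:
    """Read only the fields this version understands."""
    task = ET.fromstring(xml_text)
    return {
        "title": task.findtext("title", default=""),
        # Missing in old files: fall back to "no repeat" instead of failing.
        "repeat": task.findtext("repeat", default=None),
        # <color> is never asked for, so it is simply ignored.
    }

from_old = read_task(OLD_FILE)
from_new = read_task(NEW_FILE)
```

Neither file needs to be rewritten for the other version of the reader to cope with it.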

Before we got the XML format in place, we were using a CoreData SQL database ourselves, but that's a very bad idea for primary data storage in an evolving application (and what application doesn't evolve?). Migrating from SQL to an XML file format was our biggest priority before releasing the application to the pre-beta audience of users (as we mentioned on our blog), because we wanted to be sure we had a file format that would continue to work with future versions of OmniFocus without losing any data.

Apple's Mail application uses a CoreData SQL database exactly as we do: the important data is actually stored in extensible text files, the SQL database is just used for efficient access to the data in those files. This means that if something goes awry with the SQL database (or if you need to change your SQL structure), you can just throw out the SQL and rebuild it from the text files.
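For the curious, "throw out the SQL and rebuild it" can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the transaction layout, the `op="delete"` attribute, and the single `task` table are invented, and the real files are compressed and considerably richer. The idea is only that the cache is derived by replaying the text transactions in order.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Stand-ins for the transaction files, oldest first (format is hypothetical).
TRANSACTIONS = [
    '<transaction><task id="t1"><title>Buy milk</title></task></transaction>',
    '<transaction><task id="t2"><title>Call Bob</title></task></transaction>',
    '<transaction><task id="t1"><title>Buy oat milk</title></task></transaction>',
    '<transaction><task id="t2" op="delete"/></transaction>',
]

def rebuild_cache(transactions):
    """Create a fresh SQL cache and replay every transaction into it."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE task (id TEXT PRIMARY KEY, title TEXT)")
    for xml_text in transactions:
        for task in ET.fromstring(xml_text).iter("task"):
            if task.get("op") == "delete":
                db.execute("DELETE FROM task WHERE id = ?", (task.get("id"),))
            else:
                # Later transactions overwrite earlier values for the same id.
                db.execute(
                    "INSERT OR REPLACE INTO task (id, title) VALUES (?, ?)",
                    (task.get("id"), task.findtext("title")),
                )
    db.commit()
    return db

cache = rebuild_cache(TRANSACTIONS)
rows = cache.execute("SELECT id, title FROM task ORDER BY id").fetchall()
```

Because the cache holds nothing that isn't in the text files, deleting it loses no data; it only costs the time to replay.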

Why is this a big deal? I apologize if I'm getting too technical, but all SQL databases have a fixed schema structure (SQL stands for "Structured Query Language"). As long as you keep the same structure you're fine, but as soon as you need to change that structure (e.g. to add support for repeating tasks) you're faced with a data migration problem, where you need to read all of the data in the old format and write it out in the new format. This isn't so bad when you control the database and the software and can make sure you migrate the database to the new format immediately as you update the software, but is very awkward when you don't know how old a database might be. (You end up having to write code to migrate from every possible earlier version of the database to the latest one, and all of that code has the potential for bugs that corrupt the data—possibly permanently, unless you're very careful about backups.)
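Here is a small Python sketch of the migration chain described above, using SQLite's `user_version` pragma to record which schema a database file is at. The table, the column names, and the two migration steps are hypothetical, not our actual schema. Every past version needs its own step, and a database of any age has to be walked through all the steps it has missed.

```python
import sqlite3

# One function per schema change ever shipped (both steps are made up).
def v0_to_v1(db):
    db.execute("ALTER TABLE task ADD COLUMN due TEXT")

def v1_to_v2(db):
    # e.g. adding support for repeating tasks
    db.execute("ALTER TABLE task ADD COLUMN repeat_rule TEXT")

MIGRATIONS = [v0_to_v1, v1_to_v2]  # index = version being migrated from

def migrate(db):
    """Apply every step the database has not seen yet; return the new version."""
    version = db.execute("PRAGMA user_version").fetchone()[0]
    for step in MIGRATIONS[version:]:
        step(db)
        version += 1
        db.execute(f"PRAGMA user_version = {version}")
    db.commit()
    return version

# A database created by the oldest version of the software.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE task (id TEXT PRIMARY KEY, title TEXT)")
old.execute("INSERT INTO task VALUES ('t1', 'Buy milk')")

final_version = migrate(old)
columns = [row[1] for row in old.execute("PRAGMA table_info(task)")]
```

With two steps this is manageable. The list only ever grows, though, and a bug in any one step can damage data that later steps then carry forward.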

(Sorry if this got too long-winded. For most of our first decade, Omni's business was based on building custom SQL database systems for large corporations (William Morris, AT&T Wireless, Standard & Poor's, etc.). During those years we also did quite a bit of work with extensible text formats, building one of the first web browsers in 1994 (the web is based around an extensible text format, or we would have a lot of trouble viewing those 1994 web pages today) and updating Lighthouse Design's applications to use extensible text formats rather than proprietary binary formats—so we have quite a bit of expertise in this area.)

Last edited by Ken Case; 2007-05-20 at 10:15 AM. Reason: Fixed some punctuation, added some links