The Omni Group Forums

Finding file links that won't resolve (http://forums.omnigroup.com/showthread.php?t=22908)

mlevin777 2011-12-15 05:09 PM

Finding file links that won't resolve
 
Can anyone make a script to identify those actions which contain attachment links to which the file can't be found? I need to scan my large database and (periodically, since this keeps happening) find those links which are "dead" now.

here's hoping!

thanks,

Mike

whpalmer4 2011-12-15 08:22 PM

Mike,

I don't quite understand how the links are broken, and what distinguishes a broken link from a working one. OF stores an attachment link as a "file://path" URL, and it sounds from your other posts like the file is still there, but when OF hands off the URL to be opened, an error occurs.

Complicating matters a bit for the would-be script author is that there doesn't appear to be any way to see which notes have attachments, much less get at them, except by looking directly at the database files. Assuming you can generate that list, and test them all to see if they can be opened, what happens when you find one that fails?

To me, this smells like a problem with Launch Services. You might try rebuilding the Launch Services database as described here (http://www.thexlab.com/faqs/resetlaunchservices.html) the next time you see this (or now, and see if you ever see it again).

mlevin777 2011-12-16 12:24 AM

> I don't quite understand how the links are broken, and what distinguishes
> a broken link from a working one.

I don't know how they get broken either, but an OF Ninja confirmed the problem... What distinguishes it is that a working one, when I click on it, opens the file (for me, usually a PDF file living in my Dropbox folder and opening in Adobe Acrobat). A broken link fails in one of two ways: either it says "original cannot be found" and asks me to locate it, or it says the same thing but the file selection dialog that comes up already has the right folder open and the right file highlighted (although I still have to click on it before it will open). So I think there are two levels of broken links somehow.

> OF stores an attachment link as a "file://path" URL, and it sounds from
> your other posts like the file is still there, but when OF hands off the URL
> to be opened, an error occurs. Complicating matters a bit for the
> would-be script author is that there doesn't appear to be any way to see
> which notes have attachments, much less get at them, except by looking
> directly at the database files. Assuming you can generate that list,

I can - an OF Ninja actually sent me a 1-line zsh code that scans the database and generates a list of all links. I guess I could just use zsh to test for the presence of each one. But they are all there, so it wouldn't help - what I need is a script from within OF to tell me which files it won't be able to open even though the path is right and the file exists...

> and test them all to see if they can be opened, what happens when you
> find one that fails?

I'll go and re-link it by hand, so that I don't get the nasty surprise of not being able to open it when I really need it (on the road).

> To me, this smells like a problem with Launch Services. You might try
> rebuilding the Launch Services database as described here (http://www.thexlab.com/faqs/resetlaunchservices.html)
> the next time you see this (or now, and see if you ever see it again).

ok I'll try (I will see it again - it's perfectly reproducible).

thanks

Mike

whpalmer4 2011-12-16 10:49 AM

[QUOTE=mlevin777;105199]> I don't quite understand how the links are broken, and what distinguishes
> a broken link from a working one.

I don't know how they get broken either but an OF Ninja confirmed the problem... What distinguishes it is that a working one, when I click on it, opens the file (for me, usually a PDF file living in my Dropbox folder and opening in Adobe Acrobat when I click on the attachment). A broken link works in one of 2 ways: either it says "original cannot be found" and asks me to locate it, or it says the same thing but then the file selection menu that comes up actually has the right folder opened and the right file highlighted (although I still have to click on it before it will open). So I think there are 2 levels of broken links somehow.
[/quote]
Right, what I meant by that was what is the essential difference between a link that points at the right file, and works, and a link that points at the right file, but doesn't? When you go and locate the file, what is being done to fix it so it works the next time?

Can you post the zsh snippet? Perhaps I've got some experimental data of my own to work with, and just haven't noticed :-)

[quote]> OF stores an attachment link as a "file://path" URL, and it sounds from
> your other posts like the file is still there, but when OF hands off the URL
> to be opened, an error occurs. Complicating matters a bit for the
> would-be script author is that there doesn't appear to be any way to see
> which notes have attachments, much less get at them, except by looking
> directly at the database files. Assuming you can generate that list,

I can - an OF Ninja actually sent me a 1-line zsh code that scans the database and generates a list of all links. I guess I could just use zsh to test for the presence of each one. But they are all there, so it wouldn't help - what I need is a script from within OF to tell me which files it won't be able to open even though the path is right and the file exists...
[/quote]
The experiment I want to try is to use the open command in the Terminal with the file:// URL stashed in OF on one of the "bad" files. My suspicion is that it will fail. What I can't explain is what locating the file does to make it not fail the next time.
[quote]
> and test them all to see if they can be opened, what happens when you
> find one that fails?

I'll go and re-link it by hand, so that I don't get the nasty surprise of not being able to open it when I really need it (on the road).

> To me, this smells like a problem with Launch Services. You might try
> rebuilding the Launch Services database as described here (http://www.thexlab.com/faqs/resetlaunchservices.html)
> the next time you see this (or now, and see if you ever see it again).

ok I'll try (I will see it again - it's perfectly reproducible).
[/QUOTE]
Maybe, maybe not :-)

Ken Case 2011-12-16 12:25 PM

[QUOTE=mlevin777;105199]I can - an OF Ninja actually sent me a 1-line zsh code that scans the database and generates a list of all links. I guess I could just use zsh to test for the presence of each one.[/QUOTE]

Here's a zsh one-liner which tests to see which of your linked OmniFocus attachments still exist at the same path:

[CODE]for file in "$HOME/Library/Application Support/OmniFocus/OmniFocus.ofocus"/*.zip; do; unzip -p $file contents.xml | xmllint -format - | grep 'cell href='; done | sed 's/.*<cell href="//' | sed 's/".*//' | while read url; do; if curl -s -I $url > /dev/null; then; echo Found $url; else; echo Lost $url; fi; done[/CODE]

If you want to just see the lost URLs, you could do this:

[CODE]for file in "$HOME/Library/Application Support/OmniFocus/OmniFocus.ofocus"/*.zip; do; unzip -p $file contents.xml | xmllint -format - | grep 'cell href='; done | sed 's/.*<cell href="//' | sed 's/".*//' | while read url; do; if ! curl -s -I $url > /dev/null; then; echo $url; fi; done[/CODE]

Hope this helps!

RobTrew 2011-12-16 03:28 PM

That's a brilliant piece of work.

(And invaluable - I found some missing links as well ... )

mlevin777 2011-12-16 05:14 PM

[QUOTE=Ken Case;105215]Here's a zsh one-liner which tests to see which of your linked OmniFocus attachments still exist at the same path:

[CODE]for file in "$HOME/Library/Application Support/OmniFocus/OmniFocus.ofocus"/*.zip; do; unzip -p $file contents.xml | xmllint -format - | grep 'cell href='; done | sed 's/.*<cell href="//' | sed 's/".*//' | while read url; do; if curl -s -I $url > /dev/null; then; echo Found $url; else; echo Lost $url; fi; done[/CODE]

If you want to just see the lost URLs, you could do this:

[CODE]for file in "$HOME/Library/Application Support/OmniFocus/OmniFocus.ofocus"/*.zip; do; unzip -p $file contents.xml | xmllint -format - | grep 'cell href='; done | sed 's/.*<cell href="//' | sed 's/".*//' | while read url; do; if ! curl -s -I $url > /dev/null; then; echo $url; fi; done[/CODE]

Hope this helps![/QUOTE]

hmmm. The first one gives me

Lost file://localhost/Users/mlevin/Science/Work%20stuff/Research%20Projects/GOF%20phenotypes%20and%20Vmem%20transduction/Muscimol-gated%20(GABA-Cl)%20stuff/
Lost file://localhost/Users/mlevin/Science/Work%20stuff/Research%20Projects/GOF%20phenotypes%20and%20Vmem%20transduction/Muscimol-gated%20(GABA-Cl)%20stuff/
Lost <lit>[for file in
Lost file://localhost/Users/mlevin/Desktop/TO%20DO/Key%20Levin%20Lab%20papers%20to%20read/
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit> for file in
Lost <lit> for file in
Lost <lit> for file in
Lost <lit> for file in
Lost <lit> for file in
Lost <lit> for file in
Lost <lit> for file in

so there's something weird at the end. Also the second one gives me the same thing. Is there any easy way to convert the file URLs to a pathname (get rid of the %20 etc.) so I can check easily if the file actually exists?

thanks

Mike

mlevin777 2011-12-16 05:17 PM

> Right, what I meant by that was what is the
> essential difference between a link that points at the right file, and works,
> and a link that points at the right file, but doesn't? When you go and
> locate the file, what is being done to fix it so it works the next time?

good question! I just hit "enter" (to select the file it already chose).

> Can you post the zsh snippet? Perhaps I've got some experimental data
> of my own to work with, and just haven't noticed :-)

for file in "$HOME/Library/Application Support/OmniFocus/OmniFocus.ofocus"/*.zip; do; unzip -p $file contents.xml | xmllint -format - | grep 'cell href='; done | sed 's/.*<cell href="//' | sed 's/".*//'

> The experiment I want to try is to use the open command in the Terminal
> with the file:// URL stashed in OF on one of the "bad" files. My suspicion
> is that it will fail. What I can't explain is what locating the file does to
> make it not fail the next time.

yeah, it's very weird - the file URL looks to me like it should definitely work...

whpalmer4 2011-12-16 06:27 PM

[QUOTE=mlevin777;105220]Is there any easy way to convert the file URLs to a pathname (get rid of the %20 etc.) so I can check easily if the file actually exists?
[/QUOTE]
Well, in the Terminal, if you just

open <URL>

it will either successfully open the file, or generate an error message which will have the pathname in the format you expect, as seen here:
[code]

open file://localhost/Volumes/Macintosh%20HD/Users/oftest/Desktop/OmniGraffle%20tutorials/part2.mov
The file /Volumes/Macintosh HD/Users/oftest/Desktop/OmniGraffle tutorials/part2.mov does not exist.

[/code]

While playing around with some of my dangling links (many caused by linking to files on the desktop, which subsequently got put away), I noticed that after I fixed the first one, the "locate" file browser would helpfully select the right file. Further investigation shows it is just caching the last directory it used and searching for a match there.
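
On the question of turning a file:// URL into a plain pathname: here is a minimal sketch, not taken from the thread, that does the decoding in the shell and tests the result directly. The function names url_to_path and check_url are made up for illustration, and the code assumes the URLs look like the file://localhost/... examples above. Testing with [ -e ] instead of open avoids launching every attachment, and it also works when a link points at a folder.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread).

# url_to_path URL: strip "file://" and an optional "localhost", then
# percent-decode the rest (%20 becomes a space, and so on).
url_to_path() {
    rest=${1#file://}
    rest=${rest#localhost}
    out=
    # Loop for as long as a "%" is left in the unprocessed part.
    while [ "$rest" != "${rest#*%}" ]; do
        out=$out${rest%%\%*}    # copy the text before the first %
        rest=${rest#*%}         # keep what follows that %
        case $rest in
            [0-9A-Fa-f][0-9A-Fa-f]*)
                hex=${rest%"${rest#??}"}    # the two hex digits
                # Convert hex to octal, then let printf emit that byte.
                # Limitation: an encoded newline (%0A) is dropped here.
                out=$out$(printf "\\$(printf '%03o' "0x$hex")")
                rest=${rest#??}
                ;;
            *)
                out=$out%    # a stray %, keep it as it is
                ;;
        esac
    done
    printf '%s\n' "$out$rest"
}

# check_url URL: report whether the decoded path exists on disk.
check_url() {
    target=$(url_to_path "$1")
    if [ -e "$target" ]; then
        echo "Found $target"
    else
        echo "Lost $target"
    fi
}

# Example with the URL from the post above.
check_url 'file://localhost/Volumes/Macintosh%20HD/Users/oftest/Desktop/OmniGraffle%20tutorials/part2.mov'
```

To run it over the whole database, the link list from the zsh snippet earlier in the thread could be piped into `while read -r url; do check_url "$url"; done`. Note that this only tests whether the path exists; as discussed above, OmniFocus may still refuse to open a file that is present.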

Ken Case 2011-12-18 03:01 PM

1 Attachment(s)
[QUOTE=mlevin777;105220]hmmm. The first one gives me

Lost file://localhost/Users/mlevin/Science/Work%20stuff/Research%20Projects/GOF%20phenotypes%20and%20Vmem%20transduction/Muscimol-gated%20(GABA-Cl)%20stuff/
Lost file://localhost/Users/mlevin/Science/Work%20stuff/Research%20Projects/GOF%20phenotypes%20and%20Vmem%20transduction/Muscimol-gated%20(GABA-Cl)%20stuff/
Lost <lit>[for file in
Lost file://localhost/Users/mlevin/Desktop/TO%20DO/Key%20Levin%20Lab%20papers%20to%20read/
Lost <lit>[for file in
Lost <lit>[for file in
Lost <lit>[for file in

[/QUOTE]

Hmm… maybe there was some problem with the copy and paste of the line in my earlier post? Try this zip archive of the script instead. (This version of the script is split across multiple lines for readability, and also has a small improvement: it sorts and "uniques" the referenced files, so links to the same file in multiple transactions are only reported once rather than every time they occur.)
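
The attached archive is not reproduced in this thread. As a rough sketch of what a multi-line version with the sort-and-unique step could look like (an assumption built from the one-liner posted earlier, not the attached script itself), with hypothetical function names extract_links and list_lost_links:

```shell
#!/bin/sh
# Sketch only: NOT the script from the attachment. It is the earlier
# one-liner spread over several lines, with a sort -u step added.

# Read pretty-printed contents.xml on stdin; print one href per line.
# The final grep is an assumption: it keeps only file:// URLs and drops
# any other line that happened to contain the text "cell href=".
extract_links() {
    grep 'cell href=' |
        sed -e 's/.*<cell href="//' -e 's/".*//' |
        grep '^file://'
}

# list_lost_links DATABASE: print each linked URL that curl cannot read.
list_lost_links() {
    # sort -u reports each URL once, however many transactions mention it.
    for zip in "$1"/*.zip; do
        [ -f "$zip" ] || continue    # the glob matched nothing
        unzip -p "$zip" contents.xml | xmllint --format - | extract_links
    done | sort -u | while read -r url; do
        # Same test as the one-liner: curl fails if it cannot read the URL.
        curl -s -I "$url" > /dev/null || echo "$url"
    done
}
```

It would be called as `list_lost_links "$HOME/Library/Application Support/OmniFocus/OmniFocus.ofocus"` and needs unzip, xmllint and curl, like the original. The file:// filter would also discard stray lines such as the "<lit>" entries in the output quoted above.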


All times are GMT -8.