The Omni Group
Saving complete web pages (with images) as files, not archive
Please add an option to save the complete page (with images) as separate files (an HTML file plus images etc., perhaps in a subfolder), as Firefox, Opera, and others do by adjusting the HTML code that points to the images etc. No PDF, no web archive, just something that's compatible with every browser.

See the thread http://forums.omnigroup.com/showthread.php?t=2275.

Tim B.
 
I just tried it in Firefox, and it doesn't work. If the page is very simple, it probably works. I rarely speak in absolutes, but I would say it's impossible for such a feature to be reliable.
 
Why should it be so difficult? You only have to recode all tags that point to stuff used in the page so that they reflect the relative path to the file on the hd. And, in order to be sure that links will still work, recode all links that point away from the page from relative to absolute. That's all—or am I missing something?
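
For what it's worth, here is a minimal sketch of that recoding step in Python, using only the standard library. The names `PageRewriter` and `rewrite_page` and the `page_files` folder are made up for illustration. It only looks at `img`, `script`, `link`, and `a` tags, treats every `link` as a resource, and doesn't download anything or touch CSS, Flash, or URLs built by scripts.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath

# Attributes that pull a resource into the page (to be saved locally).
RESOURCE_ATTRS = {("img", "src"), ("script", "src"), ("link", "href")}
# Attributes that navigate away from the page (to be made absolute).
LINK_ATTRS = {("a", "href")}


class PageRewriter(HTMLParser):
    """Re-emit the HTML with resource URLs made local and links made absolute."""

    def __init__(self, page_url, files_dir):
        super().__init__(convert_charrefs=False)
        self.page_url = page_url
        self.files_dir = files_dir
        self.out = []
        self.resources = {}  # absolute URL -> local relative path

    def _local_path(self, absolute_url):
        # Number the files so two resources with the same name don't collide.
        if absolute_url not in self.resources:
            name = posixpath.basename(urlparse(absolute_url).path) or "resource"
            index = len(self.resources)
            self.resources[absolute_url] = f"{self.files_dir}/{index}_{name}"
        return self.resources[absolute_url]

    def _emit(self, tag, attrs, closing):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # attribute without a value, e.g. "disabled"
                parts.append(name)
                continue
            absolute = urljoin(self.page_url, value)
            if (tag, name) in RESOURCE_ATTRS:
                value = self._local_path(absolute)
            elif (tag, name) in LINK_ATTRS:
                value = absolute
            parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_page(html_text, page_url, files_dir="page_files"):
    """Return (rewritten HTML, {absolute resource URL: local path})."""
    parser = PageRewriter(page_url, files_dir)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out), parser.resources
```

Calling `rewrite_page(source, url)` gives back the adjusted HTML plus a map of what would still have to be fetched into the subfolder. Everything that loads files at run time is outside what this sketch can see.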

The only difficulties I can imagine are to decide what to call the html file if the source was dynamically generated, links might not work anymore if they contained session information, and problems with flash or other content that will run in the browser that might load other stuff when it's started.
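
The naming question, at least, seems manageable. A rough sketch in Python follows; `filename_for` is a hypothetical helper, and the rule it uses (page title if there is one, otherwise host, path, and query string) is only one possible convention:

```python
import re
from urllib.parse import urlparse


def filename_for(url, title=None):
    """Derive a filesystem-safe .html name from a page title or its URL."""
    parsed = urlparse(url)
    if title:
        base = title
    else:
        # Keep the query string so showthread.php?t=1 and ?t=2 differ.
        base = parsed.netloc + parsed.path
        if parsed.query:
            base += "_" + parsed.query
    # Collapse anything other than letters, digits, dots, and hyphens.
    safe = re.sub(r"[^A-Za-z0-9.-]+", "_", base).strip("_.") or "page"
    if not safe.lower().endswith((".html", ".htm")):
        safe += ".html"
    return safe
```

This does nothing about session information in links, which is a separate problem.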

But these are problems relevant to the generation of .webarchives, too.

Remember, we're talking about a single web page here, not about some part of the file tree of an entire website.

wget (commandline websucking tool) has been able to do this since I used it for the first time, which was more than 5 years ago.
 
Quote:
Originally Posted by zottel
Why should it be so difficult? You only have to recode all tags that point to stuff used in the page so that they reflect the relative path to the file on the hd. And, in order to be sure that links will still work, recode all links that point away from the page from relative to absolute. That's all—or am I missing something?
That's all? Sure, but that's a lot to ask.

Quote:
The only difficulties I can imagine are to decide what to call the html file if the source was dynamically generated, links might not work anymore if they contained session information, and problems with flash or other content that will run in the browser that might load other stuff when it's started.
And a lot of sites, certainly most of the popular ones, will suffer from those problems. If the site has Flash that references or links to any files, you can pretty much bet it will break. A lot of JavaScript and CSS also broke in my tests saving from Firefox 2 and IE7, which typically left the page badly broken.

Quote:
But these are problems relevant to the generation of .webarchives, too.
In my tests, that's not the case. Sites that broke when saved as complete HTML from IE7 and FF2 did not break when saved as a webarchive.

Quote:
Remember, we're talking about a single web page here, not about some part of the file tree of an entire website.
I must be completely missing what you're trying to say with that. Saving the source would be talking about a single web page, but saving it as a "complete" page is most certainly trying to save a part of the file tree of an entire site.

Quote:
wget (commandline websucking tool) has been able to do this since I used it for the first time, which was more than 5 years ago.
I haven't used that, but I would be seriously surprised if it didn't suffer from the same issues that IE7 and FF2 do.
 
I can't say that I had many problems with saving pages that way (but, well, this has been on Windows for some years now). There were occasional glitches (very rare), but one browser or the other would always save that specific page completely. The only thing slightly damaged might have been the layout of the page, but I can live with that. I religiously keep my notes in plain-text files and my huge archive of web pages in highly compatible single HTML files together with their adjacent files.

And cf. archives: There's not so much difference between a folder structure and the internal structure of an archive, or am I wrong with that?

TB
 
This has probably worked well in the past, but as sites adopt newer techniques, it's going to become less and less reliable.
 
Quote:
Originally Posted by timb
And cf. archives: There's not so much difference between a folder structure and the internal structure of an archive, or am I wrong with that?
As I said in the posting above, I guess that .webarchives are in fact some representation of the internal model of the browser. That means that when a .webarchive is loaded, the browser will be put into exactly the same state it was in when you were actually viewing the page. This way, several problems can be avoided. Above all, the browser is practically in the same server directory. So all relative references, be they in images, links, scripts, or Flash animations, will still point to the correct destination without changing anything. Additionally, any dynamic content, even if it's ajaxly dynamic, ;-) will still have just the same representation as it had when you were actually viewing the page. It would be extremely difficult, if not impossible, to get this by translating that stuff to actual files and still be able to interact with it when you view it again (like moving a map on maps.google.com).

Edit: Interactivity will also be broken with .webarchives, if the page has changed meanwhile, of course. If Google decides to use some other JavaScript model for moving maps, your old .webarchive will still show the same as before, but you won't be able to move the map anymore.
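
To illustrate the "same server directory" point: a relative reference resolves against whatever base URL the browser believes the page has. The small Python sketch below uses made-up URLs (`ORIGINAL_URL`, `LOCAL_URL`) and a hypothetical helper `resolve_all`. It says nothing about how .webarchives are actually stored; it only shows why keeping the original base leaves references intact while a plain file on disk does not.

```python
from urllib.parse import urljoin

ORIGINAL_URL = "http://example.com/maps/view.html"  # hypothetical page
LOCAL_URL = "file:///Users/me/Saved/view.html"      # hypothetical saved copy


def resolve_all(base_url, references):
    """Resolve each reference the way a browser would for this base URL."""
    return {ref: urljoin(base_url, ref) for ref in references}


# Typical references found in a page: an image, a script, a stylesheet.
refs = ["tiles/0.png", "../js/map.js", "/css/site.css"]

with_original_base = resolve_all(ORIGINAL_URL, refs)  # still on the server
with_local_base = resolve_all(LOCAL_URL, refs)        # somewhere on the disk

for ref in refs:
    print(ref, "->", with_original_base[ref], "|", with_local_base[ref])
```

With the original base, every reference still lands on the server. With the local base, each one has to be rewritten or the file has to exist at exactly that place on disk.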

Last edited by zottel; 2006-11-30 at 05:07 PM.
 
 




All times are GMT -8.

