The Omni Group

 
Saving complete web pages (with images) as files, not archive
... which brings me to another question:

Does anyone know more about .webarchives and how that format is actually defined? If it's really a representation of the internal browser model—will it work with future versions, where this model might change?
 
I gotcha. I did some searching for more info on Webarchives, and I did find one app that will extract files from a webarchive. Not sure how well it works. http://www.macupdate.com/info.php/id/20643
 
A webarchive is a serialized form of the record of responses used to create a webpage.

Basically, as each resource is requested (via an image tag, a subframe, JavaScript, or even a Flash plugin request) and the response comes back from the server, the request-response pair is stored in an object that can be serialized as a data file (the webarchive). Then, when the webarchive is loaded and the page's resources reload, any request that matches a stored one is answered with data from the archive instead of going to the server.

It doesn't actually store all the state of JavaScript or plugins (hard to do in the first case, and not part of the API for the second).
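
To make the "stored request-response pairs" idea concrete, here is a rough Python sketch that lists what is inside a webarchive. It assumes the layout commonly seen in Safari/WebKit-produced files: a property list with a `WebMainResource` entry and a `WebSubresources` list, each resource carrying `WebResourceURL`, `WebResourceMIMEType`, and `WebResourceData`. The function names and the in-memory sample archive are made up for illustration, and subframe archives are ignored.

```python
import plistlib


def list_webarchive_resources(archive_bytes):
    """Return (url, mime_type, size_in_bytes) for each stored resource."""
    # plistlib detects binary vs. XML property lists on its own.
    archive = plistlib.loads(archive_bytes)
    # Main document first, then images, scripts, stylesheets, and so on.
    resources = [archive["WebMainResource"]]
    resources += list(archive.get("WebSubresources", []))
    return [
        (r["WebResourceURL"], r.get("WebResourceMIMEType", ""), len(r["WebResourceData"]))
        for r in resources
    ]


def build_sample_archive():
    """Hypothetical two-resource archive, so the sketch runs without a real file."""
    return plistlib.dumps(
        {
            "WebMainResource": {
                "WebResourceURL": "http://example.com/",
                "WebResourceMIMEType": "text/html",
                "WebResourceData": b"<html><img src='logo.png'></html>",
            },
            "WebSubresources": [
                {
                    "WebResourceURL": "http://example.com/logo.png",
                    "WebResourceMIMEType": "image/png",
                    "WebResourceData": b"\x89PNG",
                }
            ],
        },
        fmt=plistlib.FMT_BINARY,
    )


if __name__ == "__main__":
    for url, mime, size in list_webarchive_resources(build_sample_archive()):
        print(url, mime, size)
```

For a real file you would pass in the bytes read from disk, e.g. `open("page.webarchive", "rb").read()` (the filename is a placeholder). Since the data for each resource is right there, writing each one out as an individual file should be a small step from here.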
 
Quote:
Originally Posted by zottel
... which brings me to another question:

Does anyone know more about .webarchives and how that format is actually defined? If it's really a representation of the internal browser model—will it work with future versions, where this model might change?
Since all of WebKit is now open source, you can look at the code for yourself and see exactly how webarchives are defined and created--and since they are including the full history in the public repository, you should always be able to read or write any version of webarchive were they to change them in the future.
 
Well, I'm back and...

...not only did I cough up the 9,95 for the November-sale OmniWeb, even though I don't have a Mac that can run the latest version (I have a soft spot for this browser, dunno why)...
Quote:
Originally Posted by Forrest
I just tried it in Firefox; it doesn't work. If the page is very simple, I'm sure it probably works. I rarely speak in absolutes, but I would say it's impossible for such a feature to be reliable.
... I did also dust off my b/w G3 (running Jaguar, which is why I can't run OW 5.5) and saved more than half a dozen web pages (w/images) in Firefox (0.9!), transferred the folders and files to a Windoze machine and looked at them (while offline) in IE (6) and others. All but one (my Gmail inbox, I didn't seriously expect that to save correctly) showed up with the content intact. This included the Omnigroup homepage and the OmniWeb features page. The page layout sometimes wasn't reproduced like the original, but I don't care for that. And I know that Opera would have saved it even better.

The main reason for the request was cross-platform and future accessibility. It's the same reason why I prefer to keep my notes in plain text. Web archives are a joke. PDFs are something completely different from the page itself. I'm used to digging into the source code of pages I've saved and adding remarks or making other adjustments (I sometimes even run search-and-replace operations to correct errors; this is not about all-English pages, after all). I can't do that with PDFs. And if I try to save/print as an A4 PDF, more often than not the margins of the page will be cut off. I had actually started out trying to archive everything as PDF, but soon abandoned that approach.
So I want to uphold my request: Please add a feature that saves web pages as individual files together with their adjacent images etc.

T.
 
I know an easy way to do that, LOL:
just save the entire page as an image.
Using the system Print Screen key is not a good idea, since it only records what is on screen. I am using ACA Capture, which can also capture the part of the web page that is outside the screen.
But once the web page is saved as an image, it can't be split up anymore.
 
DanielSmith, in what way would that be better than saving as PDF?

AFAICT, saving as PDFs would do this just as well, but the major points in this thread were that:
  • web archives are proprietary to WebKit browsers and not cross-platform (there aren't any non-Mac WebKit browsers)
  • while PDFs are kind of cross-platform,
    • they don't preserve links in the pages
    • I like to edit the source code of saved pages (add comments or correct errors, even edit links to additional images etc.)
    • some web pages don't print well at all, and some PDF "printouts" have their margins (and some text) cut off, etc.
 
(timb, this won't help your needs, but I'm posting it as an FYI).

If you Save as PDF (hold Option down as you do a Save as...), rather than printing as PDF, the page should save with its formatting preserved. Note that this method generates a single-page PDF of the site, so content that would run past one sheet of paper in the print version is not split across pages. It is useful for sites that need a lot of vertical scrolling to view, if you don't want them to be split at inconvenient locations in the text. However, it isn't so useful if you do actually want a hard copy printout.

I'm hoping Apple will allow the links to remain live in their PDFs in the next version of OS X.
 
An ambitious (and knowledgeable) person could write an AppleScript to do that, provided that OmniWeb could list all of the resources on a web page like Firefox does. The script would then save all of those files into an archive (Apple zip file).
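
Just to illustrate the two steps (list the resources, then zip them up), here is a minimal sketch in Python rather than AppleScript. It does not talk to OmniWeb at all: it assumes you already have the page source and have fetched the resource data yourself. `ResourceLister`, `list_resources`, and `zip_page` are names invented for this example, and it only looks at `img`, `script`, and stylesheet `link` tags, so anything loaded from CSS or by scripts would be missed.

```python
import io
import zipfile
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceLister(HTMLParser):
    """Collect URLs of images, scripts, and stylesheets referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self._add(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            if "stylesheet" in (attrs.get("rel") or "").lower():
                self._add(attrs["href"])

    def _add(self, ref):
        # Resolve relative references against the page URL; skip duplicates.
        url = urljoin(self.base_url, ref)
        if url not in self.urls:
            self.urls.append(url)


def list_resources(html, base_url):
    """Return absolute URLs of the resources the page refers to, in page order."""
    parser = ResourceLister(base_url)
    parser.feed(html)
    parser.close()
    return parser.urls


def zip_page(html, files):
    """Pack the page source plus already-fetched resources (name -> bytes) into a zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("index.html", html)
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

A real tool would also have to download each listed URL and rewrite the references in the saved HTML to point at the local copies; both are left out here.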
 
 

