Why should it be so difficult? You only have to rewrite all tags that point to resources used in the page so that they reference the relative path of the saved file on the hard disk, and, to make sure outgoing links still work, rewrite all links that point away from the page from relative to absolute. That's all, or am I missing something?
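Roughly, that rewriting step could look like the sketch below. This is just an illustration, not anyone's actual implementation: it assumes BeautifulSoup is available for parsing, the function name localize_page is made up, and the "files/" folder naming is only one possible convention for where the saved resources end up.

```python
# A minimal sketch of the two rewriting passes described above, assuming the
# page's HTML and its original URL are already in hand. Hypothetical helper;
# the "files/" naming scheme is an assumption, not a standard.
from urllib.parse import urljoin, urlparse
from bs4 import BeautifulSoup
import os

def localize_page(html, page_url):
    soup = BeautifulSoup(html, "html.parser")

    # Pass 1: tags that point to stuff used *in* the page get rewritten to a
    # relative path on disk (assumed to live in a "files/" folder next to it).
    for tag, attr in (("img", "src"), ("script", "src"), ("link", "href")):
        for node in soup.find_all(tag):
            if node.get(attr):
                absolute = urljoin(page_url, node[attr])
                filename = os.path.basename(urlparse(absolute).path) or "index"
                node[attr] = "files/" + filename

    # Pass 2: links that point *away* from the page go from relative to
    # absolute so they keep working once the page lives on the hard disk.
    for node in soup.find_all("a"):
        if node.get("href"):
            node["href"] = urljoin(page_url, node["href"])

    return str(soup)
```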

The only difficulties I can imagine are deciding what to call the HTML file if the source was dynamically generated, links breaking because they contained session information, and Flash or other content that runs in the browser and might load further resources once it starts.

But these are problems relevant to the generation of .webarchives, too.

Remember, we're talking about a single web page here, not about some part of the file tree of an entire website.

wget (the command-line web-sucking tool) has been able to do this since I first used it, which was more than five years ago.
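For reference, the relevant wget options are --page-requisites (fetch the images, CSS, and scripts the page uses), --convert-links (rewrite the references so the saved copy works offline), and --adjust-extension (give dynamically generated pages an .html name). A quick sketch of calling it from Python, purely for illustration (the save_page wrapper is made up, the flags are real wget flags):

```python
# Rough sketch: shelling out to wget to save a single page plus its requisites.
import subprocess

def save_page(url, dest_dir="."):
    subprocess.run(
        ["wget", "--page-requisites", "--convert-links",
         "--adjust-extension", "--directory-prefix", dest_dir, url],
        check=True,  # raise if wget exits with an error
    )
```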