

Hi :)
Wow! I was about to point out that Gary is on Windows, and therefore has a
very limited command line, but then I found that "wget" IS available for
Windows!!

Following the gnu.org link, I found that possibly the best page is:
http://gnuwin32.sourceforge.net/packages/wget.htm
because it has some background information. The other one:
https://eternallybored.org/misc/wget/
is very bare-bones but does give the ".exe" files for various systems as
zipped files or, in one case, as the uncompressed exe itself.

From the first link, the "Install Instructions" are at:
http://gnuwin32.sourceforge.net/install.html

With an FAQ at:
http://wget.addictivecode.org/FrequentlyAskedQuestions?
(at a guess the "?" stops Internet Explorer from being able to get to that
page!)

The full guide/manual is at:
https://www.gnu.org/software/wget/manual/
The heading on that page refers to "The GNU Operating System".
<rant>
Most people refer to the OS as Linux, but that page calls it GNU/Linux and
describes how it is GNU and Linux together that form the whole OS (although
yes, the GNU part is MUCH the larger part of the whole OS), so I tend to
refer to it as Gnu&Linux. Really, I think the GNU people should wake up to
the fact that they have already lost this naming fight (only about 20-30
years ago!) and that, although it is desperately unfair, it just causes
confusion to mention GNU this late in the game.
</rant>
Anyway, ignore the heading because if you look closely you'll see the
guide/manual is just for the Wget command, not 'the' entire OS!

Regards from
Tom :)




On 20 October 2015 at 20:41, <libreoffice-ml.mbourne@spamgourmet.com> wrote:

Gary Collins wrote:

I've just tried this in firefox. A couple of things:

1) access to the file save option seems to be indirect - but maybe there
is a way to customise this, I haven't got time to check at the moment; but

2) much more importantly - it does indeed save as a single file, but the
formatting is *awful*, nothing like the original page, and pictures,
graphics, etc. are not there, just links. That's not what I want to see
when I open a webpage file; I want to see the page (more or less) the same
as it was originally when I opened it online.


I think Firefox is similar, but in SeaMonkey (based on Firefox) under
File > Save As there are a couple of options:

- "Web Page, complete" saves referenced files (images, stylesheets, etc.) in
a folder alongside the HTML file and changes the references in the HTML
file to refer to the saved copies. It still sometimes misses some; I guess
that if the references are generated dynamically by a script, it can't
reliably predict what might be needed.

- "Web Page, HTML only" just saves the HTML file without all the other
resources. In that case, I think the references are left as they were
originally, so you'll see them if you have Internet access and they're
still available at the same URLs (or your browser has cached them). If not,
you won't get the images, stylesheets, etc.
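
To illustrate why a "complete" save can miss script-generated references,
here is a minimal Python sketch of the general idea of scanning a page for
the files it refers to. It is not how SeaMonkey or Firefox actually do it;
the names ResourceCollector and find_resources and the example.com URLs
are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect the URLs of resources that a page references statically."""

    # (tag, attribute) pairs assumed to point at files the page needs.
    # Note: <link href> also matches non-stylesheet links; this is a sketch.
    RESOURCE_ATTRS = {("img", "src"), ("script", "src"), ("link", "href")}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.RESOURCE_ATTRS:
                # Resolve relative references against the page's own URL.
                self.resources.append(urljoin(self.base_url, value))


def find_resources(html, base_url):
    """Return the resource URLs found in the HTML text, in document order."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.resources


page = """
<html><head><link rel="stylesheet" href="style.css"></head>
<body><img src="/images/logo.png">
<script>document.write('<img src="late.png">');</script>
</body></html>
"""

for url in find_resources(page, "http://example.com/docs/index.html"):
    print(url)
```

The stylesheet and the logo are found, but "late.png" is not: it only
exists as text inside a script, so a scan of the saved HTML never sees it
as a reference.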

Which raises another issue: I save pages for later viewing on a machine
that doesn't have an internet connection. I'm assuming that an internet
connection will be essential to follow any of the hyperlinks in the file,
which would render the format useless for my purposes.


You might want to look at wget:
  https://www.gnu.org/software/wget/
It's a command-line utility which can not only download web pages along
with referenced resources, just as the "complete" option above does, but
also recursively follow links - so you would be able to follow them
offline. I believe it mirrors the structure of resources from the server,
so for example if the same image is used on every page you only download it
once.

I guess it would still suffer the same limitations with dynamically
generated references.
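
As a rough sketch of what such a wget call might look like, the Python
below just assembles a command line and hands it to wget if it is
installed. The options are taken from the wget manual rather than from
anything tested in this thread, and the function names, the depth of 2 and
the example.com URL are placeholders.

```python
import shutil
import subprocess


def build_wget_command(url, depth=2):
    """Build a wget command line for saving pages for offline viewing."""
    return [
        "wget",
        "--recursive",        # follow links found in the downloaded pages
        f"--level={depth}",   # limit how many links deep to follow
        "--page-requisites",  # also fetch images, stylesheets, etc.
        "--convert-links",    # rewrite links to point at the local copies
        "--no-parent",        # do not climb above the starting directory
        url,
    ]


def mirror(url, depth=2):
    """Run wget with the options above; fails if wget is not on the PATH."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget was not found on the PATH")
    subprocess.run(build_wget_command(url, depth), check=True)


# Show the command without running it.
print(" ".join(build_wget_command("http://example.com/docs/")))
```

Calling mirror("http://example.com/docs/") would then actually run the
download in the current folder. Keeping the depth small avoids pulling
down far more of a site than intended.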

Mark.



--
To unsubscribe e-mail to: users+unsubscribe@global.libreoffice.org
Problems?
http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be
deleted


