On Thu, 10 Apr 2014 20:21:06 -0500
Joe B <firstname.lastname@example.org> wrote:
set path=%path%;C:\Program Files (x86)\LibreOffice 4\program
for %f in (*.odg) do (
soffice.exe --headless --convert-to pdf --outdir "C:\tmp" %f)
Let me format that a bit better, tweak it slightly, and change it to
match your requirements (although you must still change "C:\tmp" to
your preferred location):
1) set path=%path%;C:\Program Files (x86)\LibreOffice 4\program
2) for %f in (*.htm) do soffice.exe --headless --convert-to pdf --outdir "C:\tmp" %f
Sorry if this wordwraps. It's meant to be only two lines. I have
numbered each line, but the numbers are not part of the commands.
I do not know what language this is.
This is what used to be DOS, and is now cmd.exe under Windows, which
provides a command line with much the same syntax as DOS. On *nix systems
this is called a shell script; in the Windows world it's a batch file,
although it is sometimes also referred to as shell scripting. See here for an
This is therefore not a language per se, but rather commands that the
command line interpreter understands.
So this could either be typed into the command line one line at a time,
pressing enter after each line, or this could all be put in a file,
called a batch file and given the ".bat" extension, and run as one
single item. Either way is almost exactly the same. A batch file is
simply a plain text file of DOS commands (I'm going to call them that
for simplicity's sake) that cmd.exe knows to execute by running each
line as if it were typed into the terminal directly. The only
difference in this instance is that if the commands are run line by line
from the command line, using "%f" as above is fine, but if they are in
a batch file instead, you need "%%f", so the second line becomes:
2) for %%f in (*.htm) do soffice.exe --headless --convert-to pdf
--outdir "C:\tmp" %%f
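Putting the two lines together, a complete batch file might look like
the sketch below. The name convert.bat is just an example I made up,
"C:\tmp" is still a placeholder, and the quotes around %%f are my own
addition so that file names containing spaces are passed to soffice as
a single argument.

```bat
@echo off
rem convert.bat (hypothetical name): convert every .htm file in the
rem current directory to PDF. Change "C:\tmp" to your preferred location.
set "path=%path%;C:\Program Files (x86)\LibreOffice 4\program"
for %%f in (*.htm) do soffice.exe --headless --convert-to pdf --outdir "C:\tmp" "%%f"
```

Run it from the folder that holds the .htm files.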
Here is my understanding of the code. The writer first adds the
LibreOffice directory to the "path" environment variable. Path is a
Windows operating system environment variable containing special
directories. These directories tell Windows where to look for
executable files. Thus, any executable file in a folder that is listed
in the "path" environment variable can be run at the command
prompt by simply typing its name, without having to specify exactly
where it is. For example, typing "soffice" (just the executable
file's name) at the command prompt, instead of "C:\Program Files
(x86)\LibreOffice 4\program\soffice.exe", will open soffice.exe. It makes using
the command line simpler and quicker. In that way, the writer can
simply start his code in the for loop with "soffice".
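If you want to check that the path change worked before running the
loop, something like this at the command prompt should do it. I'm
assuming the same install location as above. "where" is a standard
command on recent versions of Windows that searches the current
directory and the folders in "path" for a name and lists what it finds.

```bat
rem Add the LibreOffice program folder to the search path for this session
set "path=%path%;C:\Program Files (x86)\LibreOffice 4\program"
rem Ask Windows which file(s) it would run for the name "soffice"
where soffice
```

If "where" reports that it cannot find the file, the folder in the
"set path" line does not match where LibreOffice is actually installed.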
What I don't understand is, how does the batch file know where to
look for the input files? All that the batch file is given is an
iterator variable %f.
The batch file would be the whole thing, so the "batch file" as you put
it isn't given anything, it's just run. Line two can be broken up into
the following fragments:
1) for %f in (*.htm) do
2) soffice.exe --headless --convert-to pdf --outdir "C:\tmp" %f
The first fragment is a "for" command, which is a builtin DOS command,
i.e. cmd.exe knows what to do with it without needing to run a program.
Almost anything you type at the command line is either a builtin
command or the name of a program to run. See here for more info on the
The for command iterates over a list and executes a command for each
item in the list.
The for command consists of the keyword "for", a variable to hold each
iteration of the list, the "in" keyword, a list to iterate over in
brackets, and the "do" keyword, and then a command to execute for each
item in the list (the second fragment). Here the variable is "f", and to
tell DOS that it is a variable, we need to precede it with a percent
sign, or two if it is in a batch file.
The for loop will execute the second fragment for each item in the
list, and each time the "%f" will hold the next item in the list. In
this case the items in the list are all the files with a ".htm"
extension, so each time the second fragment is run, the "%f" will hold
the name of the next file with an ".htm" extension.
The list can be given as, say, a simple list, like so
"(file1.htm file2.htm file3.htm)"
but that would mean typing out all the filenames. By using a wildcard
character (see: http://www.ahuka.com/?page_id=31) DOS knows that this
means the list consists of all the files in the current directory that
have the ".htm" extension.
The second fragment, therefore, is run multiple times, each time with a
different filename, and does exactly what you would expect.
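To see what the loop will do without converting anything, you can swap
the soffice command for "echo". This is only a sketch for typing at the
command prompt (use %%f in a batch file). The "@" just stops the
command line from printing each echo command before running it.

```bat
rem Print the name of every .htm file in the current directory
for %f in (*.htm) do @echo %f

rem The same loop over an explicit list of names
for %f in (file1.htm file2.htm file3.htm) do @echo %f
```

Each name the first loop prints is a file that the real command would
hand to soffice.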
This iterator variable ideally takes on the
values, one by one, of the file names that end with .odg (in this
case). But how does the command line know where to look to find
those .odg files to convert in the first place?
From the above I hope it is clear that it comes from the for loop,
specifically from the "(*.htm)". The pattern has no directory in it, so
it is matched against the files in the current directory.
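Put another way, the loop looks in whatever directory you are in when
you run it. A minimal sketch for the command prompt, assuming the files
live in a made-up folder "C:\Users\joe\Documents\pages" (substitute
your own). The quotes around %f are my addition, to cope with file
names that contain spaces.

```bat
rem Move to the folder that holds the files (/d also switches drive if needed)
cd /d "C:\Users\joe\Documents\pages"
rem The pattern *.htm is now matched against the files in that folder
for %f in (*.htm) do soffice.exe --headless --convert-to pdf --outdir "C:\tmp" "%f"
```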
Remember, the "dir" command is another builtin, and it displays files
with their sizes and other attributes. You can either give "dir" a
single file to list (or multiple files separated by spaces), or give it
a filename with wildcards to match more than one file. In the same way,
the "for" command accepts either explicit names or a wildcard pattern,
and it expands the pattern into the list of matching files itself.
On *nix systems this expansion is called shell globbing: the shell
translates a filename with wildcards into a list of filenames that
match, before actually calling the builtin or program. cmd.exe does not
work that way. It passes the pattern along unchanged, and each builtin
or program expands the wildcards itself. That is why it doesn't work
everywhere: a program that doesn't understand wildcards, or that only
accepts one single file, won't do anything useful with a pattern. But in
most places where you need to give a filename to a builtin such as
"dir", "copy", "del" or "for", you can use wildcards and the command
will work on every file that matches.
Thanks again, and sorry for so many questions. I just feel so close,
and I've been working on this problem for days.
No worries. Hope this makes it all clear to you.
I'm heading for bed now, so if there is anything else, please feel free
to ask, but expect a little longer delay in my response this time :)