

Hi :)
We call it "headless mode".  Which OS are you using?  Is it Windows,
Gnu&Linux or Mac?

Headless mode can be scripted and there might even be a thread in the
archives that shows a decent script worth copying.  I think the better way
is to try using LibreOffice on the command-line and get it doing more and
more until you've figured it out.  For example does
soffice
or
lowriter
work from the command-line?  On my Gnu&Linux both work but some OSes might
be limited to using just 1 of those.  Then try, for example
lowriter --help
to get a quick cheat-sheet of options.
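
Since the original poster mentions Python, here is a minimal sketch of that first check: which of the two launcher names is reachable from the command line.  The function name find_launchers is made up for illustration, and the sketch only looks at the PATH, so a LibreOffice installed somewhere that is not on the PATH will show up as not found.

```python
import shutil

# Launcher names mentioned above; which ones exist depends on the OS and install.
CANDIDATES = ("soffice", "lowriter")


def find_launchers(names=CANDIDATES):
    """Map each launcher name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_launchers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If neither name is found, giving the full path to the soffice program inside the LibreOffice install folder is the usual fallback.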

Hopefully people on this list can help but there might also be
documentation at
https://wiki.documentfoundation.org/Documentation/Other_Documentation_and_Resources#Programmers
or scroll up a bit to see what is in the "Corporate Users" section of the
page.


Attachments don't get to the mailing-list anyway!  You can use Nabble to
upload them to a central place so that people can choose to look if they
want.


I would try to keep the original documents in MS format so that if there is
any problem with some tiny subset of all the ones being converted then you
can focus on those and do them with a bit more finesse.  However from Doc,
Xls etc to Odt, Ods etc should work reasonably well.
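
One way to find that "tiny subset" afterwards, as a rough sketch: compare the folder of originals with the folder of converted files and list every original that has no converted counterpart.  The function name find_unconverted, the two-folder layout and the default extensions are assumptions made for illustration.  It only checks that an output file exists, not that its content looks right.

```python
from pathlib import Path


def find_unconverted(src_dir, out_dir, src_ext=".doc", out_ext=".odt"):
    """List originals in src_dir that have no same-named out_ext file in out_dir."""
    out_dir = Path(out_dir)
    missing = []
    for source in sorted(Path(src_dir).glob(f"*{src_ext}")):
        # report.doc is expected to have become report.odt in out_dir
        if not (out_dir / (source.stem + out_ext)).exists():
            missing.append(source)
    return missing
```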

It's the DocX, XlsX etc that is a bit more unpredictable thanks to MS's
constant changing of that format (currently on at least 3 different
"transitional" versions and at least 1 "strict", none of which seem to
fully comply with their ISO promise).  Even with those I think a
batch-process using a scripted headless mode is the best plan and then deal
with individual oddities later.

Regards from
Tom :)



On 10 April 2014 13:30, Joe B <paperbag76@gmail.com> wrote:

Hello all,

This is my first post.

I am working on migrating a website.  I am trying to convert many files
written in an old version of MS Word, which were then saved as old
Microsoft 2002/2003 XML files.  The files were saved using an .htm
extension.  The files are filled with Microsoft xml crud. (I will just
refer to them as .htm files for the rest of this e-mail)

I found a simple solution, in simply opening the file in LibreOffice
Writer, and re-saving the file in HTML Document (Writer) (.html) format.
Now the files work great.

I don't want to do this one file at a time obviously, as there are hundreds
of these .htm files.  I am trying to figure out a way to do this for
multiple files in a folder...I think the term is "batch processing".

In other words, have a script that will:
1. iterate through each .htm file in a folder
2. open the file in LibreOffice Writer
3. save the .htm file in HTML Document (Writer)(.html) format
4. close the file
5. iterate over all the remaining files in the folder until all files have
had their formats changed
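
The five steps above can be sketched in Python roughly as follows.  This is not a tested recipe, just an outline under some assumptions: that "soffice" is on the PATH (otherwise pass its full path), and that the plain "html" target of --convert-to gives the same result as saving by hand as HTML Document (Writer), which is worth checking on a couple of files first.  The function names and the separate output folder are made up for illustration.  Writing to a separate folder keeps the original .htm files untouched.

```python
import subprocess
import sys
from pathlib import Path


def find_htm_files(folder):
    """Return the .htm files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".htm"
    )


def build_command(soffice, source, outdir):
    """Build the headless conversion command line for one file."""
    return [
        soffice, "--headless",
        "--convert-to", "html",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_folder(folder, outdir, soffice="soffice"):
    """Convert every .htm file in folder; return the sources whose run failed."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    failed = []
    for source in find_htm_files(folder):
        # One LibreOffice run per file: open, convert, close.
        result = subprocess.run(build_command(soffice, source, outdir))
        if result.returncode != 0:
            failed.append(source)
    return failed


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: convert_htm.py SOURCE_FOLDER OUTPUT_FOLDER")
    for path in convert_folder(sys.argv[1], sys.argv[2]):
        print(f"conversion failed: {path}")
```

The exit code may not catch every problem, so it is still worth checking that each .htm file ended up with a matching .html file in the output folder.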

Is there a way to do this via a command-line script, or by creating a
batch file?

I'm sorry, I'm a bit of a novice when it comes to the command line or batch
files.  I know how to open LibreOffice Writer.exe from the command line
with one argument, which will open that document, but that's about it.

I have some experience in other scripting languages, like Python, Perl,
etc, but not windows scripting.  I am having a very difficult time getting
this to work in Python, so I thought I would come here and try to ask for
guidance.

I could attach a copy of one of the .htm files that I am converting if that
would help, but don't want to attach a file in my very first e-mail.

thank you,
Joe
paperbag76@gmail.com

--
To unsubscribe e-mail to: users+unsubscribe@global.libreoffice.org
Problems?
http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be
deleted





