I am the editor of a document [the IEEE 754-2008 standard] that was created
around 15 years ago (using OpenOffice), and has had nearly 200 drafts, a
number of editors, and countless edits. It was last changed in 2008, but is
now about to go through a new revision cycle.
I was delighted to find that LibreOffice handled the 2008 .odt file almost
perfectly, with only 7 errors (all were weird spurious empty reference tags,
of unknown provenance, that OpenOffice quietly ignored).
While identifying and removing those from the content.xml, I noticed that
there are hundreds (possibly thousands) of redundant tags. These are
typically in the context: <span whatever>text1</span><span
whatever>text2</span> where 'whatever' is identical in both tags, and
either or both of 'text1' and 'text2' may be empty.
Is there a tool to clean these up? I could write one myself (I recently
wrote an XML parser) but if one already exists ...
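
To make the cleanup concrete, here is a minimal sketch of the merge described above, not an existing tool. It assumes Python's standard xml.etree.ElementTree; the function names and the plain 'span' tag and 's' attribute in the example are illustrative only. Two neighbouring spans are merged only when their tag and attributes are identical and there is no text at all between them.

```python
import xml.etree.ElementTree as ET

def merge_adjacent_spans(parent, tag="span"):
    """Merge runs of adjacent <tag> children that carry identical attributes."""
    i = 0
    while i < len(parent) - 1:
        a, b = parent[i], parent[i + 1]
        same = a.tag == tag and b.tag == tag and a.attrib == b.attrib
        if same and not a.tail:  # nothing (not even a space) between the spans
            # Append b's leading text to the end of a's existing content.
            if len(a):
                a[-1].tail = (a[-1].tail or "") + (b.text or "")
            else:
                a.text = (a.text or "") + (b.text or "")
            a.extend(list(b))    # move b's child elements into a
            a.tail = b.tail      # keep whatever followed b
            parent.remove(b)     # stay at index i: the next sibling may match too
        else:
            i += 1
    for child in parent:
        merge_adjacent_spans(child, tag)

def clean(xml_text, tag="span"):
    """Parse xml_text, merge redundant spans, and return the serialized result."""
    root = ET.fromstring(xml_text)
    merge_adjacent_spans(root, tag)
    return ET.tostring(root, encoding="unicode")

sample = ('<p><span s="T1">ab</span><span s="T1">cd</span>'
          '<span s="T1"/> tail<span s="T2">x</span></p>')
print(clean(sample))
```

In a real content.xml the spans are namespaced (text:span), so the tag passed in would need to be the fully qualified name, for example "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span". ElementTree also rewrites namespace prefixes on output unless they are registered first with ET.register_namespace, so the result should be checked against a copy of the file before it is trusted.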
Many thanks -- Mike Cowlishaw
[Apologies if this is a duplicate .. I tried it on askLibo some time ago but
it is still "awaiting moderation".]