
I will second that thought about "affect the whole document". I know from personal experience that some of these document tags may not look like they are needed, but removing or modifying them may cause "troubles" later in the document.

The first time I saw such an issue was when creating an HTML-based document in one WYSIWYG editor and then editing it in a completely different, more complex one. XML-based documents seem to have the same issues, depending on the software used to edit the various revisions of the document. The second time was with .odt files, and OOXML-based files were the third file type where I saw the same issue[s].

ALWAYS keep a "static" original copy of the file so you can compare it with each "tag edited" attempt. If you make a simple "error" in removing a single tag [which I know from experience], you could completely mess up your document's formatting.

Also, I sometimes find that when you edit an older document, saved in an earlier version of a file format, with a package that uses a newer, modified, fixed, or extended version of that format [say ODF, OOXML, or any other file format], the package may have compatibility reasons for adding "extra blank tags" that resolve some unknown formatting issue.

So, that said. . . .
What packages have been used to edit the document in the past, version specifics included? The earlier versions of OOo [whatever number it was when OOo started reading/writing .doc files] did different things to a .doc formatted document, and those documents looked a little different in LO 4.0.x when I opened them to edit and save them in both .doc and .odt formats. A "document.odt" file edited in MS Word, then in LO 4.3.x [through 4.4.x], then in AOO, then Word, then back in LO, can pick up a lot of unneeded formatting "tags". I have seen this with LO to Word to LO to Word and back to LO document editing, in both .docx and .odt files. For such a "round-robin" editing cycle, I tend to ask for the file in some older format, like .doc, to stop the "padding of blank tags" or any other "extra stuffing" of unneeded formatting.

Tom, I really hope someone has created an add-on extension/filter to clean out all the unneeded "blank" tags and other formatting info. I really hated doing it myself. I would almost rather save the document as a formatted .txt file and redo the "real formatting" than manually remove all of the "blank" formatting tags.

On 09/25/2015 03:47 PM, Tom Davies wrote:
Hi :)
I don't know of an Extension that might do this, but then I have never
looked for one tbh.

It might be smart to test that removing them doesn't affect the whole
document before starting.  Create a copy of the whole document, or even
just a copy of the "content.xml", and then try on a small sample.
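
A minimal sketch of that test setup, assuming the document is an ordinary .odt (a ZIP archive that holds a "content.xml" member). The function name and the ".orig" backup naming are made up for illustration; only the Python standard library is used.

```python
import shutil
import zipfile
from pathlib import Path


def extract_content_xml(odt_path, work_dir):
    """Back up the .odt into work_dir and extract its content.xml there.

    Hypothetical helper: the original file is only read, never modified.
    Returns the path of the extracted content.xml.
    """
    odt_path = Path(odt_path)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Keep an untouched copy to compare against later.
    backup = work_dir / (odt_path.stem + ".orig" + odt_path.suffix)
    shutil.copy2(odt_path, backup)

    # An .odt is a ZIP archive; the document body lives in content.xml.
    with zipfile.ZipFile(backup) as archive:
        archive.extract("content.xml", path=work_dir)
    return work_dir / "content.xml"
```

Any tag removal can then be tried on the extracted file (or on a small excerpt of it) while the backup stays as it was.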

I guess you have already tried that though!
Regards from
Tom :)

On 25 September 2015 at 20:42, Florian Reisinger <> wrote:


One possible reason is hard formatting. By adding and removing a style,
hardcoded empty span tags can appear. However, this should not happen when
using styles (even for bold and so on).

I read this some time ago. Hope that helps :)

On 25 September 2015 at 21:36:08 CEST, Mike Cowlishaw <> wrote:
I am the editor of a document [the IEEE 754-2008 standard] that was
started around 15 years ago (using OpenOffice), and has had nearly 200
drafts, a number of editors, and countless edits. It was last changed in
2008, but is now about to go through a new revision cycle.

I was delighted to find that LibreOffice handled the 2008 .odt file
perfectly, with only 7 errors (all were weird spurious empty references
of unknown provenance, which OpenOffice quietly ignored).

While identifying and removing those from the content.xml, I noticed
there are hundreds (possibly thousands) of redundant tags. These are
typically in the context: <span whatever>text1</span><span
whatever>text2</span> where 'whatever' is identical, and either or both
'text1' or 'text2' may be empty.
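
A rough sketch of such a cleanup, not an existing tool: it merges sibling spans whose attributes are identical, and only when no text sits between them. It uses Python's standard xml.etree.ElementTree; the function name is hypothetical, and the tag to merge is passed in by the caller.

```python
import xml.etree.ElementTree as ET


def merge_adjacent_spans(parent, span_tag):
    """Merge runs of sibling span_tag elements with identical attributes.

    Works in place and recurses into every child of parent.
    """
    children = list(parent)
    i = 0
    while i < len(children) - 1:
        first, second = children[i], children[i + 1]
        mergeable = (
            first.tag == span_tag
            and second.tag == span_tag
            and first.attrib == second.attrib
            and not first.tail  # no text between the two spans
        )
        if not mergeable:
            i += 1
            continue
        # Append second's leading text after whatever first ends with.
        if len(first):
            last = first[-1]
            last.tail = (last.tail or "") + (second.text or "")
        else:
            first.text = (first.text or "") + (second.text or "")
        first.extend(list(second))  # move second's children into first
        first.tail = second.tail    # keep the text that followed second
        parent.remove(second)
        del children[i + 1]         # stay at i: the next sibling may merge too
    for child in parent:
        merge_adjacent_spans(child, span_tag)


# Small demonstration on a made-up fragment.
demo = ET.fromstring('<p><span s="a">text1</span><span s="a">text2</span></p>')
merge_adjacent_spans(demo, "span")
```

For a real content.xml the tag would be the namespaced text:span, i.e. "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}span" in ElementTree notation. Two caveats: ElementTree rewrites namespace prefixes on output unless they are registered first, and a lone empty span with no identical neighbour is left alone by this sketch.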

Is there a tool to clean these up? I could write one myself (I recently
wrote an XML parser) but if one already exists ...

Many thanks -- Mike Cowlishaw

[Apologies if this is a duplicate .. I tried it on askLibo some time
ago but it is still "awaiting moderation".]

To unsubscribe e-mail to:
Posting guidelines + more:
List archive:
All messages sent to this list will be publicly archived and cannot be deleted

