I will second that thought about "affect the whole document". I know from personal experience that some of these document tags may not look like they are needed, but removing or modifying them can cause trouble later in the document.
The first time I saw such an issue was when I created an HTML-based document in one WYSIWYG editor and then edited it in a completely different, more complex one. XML-based documents seem to have the same issues, depending on the software used to edit the various revisions of the document. The second time was with .odt files, and OOXML-based files were the third file type where I saw the same issue.
ALWAYS keep a "static" original copy of the file so you can compare it with the "tag-edited" version. If you make a simple error, such as removing a single tag [which I know from experience], you could completely mess up your document's formatting.
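To make that comparison concrete, here is a minimal sketch in Python, assuming both copies are ordinary ODF packages (zip files that contain a "content.xml"). The function names `read_content_xml` and `diff_content` are made up for this illustration and are not part of any LibreOffice tool.

```python
import difflib
import zipfile


def read_content_xml(odt_path):
    """Return the content.xml stored inside an .odt package as text."""
    with zipfile.ZipFile(odt_path) as package:
        return package.read("content.xml").decode("utf-8")


def diff_content(original_path, edited_path):
    """Return a unified diff of content.xml between two .odt files."""
    # content.xml is usually written as one very long line, so break it
    # at tag boundaries to get a diff that a person can actually read.
    before = read_content_xml(original_path).replace("><", ">\n<").splitlines()
    after = read_content_xml(edited_path).replace("><", ">\n<").splitlines()
    return list(
        difflib.unified_diff(
            before, after, fromfile="original", tofile="edited", lineterm=""
        )
    )
```

Calling `diff_content("original.odt", "edited.odt")` (hypothetical file names) returns an empty list when the two content.xml files match, and otherwise shows which tags the edit added or removed.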
Also, I sometimes find that when you edit an older document, saved in an earlier version of a file format, with a package that uses a newer, modified, fixed, or extended version of that format [say ODF, OOXML, or any other file format], the package may add "extra blank tags" to the document for compatibility reasons, to resolve some unknown formatting issue.
So, that said... which packages have been used to edit the document in the past, version specifics included? The earlier versions of OOo [whatever number it was when OOo started reading/writing .doc files] did different things to a .doc-formatted document, which looked a little different in LO 4.0.x when I opened it to do some editing and saved it in both .doc and .odt formats. A "document.odt" file edited in MS Word, then in LO 4.3.x [through 4.4.x], then in AOO, then Word, then back in LO, can pick up a lot of unneeded formatting "tags". I have seen this with LO to Word to LO to Word and back to LO editing, in both .docx and .odt files. For such a "round-robin" editing cycle, I tend to ask for the file in some older format, like .doc, to stop the "padding of blank tags" or any other "extra stuffing" of unneeded formatting.
YES Tom, I really hope someone has created an add-on extension/filter to clean out all the unneeded "blank" tags and other formatting info. I really hated doing it myself. I would almost rather save the document as a formatted .txt file and redo the "real formatting" than manually remove all of the "blank" formatting tags.
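I do not know of a ready-made filter, so what follows is only a rough sketch of what such a cleanup could look like in Python, not an existing extension. It assumes the redundant tags are ODF `text:span` elements in content.xml, and it merges neighbouring spans only when their attributes are identical and no text sits between them. The helper names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the ODF "text:" prefix; text:span is the element being merged.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
SPAN = f"{{{TEXT_NS}}}span"


def _append_text(element, text):
    """Append text at the very end of an element's content."""
    if not text:
        return
    if len(element):
        # The element has children, so trailing text lives in the last child's tail.
        last = element[-1]
        last.tail = (last.tail or "") + text
    else:
        element.text = (element.text or "") + text


def merge_adjacent_spans(parent):
    """Recursively merge sibling spans that carry identical attributes."""
    previous = None
    for child in list(parent):
        if (
            previous is not None
            and child.tag == SPAN
            and previous.tag == SPAN
            and child.attrib == previous.attrib
            and not previous.tail  # nothing, not even a space, between the spans
        ):
            _append_text(previous, child.text)
            previous.extend(list(child))
            previous.tail = child.tail
            parent.remove(child)
        else:
            previous = child
    for child in parent:
        merge_adjacent_spans(child)
    return parent
```

This only touches the parsed tree. Writing the result back into the .odt is deliberately left out, because ElementTree renames namespace prefixes on output unless each one is registered first, so try it on a copy and compare the result with the static original before trusting it.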
On 09/25/2015 03:47 PM, Tom Davies wrote:
Hi :)
I don't know of an Extension that might do this, but then I have never looked for one, tbh.
It might be smart to test that removing them doesn't affect the whole document before starting. Create a copy of the whole document, or even just a copy of the "content.xml", and then try on a small sample.
I guess you have already tried that though!
Regards from Tom :)

On 25 September 2015 at 20:42, Florian Reisinger <florei@libreoffice.org> wrote:
Hi,
One possible reason is hard formatting. By adding and removing a style, hardcoded empty span tags can appear. However, this should not happen when using styles (even for bold and so on). I read this some time ago.
Hope that helps :)

On 25 September 2015 at 21:36:08 CEST, Mike Cowlishaw <mfc@speleotrove.com> wrote:
I am the editor of a document [the IEEE 754-2008 standard] that was created around 15 years ago (using OpenOffice), and has had nearly 200 drafts, a number of editors, and countless edits. It was last changed in 2008, but is now about to go through a new revision cycle.
I was delighted to find that LibreOffice handled the 2008 .odt file almost perfectly, with only 7 errors (all were weird spurious empty reference tags, of unknown provenance, that OpenOffice quietly ignored).
While identifying and removing those from the content.xml, I noticed that there are hundreds (possibly thousands) of redundant tags. These are typically in the context:
  <span whatever>text1</span><span whatever>text2</span>
where 'whatever' is identical, and either or both 'text1' or 'text2' may be empty.
Is there a tool to clean these up? I could write one myself (I recently wrote an XML parser) but if one already exists ...
Many thanks
-- Mike Cowlishaw
[Apologies if this is a duplicate .. I tried it on askLibo some time ago but it is still "awaiting moderation".]

--
To unsubscribe e-mail to: users+unsubscribe@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted