[Libreoffice] Ridiculous xml?

Ryan Jendoubi <ryan.jendoubi -AT- gmail.com>
Fri, 29 Jul 2011 11:30:43 +0100

Hi all,

Recently noticed something very odd about the contents of content.xml. It looks like a text span has been placed around every individual word, with a different one around every individual space:

</text:span><text:span text:style-name="T1">Web</text:span><text:span text:style-name="T2">. </text:span><text:span text:style-name="T1">But</text:span><text:span text:style-name="T2"></text:span><text:span text:style-name="T1">it</text:span><text:span text:style-name="T2"> </text:span><text:span text:style-name="T1">is</text:span><text:span text:style-name="T2"></text:span><text:span text:style-name="T1">not</text:span><text:span text:style-name="T2">


...etc.

I can only imagine this makes the files somewhat bigger than I would have thought necessary?

More of a problem for me is that I was using a shell script to inflate content.xml and grep it for a certain string within the text. I was accounting for odd whitespace, but obviously this mad tagging thwarted such a simple approach. I have now adapted by using a Perl script to strip all XML tags before grepping, but I'm still curious about why content.xml appears this way?
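
For anyone hitting the same problem, here is a rough sketch of the strip-then-search idea in Python (not the Perl script mentioned above). The function names `visible_text` and `odt_contains` are made up for illustration, and it assumes content.xml is UTF-8 and that deleting every tag is good enough. It ignores ODF whitespace elements such as text:s and text:tab, and text from adjacent paragraphs ends up joined without a separator, so treat it as a starting point only.

```python
import re
import zipfile
from html import unescape

# Matches any XML tag, opening or closing (a crude approach, not a real parser).
TAG = re.compile(r"<[^>]+>")


def visible_text(xml: str) -> str:
    """Drop every tag, decode entities, and collapse runs of whitespace."""
    text = unescape(TAG.sub("", xml))
    return re.sub(r"\s+", " ", text).strip()


def odt_contains(path: str, phrase: str) -> bool:
    """Return True if phrase occurs in the document text, ignoring span boundaries."""
    # An .odt file is a zip archive; the body text lives in content.xml.
    with zipfile.ZipFile(path) as odt:
        xml = odt.read("content.xml").decode("utf-8")
    # Normalise whitespace in the phrase the same way as in the text.
    needle = re.sub(r"\s+", " ", phrase).strip()
    return needle in visible_text(xml)
```

Calling `odt_contains("report.odt", "But it is not")` (with a hypothetical file name) would then match regardless of how many spans the words are split across.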

Might it be because the file was imported from .doc format? Is it a transformation "bug" of some kind?


Bests,

--Ryan
