Hi Peter,
On Tue, 2011-04-12 at 23:28 +0200, Peter Jentsch wrote:
On Monday, 11.04.2011, at 10:48 +0100, Michael Meeks wrote:
Hi Peter,
On Mon, 2011-04-11 at 00:11 +0200, Peter Jentsch wrote:
    Oh - completely :-) I'm not disagreeing, just trying to find someone
who you can work with - so eg. which component: Calc, Writer, Impress
are you most interested in ? :-)
Well then, that's Writer. 
Then the code for the filter is sitting in two places:
  * import is in the writerfilter module
  * export sits in sw/source/filter/ww8
I'd say that the easiest to get started with is the export filter...
but it has far fewer bugs and missing bits. The idea for the import
filter is the following:
  * a tokenizer sends tokens to the domain mapper
  * the domain mapper is the one actually doing the job on the document
The OOXML tokenizer's code is located in writerfilter/source/ooxml and
the domain mapper is located in writerfilter/source/dmapper. The OOXML
tokenizer is pretty complex to understand at first sight. Here are some
keys to understanding it:
  * SAX handlers are generated from an XML description of the spec
(model.xml file). This generation is done using some of the many XSL
files in the ooxml folder.
  * The generated handlers all end up calling further methods in a
ContextHandler defined in the ooxml folder.
Some more information (not much) can be found on this page of the OOo wiki:
http://wiki.services.openoffice.org/wiki/WriterFilter
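To make the tokenizer / domain mapper split a bit more concrete, here is a
minimal sketch in Python. It is not the writerfilter code: the class names,
method names and the tiny set of handled elements are all made up for
illustration, and the real handlers are generated from model.xml rather than
written by hand. It only shows the shape of the idea: a SAX handler turns XML
events into calls on a mapper, and the mapper is the only part that touches
the document.

```python
import xml.sax


class DomainMapper:
    """Hypothetical stand-in for the domain mapper: it receives tokens
    and is the only object that modifies the document model."""

    def __init__(self):
        self.paragraphs = []   # the "document": a list of paragraphs
        self._runs = []        # runs of the paragraph being built
        self._bold = False     # current run property

    def start_paragraph(self):
        self._runs = []

    def set_bold(self, value):
        self._bold = value

    def text(self, chars):
        self._runs.append((chars, self._bold))

    def end_run(self):
        self._bold = False     # run properties do not leak into the next run

    def end_paragraph(self):
        self.paragraphs.append(self._runs)


class Tokenizer(xml.sax.ContentHandler):
    """Hypothetical hand-written tokenizer: translates SAX events into
    calls on the mapper and knows nothing about the document itself."""

    def __init__(self, mapper):
        super().__init__()
        self.mapper = mapper
        self._text = None      # buffer while inside a w:t element

    def startElement(self, name, attrs):
        if name == "w:p":
            self.mapper.start_paragraph()
        elif name == "w:b":
            self.mapper.set_bold(attrs.get("w:val", "true") != "false")
        elif name == "w:t":
            self._text = []

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)   # SAX may deliver text in chunks

    def endElement(self, name):
        if name == "w:t":
            self.mapper.text("".join(self._text))
            self._text = None
        elif name == "w:r":
            self.mapper.end_run()
        elif name == "w:p":
            self.mapper.end_paragraph()


def import_document(xml_bytes):
    """Run the tokenizer over the XML and return the mapped paragraphs."""
    mapper = DomainMapper()
    xml.sax.parseString(xml_bytes, Tokenizer(mapper))
    return mapper.paragraphs


# Simplified fragment; real OOXML declares the w: namespace and has more layers.
SAMPLE = (
    b"<w:document><w:body><w:p>"
    b"<w:r><w:rPr><w:b/></w:rPr><w:t>Hello</w:t></w:r>"
    b"<w:r><w:t> world</w:t></w:r>"
    b"</w:p></w:body></w:document>"
)

if __name__ == "__main__":
    print(import_document(SAMPLE))
```

The point of the split is that the tokenizer could be swapped for another
input format while the mapper stays the same.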
    I guess the best thing to do is to look for OOXML import or
export bugs, which are often disguised round-trip interop problems; I
imagine we have a number of them in bugzilla. Failing that, I'm sure we
have a number of guys interested in interop problems there that would
love to have your help :-)
I'll just have a look at the ooxml filters and try to figure out what's
happening there and then take a stab at a bug, and then see where I want
to go from there. 
There are quite a few bugs there and they aren't necessarily easy to
handle. A nice start would be to fix some of the differences between
the OOXML ISO standard and the OOXML Ecma v1 standard: those differences
often include things that are easy to hack on.
If you have questions, feel free to ping me on IRC; my nick is
cbosdonnat.
Regards,
--
Cedric Bosdonnat
Context
- Re: [Libreoffice] Excel 2003 XML format · Cedric Bosdonnat
 