[Libreoffice] writerfilter vs XSL

Miklos Vajna <vmiklos -AT- frugalware.org>
Fri, 20 May 2011 12:37:53 +0200

On Thu, May 19, 2011 at 12:52:41PM +0200, Cedric Bosdonnat <cedric.bosdonnat.ooo@free.fr> wrote:

As you'll work on the tokenizer, I think it would be nice to introduce
some kind of token dumper that replaces the dmapper and dumps whatever
would go into it. That would provide a way to isolate whether an import
problem comes from the tokenizer (specific to each format) or from the
domain mapper (which would affect all handled formats).
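
A rough sketch of the idea, in Python for brevity (the real code would
be C++ in writerfilter). All names here (TokenDumper, toy_tokenizer and
the event methods) are made up for illustration and are not the
writerfilter API: the dumper takes the place of the domain mapper and
just records the events it receives, so the output can be compared
against an expected dump.

```python
from typing import List


class TokenDumper:
    """Hypothetical stand-in for the domain mapper: records every event."""

    def __init__(self) -> None:
        self.lines: List[str] = []
        self._depth = 0

    def _emit(self, line: str) -> None:
        # Indent by nesting depth so the dump stays readable and diffable.
        self.lines.append("  " * self._depth + line)

    def start(self, name: str) -> None:
        self._emit(f"start {name}")
        self._depth += 1

    def attribute(self, name: str, value: str) -> None:
        self._emit(f"attribute {name}={value}")

    def text(self, data: str) -> None:
        self._emit(f"text {data!r}")

    def end(self, name: str) -> None:
        self._depth -= 1
        self._emit(f"end {name}")

    def dump(self) -> str:
        return "\n".join(self.lines)


def toy_tokenizer(handler) -> None:
    """Hypothetical tokenizer: emits the events for one bold paragraph."""
    handler.start("paragraph")
    handler.attribute("bold", "true")
    handler.text("Hello")
    handler.end("paragraph")


if __name__ == "__main__":
    dumper = TokenDumper()
    toy_tokenizer(dumper)
    print(dumper.dump())
```

If the dump matches the expected one but the imported document is still
wrong, the problem is more likely on the domain mapper side.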


Yes, that makes sense.

You would then have a much more reliable way to test that your tokenizer
is working... but that wouldn't help with testing the domain mapper. To
test that one, I think conversions like the ones you describe are what
helps most.

OK.

(I have already heard of the XML dumper for the rendered layout; is
there something similar for the internal document model?)


Yes, ODF is a pretty good representation of the internals... though
we could surely implement something closer to the actual data
structures. Let me know if it would be of any use to create such a
dumper... I'm sure we could get to something useful pretty quickly.


Fine, I'll use ODF for now; if it turns out to be too much trouble,
we can still work on a dumper.

Another question: writerfilter seems to use a lot of XSL to extract the
required data from the spec. We agreed that this is a problem, as XSL is
hard to maintain. Now if I follow the same approach, RTF would introduce
another bunch of XSL. :)

So, what could be a solution here? Possible ideas from me:

- even with its problems, we have nothing better; introducing new XSL
  code for RTF is not ideal, but let's live with it. (the
  conservative one)

- write C++ code to do the transformations at build time (the "I don't
  know any scripting languages" one)

- use Perl or Python to do the transformations (my Perl-fu is weak, but
  it's doable; I would vote for Python, but I'm not sure whether reusing
  our internal Python in the build system is a problem)
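
To make the Python option a bit more concrete, here is a minimal sketch
using only the standard library. The XML layout (a <model> element with
<token name="..." id="..."/> children) and the generated enum are
invented for this example; they are not the actual writerfilter model
files or generated headers.

```python
import xml.etree.ElementTree as ET

# Hypothetical spec excerpt; the real model files look different.
SPEC = """\
<model>
  <token name="bold" id="1"/>
  <token name="italic" id="2"/>
</model>
"""


def generate_header(spec_xml: str) -> str:
    """Turn the <token> entries of the spec into a C++ enum."""
    root = ET.fromstring(spec_xml)
    lines = ["// Generated file, do not edit.", "enum Token", "{"]
    for token in root.iter("token"):
        name = token.get("name").upper()
        value = int(token.get("id"))
        lines.append(f"    TOKEN_{name} = {value},")
    lines.append("};")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(generate_header(SPEC), end="")
```

The build system would run such a script where it runs the XSL
transformations today, which is where the question about using the
internal Python comes in.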

Thanks.


