Hi Adam,

[ Not that I know too much about wpg/wps, but hopefully better than no answer. ]

On Mon, Apr 29, 2013 at 05:11:51AM -0700, Adam Fyne <adam.fyne@cloudon.com> wrote:
I have created a DOCX document (via Word 2013) and it has a simple cover page (attached to this mail). After hacking a bit into the XML of the document - I see there are some tags that maybe are not supported by LibreOffice? These are the tags:

* wpg:wgp [WordprocessingGroup <http://msdn.microsoft.com/en-us/library/documentformat.openxml.office2010.word.drawinggroup.wordprocessinggroup.aspx>]
* wpg:cNvGrpSpPr
* wpg:grpSpPr
* wpg:grpSp
* wps:wsp [WordprocessingShape <http://msdn.microsoft.com/en-us/library/documentformat.openxml.office2010.word.drawingshape.wordprocessingshape.aspx>]
* wps:cNvPr
* wps:cNvSpPr
* wps:spPr
* wps:bodyPr
* wps:txbx
I'm guessing these are wrappers around drawingml tags, i.e. if you search for cNvSpPr, you'll see there is already such a tag in the drawingml namespace, and so on.
After talking with some LibreOffice developers working with me - I was told by them that LibreOffice supports the 2006 Word Scheme (used by Office 2007), And does not support the 2010 Word Scheme. Apparently - these tags are part of the 'newer' 2010 Word Scheme.
I think this is quite true -- I mean we typically just add whatever is missing to fix a bug, and the result is that the newer tags introduced by newer Word versions are less supported. However, in this case I see that all the wpg markup is wrapped in a mc:AlternateContent / mc:Choice, and there is fallback VML markup as well, which is known to work. I would first suggest checking if these mc: tags are handled correctly.
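To see what the document actually offers before debugging the importer, the mc:AlternateContent blocks can be listed with a few lines of Python. This is only a rough sketch and not LibreOffice code: the function names and the SAMPLE string are made up for illustration, and the sample is heavily simplified (a real document wraps wpg:wgp in a:graphic / a:graphicData).

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by Word for markup compatibility and VML.
MC = "http://schemas.openxmlformats.org/markup-compatibility/2006"
VML = "urn:schemas-microsoft-com:vml"


def read_document_xml(path):
    """Return the raw bytes of word/document.xml from a .docx file (hypothetical helper)."""
    with zipfile.ZipFile(path) as docx:
        return docx.read("word/document.xml")


def summarize_alternate_content(xml_text):
    """For each mc:AlternateContent, return (Requires values, has VML fallback)."""
    root = ET.fromstring(xml_text)
    result = []
    for block in root.iter(f"{{{MC}}}AlternateContent"):
        requires = [c.get("Requires") for c in block.findall(f"{{{MC}}}Choice")]
        fallback = block.find(f"{{{MC}}}Fallback")
        has_vml = fallback is not None and any(
            el.tag.startswith(f"{{{VML}}}") for el in fallback.iter()
        )
        result.append((requires, has_vml))
    return result


# Simplified, made-up sample in the shape described above.
SAMPLE = """<w:document
 xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
 xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
 xmlns:v="urn:schemas-microsoft-com:vml"
 xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup">
<w:body><w:p><w:r>
<mc:AlternateContent>
<mc:Choice Requires="wpg"><w:drawing><wpg:wgp/></w:drawing></mc:Choice>
<mc:Fallback><w:pict><v:group><v:rect/></v:group></w:pict></mc:Fallback>
</mc:AlternateContent>
</w:r></w:p></w:body></w:document>"""

if __name__ == "__main__":
    print(summarize_alternate_content(SAMPLE))
```

On a real file, something like `summarize_alternate_content(read_document_xml("cover.docx"))` (the file name is a placeholder) would show whether every wpg/wps choice comes with a VML fallback that the importer could use.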
So my first step would be to make sure that the import\export process doesn't corrupt these values (maps them to some LibreOffice concept). The second step would be to support these features in the core.
As long as all you need is cover pages with simple rectangles and so on, it's better to focus on lossless VML roundtrip, that's already something we do to some extent. The VML import code lives here:

oox/source/vml/

the export code is here:

oox/source/export/vmlexport.cxx

Last time I checked, the VML import code was much better than the export one, so I would expect more issues on the export side.

The "LibreOffice concept" is the drawinglayer. Speaking from the experience I have from earlier bugs, you probably won't have to touch its document model or rendering; these VML issues were all in the filters so far.
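One crude way to spot roundtrip losses is to compare how many VML elements of each kind appear in word/document.xml before and after a load/save cycle. The Python sketch below only illustrates the idea: the function names and the two sample strings are invented, and matching element counts say nothing about whether attributes or geometry survived.

```python
from collections import Counter
import xml.etree.ElementTree as ET

VML = "urn:schemas-microsoft-com:vml"
_PREFIX = f"{{{VML}}}"


def vml_counts(xml_text):
    """Count VML elements by local name (rect, group, shape, ...)."""
    root = ET.fromstring(xml_text)
    return Counter(
        el.tag[len(_PREFIX):]
        for el in root.iter()
        if isinstance(el.tag, str) and el.tag.startswith(_PREFIX)
    )


def lost_vml(before_xml, after_xml):
    """Return the VML elements present before the roundtrip but missing after it."""
    # Counter subtraction keeps only positive differences.
    return dict(vml_counts(before_xml) - vml_counts(after_xml))


# Made-up fragments standing in for document.xml before and after a roundtrip.
BEFORE = (
    '<r xmlns:v="urn:schemas-microsoft-com:vml">'
    "<v:group><v:rect/><v:rect/></v:group></r>"
)
AFTER = (
    '<r xmlns:v="urn:schemas-microsoft-com:vml">'
    "<v:group><v:rect/></v:group></r>"
)

if __name__ == "__main__":
    print(lost_vml(BEFORE, AFTER))
```

An empty result does not prove the roundtrip is lossless, but a non-empty one is a quick hint about where to start looking in the export filter.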
1. Does LibreOffice indeed not support the 2010 Word Scheme in general?
I imagine most of the new tags are introduced in existing namespaces, so I can well imagine that some of the newer (2010, 2013) tags are supported as well -- but I did not check this deeply.
2. Does LibreOffice indeed not support the above tags?
Right, I think for the wps/wpg namespace, we just read the VML part, but I see no related unit test, so it may have bitrotted over time without anybody noticing. Also note that drawingml in general is very much supported, see oox/source/drawingml/ and oox/source/export/drawingml.cxx. Also, I just tested that in case you create a document from scratch in Word 2010, insert some picture from clipart and save, then it's imported just fine -- and that's using drawingml exclusively. So some hooking of the drawingml import into writerfilter is already in place.
3. How difficult would it be to match each of these tags to an appropriate LibreOffice object?
I expect that most of the VML-related problems are just filter problems and you can avoid diving into the drawinglayer for this.
4. Which files should I look at that hold this information?
For the VML filter, see above -- for the docx filter in general, see writerfilter/source/ooxml/ (tokenizer), writerfilter/source/dmapper/ (domain mapper), for the export see sw/source/filter/ww8/docx*, but possibly you already know these. :-)

Hope this helps,

Miklos