

Hi all,

As announced earlier on this mailing list, I've recently been submitting patches, mostly to improve the MathML export. A couple of things remain to be reviewed/merged, but after that I think the LibreOffice MathML export will be reasonably good (although of course some MathML experts will say it can still be improved). At least, all the mathematical formatting done via the StarMath language should now be preserved when exported to MathML, and Firefox and MathJax should be able to render the MathML correctly. I’ve also fixed some bugs in the XHTML export when the document contains mathematical formulas.

In order to test the new MathML export, to verify how other rendering engines could handle the MathML markup and to compare visual improvements, I’ve written a small ODT document that contains an overview of LibreOffice Math features:

http://www.maths-informatique-jeux.com/international/LibreOfficeMath.zip

The archive contains two directories with the results for LibreOffice 4.0 (the release version installed on my system) and LibreOffice 4.2 (with all my patches applied). Each directory contains the ODT file and the exported XHTML page. You can open that page with your browser and there is also a version that uses MathJax. I’ve included PDF documents showing the rendering with LibreOffice, Gecko and MathJax. I’ve also extracted the MathML files so that you can compare the code or verify the rendering in other engines. For example, to display all the MathML files in gtkmathview, try:

    for f in 4.2/MathML/*; do mathmlviewer "$f"; done
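
For anyone who would rather compare the markup than the rendering, here is a minimal Python sketch that counts how often each MathML element appears in a directory of extracted files. The function name `count_elements` is made up for this example, and it assumes every file in the directory is well-formed XML; only the `4.2/MathML` path comes from the archive layout described above.

```python
from collections import Counter
from pathlib import Path
import xml.etree.ElementTree as ET


def count_elements(directory):
    """Count MathML element names over all files in `directory`.

    Hypothetical helper: assumes each file parses as well-formed XML.
    """
    counts = Counter()
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        root = ET.parse(str(path)).getroot()
        for element in root.iter():
            # Drop a namespace prefix such as
            # "{http://www.w3.org/1998/Math/MathML}" if one is present.
            counts[element.tag.rsplit("}", 1)[-1]] += 1
    return counts
```

Calling `count_elements("4.2/MathML")` and the same function on the corresponding 4.0 directory (whose exact name is an assumption here) should show, for instance, whether the ratio of `mo` to `mi` elements changed between the two exports.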

For the record, here is a (certainly non-exhaustive) list of things to improve:

1) Expressions like "1+3x" are still incorrectly interpreted as "{1+3}x" (bug 66200). Well, I guess that won't affect the rendering but that might confuse accessibility tools.

2) Arbitrary Unicode characters are generally interpreted as identifiers. If you want to define an operator, you must use the boper, uoper or oper commands. You can also create your own %xxx command and the MathML export will use the MathML operator dictionary to guess whether that's an operator.

3) From my reading of the source code, Math does not seem to handle non-BMP characters (probably the reason for bug 66333).

4) For vec and widevec, the combining character "U+20D7 COMBINING RIGHT ARROW ABOVE" is used instead of "U+2192 RIGHTWARDS ARROW". The latter is recommended by the MathML spec and may render better in MathML rendering engines, but the former renders better in LibreOffice Math at the moment. Perhaps this could be changed when stretching is implemented correctly (cf. bug 32362 comment 21).

5) The interpretation of alignment in Math is weird. Concretely, if you write "matrix{alignl blah blah blah blah blah blah ## alignl 3 over 10 + 7 over 10 = 1 }" to align the left side of the rows, this applies recursively and the numerator/denominator of the fractions will be aligned left too. In the MathML export, I only apply the alignment to the specified node, not to descendants. Better alignment was one of the four most significant missing features listed by Thomas Lange.

6) The MathML export does not take into account the properties from the Format menu (except "Text mode"). Font and alignment properties could probably be handled in MathML. The other general spacing rules would be less accurate and less reliable to reproduce with MathML spacing elements; it's better to use the information from the OpenType MATH table for that anyway.

7) The MathML import is not good and I suspect it is unlikely to be successfully used to import MathML formulas generated by third-party tools. LibreOffice Math is not even always able to correctly import the MathML it generates (when you remove the StarMath annotation). I added new MathML constructions in the export, so rendering properties are now preserved at export but are still lost when importing the result back (again, when you remove the StarMath annotation). I don't plan to improve this, especially if people always keep the StarMath annotation and if we move to MathML as the reference format in the future.

8) I added new MathML constructions that are from MathML 2 and MathML 3. I guess the menu should use the generic term "MathML" instead of specifying the obsolete version "MathML 1.0".

9) I've proposed two enhancements: an option to use MathJax (bug 66287) and an HTML5 export filter (bug 66044). I've submitted a WIP patch for the first and I suspect the second could reuse the existing XHTML filter. But I don't know where the code that calls these XSLT style sheets and handles the dialogs is located.

10) As Khaled Hosny mentioned in a previous message, inserting a math object in Word is not really convenient. Also, setting inline/display mode by using Format => Text is a bit tedious. In general, integration of formulas in the surrounding text (line breaking, spacing, alignment) is not really good. This is what the other missing features listed by Thomas Lange were about.
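
The Unicode details behind items 3 and 4 can be checked with a few lines of Python. This is only an illustrative sketch: the helper `describe` is made up for this example, U+1D49C is just one arbitrary non-BMP mathematical character, and the link between UTF-16 surrogate pairs and the non-BMP problem in Math is an assumption, not something established above.

```python
import unicodedata


def describe(char):
    """Summarise a few Unicode properties of a single character."""
    return {
        "code_point": f"U+{ord(char):04X}",
        "name": unicodedata.name(char),
        # Non-zero for combining marks, 0 for ordinary spacing characters.
        "combining_class": unicodedata.combining(char),
        # Number of 16-bit code units needed in UTF-16 (2 = surrogate pair).
        "utf16_units": len(char.encode("utf-16-be")) // 2,
    }


# The two arrows from item 4, plus one non-BMP character for item 3.
for char in ("\u20D7", "\u2192", "\U0001D49C"):
    print(describe(char))
```

The first arrow is reported as a combining mark and the second as a regular spacing character, which is the distinction that matters for stretching. The third character needs two UTF-16 code units, so code that treats each 16-bit unit as one character would mishandle it.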

For completeness and for people interested in details, here is the list of bugs I've worked on:

- Variable with coefficients not italicized <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=55853>

- Wide accents are not stretchy when exported to MathML <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66024>

- Blanks are not correctly exported to MathML <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66075>

- Improve grouping of binary operators <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66081>

- Commands wideslash, widebslash and overstrike are not correctly exported to MathML <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66086>

- Many mathematical symbols are incorrectly exported as <mo> operators <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66088>

- MathML export: avoid using combining characters for accents and diacritical marks <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66276>

- Use columnalign to implement matrix alignment <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66277>

- MathML export does not distinguish between inline and display equations <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66278>

- MathML export: use the operator dictionary <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66279>

- underbrace and overbrace are not stretchy when exported to MathML <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66281>

- Replace <mfenced> elements by equivalent <mrow>+<mo> constructions <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66282>

- phantom and stylistic commands generate useless elements <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66283>

- Incorrect unicode characters used in the "Brackets" <https://www.libreoffice.org/bugzilla/show_bug.cgi?id=66416>

- Incorrect removal of last line in SmXMLExport::ExportTable <https://bugs.freedesktop.org/show_bug.cgi?id=66575>
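
To illustrate what bug 66282 in the list above is about, here is a rough Python sketch of the expansion of an <mfenced> element into the equivalent <mrow>+<mo> form described in the MathML specification. It is not the code from the actual patch (the export is implemented in C++); `expand_mfenced` is a made-up name, and the sketch assumes elements without a namespace prefix.

```python
import xml.etree.ElementTree as ET


def expand_mfenced(node):
    """Return an <mrow> equivalent to the given <mfenced> element."""
    opening = node.get("open", "(")
    closing = node.get("close", ")")
    # "separators" defaults to a comma; whitespace inside it is ignored.
    separators = "".join(node.get("separators", ",").split())

    def mo(text, **attrs):
        element = ET.Element("mo", attrs)
        element.text = text
        return element

    children = list(node)
    if len(children) > 1:
        # Several arguments are grouped in an inner <mrow>, with a
        # separator operator between consecutive arguments.
        inner = ET.Element("mrow")
        for index, child in enumerate(children):
            if index > 0 and separators:
                # With too few separators, the last one is repeated.
                sep = separators[min(index - 1, len(separators) - 1)]
                inner.append(mo(sep, separator="true"))
            inner.append(child)
        body = [inner]
    else:
        body = children

    row = ET.Element("mrow")
    row.append(mo(opening, fence="true"))
    row.extend(body)
    row.append(mo(closing, fence="true"))
    return row


source = ET.fromstring('<mfenced open="[" close="]"><mi>a</mi><mi>b</mi></mfenced>')
print(ET.tostring(expand_mfenced(source), encoding="unicode"))
```

The explicit form spells out the fences and separators as ordinary `mo` operators, so a renderer does not need any special handling of `mfenced`.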

So LibreOffice users will hopefully be able to generate better Web pages with MathML, and I’ll be more confident recommending LibreOffice when people ask for a WYSIWYG editor on the MathJax mailing list!

Thanks,

--
Frédéric Wang
maths-informatique-jeux.com/blog/frederic

