

On 03/01/2017 12:05 PM, Eike Rathke wrote:
On Wednesday, 2017-03-01 10:34:04 +0100, Stephan Bergmann wrote:
(1)  If the input is assumed to be an arbitrary sequence of Unicode scalar
values (i.e., may contain noncharacters, even despite the caveat that those
should never be interchanged), the below invalidChar handling might want to
also watch out for U+FFFE and U+FFFF.

Also if UTF-8 encoded? (as we write OString/chars there..)

Yes, the XML requirement is on the Unicode (or ISO/IEC 10646) characters, regardless of how they're encoded in a given file. Though it's probably a bit difficult to cram that check into FastSaxSerializer's design. And, again, it may not even be relevant if the input must not contain any noncharacters anyway. (In configmgr, I didn't bother to ensure that at any higher abstraction level, and simply made sure that arbitrary sequences of Unicode scalar values are properly encoded for XML's requirements.)
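
To illustrate the point about encoding independence, here is a minimal sketch, not the actual FastSaxSerializer or configmgr code. It assumes the XML 1.0 Char production as the rule being enforced. Both function names are hypothetical. The first function checks a Unicode scalar value; the second looks for U+FFFE and U+FFFF in a buffer that is assumed to already be well-formed UTF-8, where those two characters can only appear as the byte sequences EF BF BE and EF BF BF.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true if the Unicode scalar value c is allowed by the
// XML 1.0 Char production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
constexpr bool isXmlChar(char32_t c) {
    return c == 0x9 || c == 0xA || c == 0xD
        || (c >= 0x20 && c <= 0xD7FF)
        || (c >= 0xE000 && c <= 0xFFFD)      // stops short of U+FFFE and U+FFFF
        || (c >= 0x10000 && c <= 0x10FFFF);
}

// The rule is about characters, so these hold whatever the file encoding is.
static_assert(!isXmlChar(0xFFFE), "U+FFFE is not an XML 1.0 Char");
static_assert(!isXmlChar(0xFFFF), "U+FFFF is not an XML 1.0 Char");

// Hypothetical helper: true if s contains U+FFFE or U+FFFF.
// Assumption: s is well-formed UTF-8, so the three-byte sequences
// EF BF BE and EF BF BF cannot occur as part of any other character.
inline bool utf8ContainsFFFEorFFFF(std::string_view s) {
    for (std::size_t i = 0; i + 2 < s.size(); ++i) {
        const auto b0 = static_cast<unsigned char>(s[i]);
        const auto b1 = static_cast<unsigned char>(s[i + 1]);
        const auto b2 = static_cast<unsigned char>(s[i + 2]);
        if (b0 == 0xEF && b1 == 0xBF && (b2 == 0xBE || b2 == 0xBF)) {
            return true;
        }
    }
    return false;
}
```

A byte-oriented writer could call something like utf8ContainsFFFEorFFFF on its char buffer, while code that still has the scalar values at hand could use isXmlChar directly; which of the two fits better depends on where in the serializer the check would live.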


