On Wed, 2012-02-22 at 13:42 +0100, Stephan Bergmann wrote:
> So, if we ever wanted to extend the new facilities to also
> support UTF-8 string literals, but would want to keep the performance
> benefit for the ASCII-only case, we could not offer the same simple syntax
Sure. On the other hand, we have:
git grep RTL_.*ASCII | wc -l
45122
of this sort of thing that we know are ASCII-only, and relatively few
UTF-8 strings (none that I can think of offhand).
> And of course it would also work to syntactically optimize the ASCII
> case (as we would do now) and add the indirection only for the UTF-8
> case (at the expense of some ugly asymmetry).
I guess so. Of course, I like the idea of making UTF-8 a first-class
citizen in rtl::OUString-land - it would be nice not to worry so much
about odd-ball character encodings, and to be able to assume in most
places that all char *'s are UTF-8.
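
To make the asymmetry discussed above concrete, here is a minimal sketch, not the actual rtl::OUString API. The class name SketchString, the fromUtf8 function and the utf16 accessor are all hypothetical. A string literal goes through a constructor that only widens ASCII, while UTF-8 input has to go through a separately named function that does the decoding (and assumes well-formed input).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical string class for illustration; not the real rtl::OUString.
class SketchString {
public:
    // Fast path: a string literal that the caller promises is ASCII-only.
    // Each char is widened directly, with no decoding step.
    template <std::size_t N>
    SketchString(const char (&asciiLiteral)[N]) {
        data_.reserve(N - 1);
        for (std::size_t i = 0; i + 1 < N; ++i) {
            assert(static_cast<unsigned char>(asciiLiteral[i]) < 0x80);
            data_.push_back(static_cast<char16_t>(asciiLiteral[i]));
        }
    }

    // Indirection for the UTF-8 case: decode to UTF-16.
    // Assumption: the input is well-formed UTF-8; nothing is validated.
    static SketchString fromUtf8(const char* utf8) {
        SketchString result;
        const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8);
        while (*p != 0) {
            char32_t cp;
            if (*p < 0x80) {                    // 1 byte
                cp = *p;
                p += 1;
            } else if ((*p & 0xE0) == 0xC0) {   // 2 bytes
                cp = ((*p & 0x1F) << 6) | (p[1] & 0x3F);
                p += 2;
            } else if ((*p & 0xF0) == 0xE0) {   // 3 bytes
                cp = ((*p & 0x0F) << 12) | ((p[1] & 0x3F) << 6) | (p[2] & 0x3F);
                p += 3;
            } else {                            // 4 bytes
                cp = ((*p & 0x07) << 18) | ((p[1] & 0x3F) << 12)
                   | ((p[2] & 0x3F) << 6) | (p[3] & 0x3F);
                p += 4;
            }
            if (cp >= 0x10000) {
                // Outside the BMP: emit a UTF-16 surrogate pair.
                cp -= 0x10000;
                result.data_.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                result.data_.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            } else {
                result.data_.push_back(static_cast<char16_t>(cp));
            }
        }
        return result;
    }

    const std::u16string& utf16() const { return data_; }

private:
    SketchString() = default;
    std::u16string data_;
};
```

With this shape, SketchString s("abc") stays as terse as the ASCII case is today, and only UTF-8 callers pay for the extra name and the decoding loop.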
Of course, it would be even more wonderful if, with some
template-magic, we could generate static rtl_uString structures that
would end up in the .rodata section, and get heap-allocated only when
copied, with the UTF-8 -> UCS-2 conversion done during a (reasonable time)
compile ;-) but that's a pipe-dream I suspect.
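
For the ASCII-only subset, the idea of a compile-time generated static structure can be sketched roughly as follows. This assumes a C++14 or later compiler, which is newer than what was available for this discussion. StaticUString, makeStaticAscii and kStaticRefCount are hypothetical names for illustration and do not claim to match the real rtl_uString layout. Full UTF-8 conversion at compile time is not attempted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a static string structure: a reference count,
// a length, and an inline UTF-16 buffer. Not the real sal/rtl definition.
template <std::size_t N>
struct StaticUString {
    std::int32_t refCount;
    std::int32_t length;
    char16_t buffer[N]; // includes the terminating NUL
};

// Marker meaning "static, never freed" (an assumption of this sketch).
constexpr std::int32_t kStaticRefCount = 0x40000000;

// Widen an ASCII literal to UTF-16 code units at compile time.
template <std::size_t N>
constexpr StaticUString<N> makeStaticAscii(const char (&literal)[N]) {
    StaticUString<N> s{kStaticRefCount, static_cast<std::int32_t>(N - 1), {}};
    for (std::size_t i = 0; i < N; ++i) {
        // ASCII code points map one-to-one onto UTF-16 code units.
        s.buffer[i] = static_cast<char16_t>(static_cast<unsigned char>(literal[i]));
    }
    return s;
}

// Constant-initialized object; a toolchain can place it in read-only data,
// though that placement is up to the compiler and linker.
constexpr auto kHello = makeStaticAscii("hello");

static_assert(kHello.length == 5, "length excludes the terminating NUL");
static_assert(kHello.buffer[0] == u'h', "first code unit is the widened 'h'");
```

Because kHello is computed entirely by the compiler, no conversion or allocation happens at startup; a copy-on-write string class could then point at such an object until it needs to modify it.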
Perhaps better to slowly move entirely to UTF-8 strings anyway, which
brings us back to your proposal ;-) it is hard to see a real benefit in
UCS-2.
ATB,
Michael.
--
michael.meeks@suse.com <><, Pseudo Engineer, itinerant idiot