

On Sat, Feb 13, 2016 at 6:17 AM, Khaled Hosny <khaledhosny@eglug.org> wrote:
> I’m wondering if it is possible to rename sal_Unicode, which is actually
> an unsigned 16 bit integer and thus can’t fit any Unicode character

s/any/every/ ?

> , to
> some less confusing name like sal_Ucs2 or even just use sal_uInt16 (but
> please not sal_Utf16 which would give the illusion that surrogate pairs
> are handled in a special way which I don’t think is true).

Actually surrogate pairs are supported (or at least the intent is there :-) ).

include/rtl/character.hxx:inline bool isHighSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline bool isLowSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline sal_Unicode getHighSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline sal_Unicode getLowSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline sal_uInt32 combineSurrogates(sal_uInt32 high, sal_uInt32 low) {
etc...
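
For readers who do not have the tree at hand, here is a minimal standalone sketch of the arithmetic that helpers with these names would typically perform. It is not copied from include/rtl/character.hxx: the bodies are my own reconstruction from the UTF-16 definition, and CodeUnit is a hypothetical stand-in for sal_Unicode, assumed here to be a 16-bit unsigned integer.

```cpp
#include <cstdint>

// Hypothetical stand-in for sal_Unicode (assumption: one 16-bit code unit).
using CodeUnit = std::uint16_t;

// High (lead) surrogates occupy U+D800..U+DBFF.
inline bool isHighSurrogate(std::uint32_t code) {
    return code >= 0xD800 && code <= 0xDBFF;
}

// Low (trail) surrogates occupy U+DC00..U+DFFF.
inline bool isLowSurrogate(std::uint32_t code) {
    return code >= 0xDC00 && code <= 0xDFFF;
}

// For a code point above U+FFFF: the top 10 bits of (code - 0x10000).
inline CodeUnit getHighSurrogate(std::uint32_t code) {
    return static_cast<CodeUnit>(((code - 0x10000) >> 10) | 0xD800);
}

// For a code point above U+FFFF: the bottom 10 bits of (code - 0x10000).
inline CodeUnit getLowSurrogate(std::uint32_t code) {
    return static_cast<CodeUnit>(((code - 0x10000) & 0x3FF) | 0xDC00);
}

// Inverse of the two functions above; expects a valid high/low pair.
inline std::uint32_t combineSurrogates(std::uint32_t high, std::uint32_t low) {
    return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
}
```

With these definitions U+1F600 splits into the pair 0xD83D, 0xDE00, and combining that pair gives U+1F600 back. A single 16-bit unit can therefore not hold every code point, but a sequence of them can.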

So yeah, sal_Unicode is a UTF-16 code unit, and it is quite common that
UTF-16 is abusively called 'Unicode'.
'Unicode' itself does not denote any specific encoding structure,
hence the names UTF-8, UTF-16 and UTF-32, the latter two coming in BE
and LE flavours.
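
To make the BE/LE distinction concrete, here is a small sketch of how one 16-bit code unit is laid out as bytes in each flavour. The function names are hypothetical and exist only for this illustration; nothing here is LibreOffice API.

```cpp
#include <array>
#include <cstdint>

// Big-endian: most significant byte first.
inline std::array<std::uint8_t, 2> toUtf16BE(std::uint16_t unit) {
    return { static_cast<std::uint8_t>(unit >> 8),
             static_cast<std::uint8_t>(unit & 0xFF) };
}

// Little-endian: least significant byte first.
inline std::array<std::uint8_t, 2> toUtf16LE(std::uint16_t unit) {
    return { static_cast<std::uint8_t>(unit & 0xFF),
             static_cast<std::uint8_t>(unit >> 8) };
}
```

The code unit 0xD83D is the bytes D8 3D in UTF-16BE and 3D D8 in UTF-16LE. The distinction only matters once code units are written out as bytes, not while they sit in memory as 16-bit integers.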


I count only ~7000 usages across the code base, so that is not such a
huge task.
Internally it is doable, externally that is more of a problem, since
sal_Unicode is part of the stable external API.
The best you can do is to have an internal 'alias' for it.
It may indeed be useful, for more clarity, to have typedefs that are
explicit about things: sal_utf8, sal_utf16, sal_utf16be, sal_utf16le,
sal_utf32, sal_utf32be, sal_utf32le
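
A rough sketch of what such internal aliases could look like. None of these names exist today; the underlying types are my assumption (fixed-width integers from <cstdint>), and countCodePoints is a hypothetical helper added only to show the alias in use.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical aliases, one per code unit width. They only document
// intent; the compiler does not distinguish them from the raw types.
typedef std::uint8_t  sal_utf8;
typedef std::uint16_t sal_utf16;
typedef std::uint32_t sal_utf32;

// Hypothetical helper: count code points in a UTF-16 buffer, treating a
// valid high/low surrogate pair as one code point and any unpaired
// surrogate as one code point of its own.
inline std::size_t countCodePoints(const sal_utf16* s, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        bool high = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        if (high && i + 1 < n && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            ++i; // skip the low half of the pair
        }
        ++count;
    }
    return count;
}
```

Since a typedef is only an alias, the stable external API that uses sal_Unicode would be unaffected. The byte-order variants (sal_utf16be and friends) are left out of this sketch, because they would need to be distinct types to be of any real use.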

Norbert
