Michael mentioned to me that he had seen cases where
"RTL_CONSTASCII_STRINGPARAM" was accidentally used instead of
"RTL_CONSTASCII_USTRINGPARAM". Here's the problem...
#define RTL_CONSTASCII_STRINGPARAM( constAsciiStr ) \
    constAsciiStr, ((sal_Int32)sizeof(constAsciiStr)-1)
which turns into two arguments, and
#define RTL_CONSTASCII_USTRINGPARAM( constAsciiStr ) \
    constAsciiStr, ((sal_Int32)(sizeof(constAsciiStr)-1)), RTL_TEXTENCODING_ASCII_US
which turns into three arguments, the third one being the encoding.
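
To make the difference between the two expansions concrete, here is a minimal sketch that counts the arguments each macro produces. The typedefs, the value given to RTL_TEXTENCODING_ASCII_US and the countArgs helper are stand-ins assumed for this illustration so that it compiles without the real sal headers; only the two macro bodies are taken from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in typedefs (assumptions) so the sketch compiles without the sal headers.
typedef std::int32_t  sal_Int32;
typedef std::uint16_t rtl_TextEncoding;
// Placeholder value, assumed only for this illustration.
const rtl_TextEncoding RTL_TEXTENCODING_ASCII_US = 11;

#define RTL_CONSTASCII_STRINGPARAM( constAsciiStr ) \
    constAsciiStr, ((sal_Int32)sizeof(constAsciiStr)-1)
#define RTL_CONSTASCII_USTRINGPARAM( constAsciiStr ) \
    constAsciiStr, ((sal_Int32)(sizeof(constAsciiStr)-1)), RTL_TEXTENCODING_ASCII_US

// Hypothetical helper: reports how many arguments a call site passes.
template <typename... Args>
constexpr std::size_t countArgs(const Args&...) { return sizeof...(Args); }

// STRINGPARAM expands to (pointer, length).
static_assert(countArgs(RTL_CONSTASCII_STRINGPARAM("abc")) == 2,
              "pointer and length");
// USTRINGPARAM expands to (pointer, length, encoding).
static_assert(countArgs(RTL_CONSTASCII_USTRINGPARAM("abc")) == 3,
              "pointer, length and encoding");
```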
The problem arises when someone is using the old "String" class because
it has a load of constructors, two of which are...
UniString( const sal_Char* pByteStr,
           rtl_TextEncoding eTextEncoding,
           sal_uInt32 nCvtFlags = BYTESTRING_TO_UNISTRING_CVTFLAGS );
and
UniString( const sal_Char* pByteStr, xub_StrLen nLen,
           rtl_TextEncoding eTextEncoding,
           sal_uInt32 nCvtFlags = BYTESTRING_TO_UNISTRING_CVTFLAGS );
So if someone uses RTL_CONSTASCII_USTRINGPARAM the right thing happens
and the better fitting second ctor is selected, char*, len, encoding all
filled in correctly.
On the other hand if someone uses RTL_CONSTASCII_STRINGPARAM, then the
better fitting first ctor is selected and the *ENCODING* argument is
filled in with the length of the string. Puke.
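
The overload selection described above can be reproduced with a small mock. This is only a sketch: MockUniString is a hypothetical class that copies the two constructor signatures and records which one ran, and the typedefs and constant values are assumptions made so the example is self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in typedefs and constants (assumptions for this sketch).
typedef char          sal_Char;
typedef std::int32_t  sal_Int32;
typedef std::uint32_t sal_uInt32;
typedef std::uint16_t xub_StrLen;
typedef std::uint16_t rtl_TextEncoding;
const rtl_TextEncoding RTL_TEXTENCODING_ASCII_US = 11;   // placeholder value
const sal_uInt32 BYTESTRING_TO_UNISTRING_CVTFLAGS = 0;   // placeholder value

#define RTL_CONSTASCII_STRINGPARAM( constAsciiStr ) \
    constAsciiStr, ((sal_Int32)sizeof(constAsciiStr)-1)
#define RTL_CONSTASCII_USTRINGPARAM( constAsciiStr ) \
    constAsciiStr, ((sal_Int32)(sizeof(constAsciiStr)-1)), RTL_TEXTENCODING_ASCII_US

// Hypothetical mock: same parameter lists as the two UniString ctors,
// but it only records which ctor was chosen and what it received.
struct MockUniString {
    int              ctorUsed;   // 1 = (ptr, encoding), 2 = (ptr, len, encoding)
    rtl_TextEncoding encoding;
    xub_StrLen       length;

    MockUniString(const sal_Char*, rtl_TextEncoding eTextEncoding,
                  sal_uInt32 = BYTESTRING_TO_UNISTRING_CVTFLAGS)
        : ctorUsed(1), encoding(eTextEncoding), length(0) {}

    MockUniString(const sal_Char*, xub_StrLen nLen,
                  rtl_TextEncoding eTextEncoding,
                  sal_uInt32 = BYTESTRING_TO_UNISTRING_CVTFLAGS)
        : ctorUsed(2), encoding(eTextEncoding), length(nLen) {}
};

// Two arguments: the sal_Int32 length is silently converted to the encoding.
inline MockUniString makeWrong() {
    return MockUniString(RTL_CONSTASCII_STRINGPARAM("hello"));
}

// Three arguments: pointer, length and encoding land where they belong.
inline MockUniString makeRight() {
    return MockUniString(RTL_CONSTASCII_USTRINGPARAM("hello"));
}
```

In this mock, makeWrong() ends up with the string length sitting in the encoding slot, which is exactly the silent failure described above.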
I should note that none of these errors are new or have been introduced
recently, but have been lurking in the code for quite a while.
Attached is a patch to the String class which adds a higher-priority
constructor that exactly matches the output of
RTL_CONSTASCII_STRINGPARAM but marks it private. This detects the misuse
at compile time, which should make it impossible for this to happen
again, while retaining any correct uses of the first constructor where a
real 16-bit rtl_TextEncoding is passed, and not an implicitly narrowed
sal_Int32.
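
The effect of the private overload can be checked without triggering an actual compile error by asking std::is_constructible, which is a technique swapped in here for illustration and not part of the patch. GuardedString is a hypothetical mock that carries only the constructor signatures; the typedefs are assumptions, and the check relies on sal_Int32 being plain int, so that a 16-bit rtl_TextEncoding argument still prefers the public constructor.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Stand-in typedefs (assumptions for this sketch).
typedef char          sal_Char;
typedef std::int32_t  sal_Int32;
typedef std::uint32_t sal_uInt32;
typedef std::uint16_t xub_StrLen;
typedef std::uint16_t rtl_TextEncoding;

// Hypothetical mock of the patched class; only the signatures matter.
class GuardedString {
private:
    // Declared but never defined: an exact match for (pointer, sal_Int32),
    // i.e. for what RTL_CONSTASCII_STRINGPARAM expands to.
    GuardedString(const sal_Char*, sal_Int32);
public:
    GuardedString(const sal_Char*, rtl_TextEncoding, sal_uInt32 = 0) {}
    GuardedString(const sal_Char*, xub_StrLen, rtl_TextEncoding,
                  sal_uInt32 = 0) {}
};

// (pointer, sal_Int32) picks the private ctor, so outside code is rejected.
static_assert(
    !std::is_constructible<GuardedString, const sal_Char*, sal_Int32>::value,
    "STRINGPARAM-style call must not compile");

// (pointer, real 16-bit encoding) still reaches the public two-arg ctor.
static_assert(
    std::is_constructible<GuardedString, const sal_Char*,
                          rtl_TextEncoding>::value,
    "genuine encoding argument must still compile");

// (pointer, length, encoding) is unaffected by the guard.
static_assert(
    std::is_constructible<GuardedString, const sal_Char*, sal_Int32,
                          rtl_TextEncoding>::value,
    "USTRINGPARAM-style call must still compile");
```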
I've applied this patch, and fixed a big pile of incorrect code that
falls out of it. What I will have missed is code which is only compiled
for non-Linux platforms, e.g. MacOSX or Windows specific code. So if you
get a compile failure about something being private in the String class
on those platforms the fix is probably trivially changing a STRINGPARAM
to USTRINGPARAM, otherwise let me know and I'll have a look.
C.
diff --git a/tools/inc/tools/string.hxx b/tools/inc/tools/string.hxx
index cb6b455..25cb2d4 100644
--- a/tools/inc/tools/string.hxx
+++ b/tools/inc/tools/string.hxx
@@ -474,6 +474,8 @@ private:
     void operator +=(int); // not implemented; to detect misuses
                            // of operator +=(sal_Unicode)
+    //detect and reject use of RTL_CONSTASCII_STRINGPARAM instead of RTL_CONSTASCII_USTRINGPARAM
+    TOOLS_DLLPRIVATE UniString( const sal_Char*, sal_Int32 );
 public:
     UniString();
     UniString( const ResId& rResId );
Context
- [Libreoffice] RTL_CONSTASCII_STRINGPARAM vs RTL_CONSTASCII_USTRINGPARAM and String/UniString · Caolán McNamara