

On 10/01/2012 01:02 PM, Noel Grandin wrote:
> That was something I was thinking about the other day - given that the
> bulk of our strings are pure 7-bit ASCII, it might be a worthwhile
> optimisation to store a bit that says "this string is 7-bit ASCII", and
> then store the string as a sequence of bytes.

cf. <https://wiki.documentfoundation.org/Development/LibreOffice4#General_changes_2>: "replace rtl::OUString with a UTF-8 string for better space efficiency, and Unicode coverage."
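For illustration, the flag-plus-bytes idea quoted above could be sketched roughly as follows. This is only a minimal sketch under assumptions: `CompactString` and all of its members are hypothetical names invented here, not part of rtl::OUString or any existing LibreOffice API, and a real implementation would also need to handle sharing, mutation, and conversion costs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical illustration only; not rtl::OUString or any LibreOffice class.
class CompactString {
public:
    explicit CompactString(const std::u16string& s)
        : ascii_(true), length_(s.size()) {
        // Decide once, at construction, whether every code unit is 7-bit ASCII.
        for (char16_t c : s) {
            if (c > 0x7F) { ascii_ = false; break; }
        }
        if (ascii_) {
            // One byte per character.
            bytes_.reserve(s.size());
            for (char16_t c : s)
                bytes_.push_back(static_cast<std::uint8_t>(c));
        } else {
            // Fall back to two bytes per UTF-16 code unit (low byte first).
            bytes_.reserve(s.size() * 2);
            for (char16_t c : s) {
                bytes_.push_back(static_cast<std::uint8_t>(c & 0xFF));
                bytes_.push_back(static_cast<std::uint8_t>(c >> 8));
            }
        }
    }

    bool isAscii() const { return ascii_; }
    std::size_t length() const { return length_; }             // in UTF-16 code units
    std::size_t storageBytes() const { return bytes_.size(); }  // payload size only

    // Callers still see 16-bit code units, whatever the internal layout is.
    char16_t at(std::size_t i) const {
        if (ascii_)
            return bytes_[i];
        return static_cast<char16_t>(bytes_[2 * i] | (bytes_[2 * i + 1] << 8));
    }

private:
    bool ascii_;
    std::size_t length_;
    std::vector<std::uint8_t> bytes_;
};
```

With this layout a pure-ASCII string of n characters needs n payload bytes instead of 2n, while indexing stays constant-time in both modes. A UTF-8 representation, as proposed on the wiki page, gives the same saving for ASCII but makes indexing by code unit non-trivial for other text.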

> The latest Java VM does this trick internally - it pretends that String
> is stored with an array of 16-bit values, but actually it stores them as
> UTF-8.

Java's modified UTF-8, presumably.  (Me the nitpicker :)
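To make the nitpick concrete: modified UTF-8 differs from standard UTF-8 in two ways. U+0000 is written as the two bytes C0 80 rather than a single zero byte, and characters outside the BMP are written as two separately encoded surrogates (3 bytes each) rather than one 4-byte sequence. A minimal sketch of such an encoder, where `toModifiedUtf8` is a hypothetical helper name chosen here and not a Java or LibreOffice function:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encodes UTF-16 code units as modified UTF-8.
// Each code unit is encoded on its own, so surrogates are never combined.
std::vector<std::uint8_t> toModifiedUtf8(const std::u16string& s) {
    std::vector<std::uint8_t> out;
    for (char16_t c : s) {
        if (c >= 0x0001 && c <= 0x007F) {
            // Plain ASCII except NUL: one byte.
            out.push_back(static_cast<std::uint8_t>(c));
        } else if (c <= 0x07FF) {
            // U+0000 and U+0080..U+07FF: two bytes (NUL becomes C0 80).
            out.push_back(static_cast<std::uint8_t>(0xC0 | (c >> 6)));
            out.push_back(static_cast<std::uint8_t>(0x80 | (c & 0x3F)));
        } else {
            // Everything else, including each surrogate half: three bytes.
            out.push_back(static_cast<std::uint8_t>(0xE0 | (c >> 12)));
            out.push_back(static_cast<std::uint8_t>(0x80 | ((c >> 6) & 0x3F)));
            out.push_back(static_cast<std::uint8_t>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

A supplementary character therefore takes 6 bytes in this encoding instead of the 4 bytes standard UTF-8 would use, and the output never contains an embedded zero byte.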

Stephan


