

On 01/20/2017 03:25 AM, Takeshi Abe wrote:
While preparing a patch for tdf#105382 [1], I came across a question about
character encoding for the path part of a URL representing a
com.sun.star.frame.XStorable's location.
I wonder if the original (before percent-encoded) path of such a URL can
be in an encoding other than UTF-8 or even in a different charset due
to e.g. a code page of some legacy filesystems.
Is it possible?
And, if so, is there any reasonable way to tell the encoding?

A conforming URL itself, by definition, is written using only a subset of ASCII characters.

For file URLs, there never was a definition of how to interpret the octets encoded in the URL's path component, so OOo/LO came up with the convention of always interpreting them as UTF-8. (So any code that converts between file URLs and native pathnames needs to map between UTF-8 and the relevant native pathname encoding, which LO assumes to be the one reported by osl_getThreadTextEncoding.)
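
To illustrate that mapping, here is a rough Python sketch (not LO's actual code). The function names are hypothetical, and the native_encoding parameter merely stands in for whatever osl_getThreadTextEncoding would report; ISO-8859-1 in the usage note is just an assumed example of a legacy encoding.

```python
from urllib.parse import quote, unquote_to_bytes


def file_url_path_to_native_bytes(url_path: str, native_encoding: str) -> bytes:
    """Hypothetical helper: file URL path -> native pathname bytes."""
    # Percent-decoding yields raw octets, which by the OOo/LO
    # convention are always interpreted as UTF-8.
    text = unquote_to_bytes(url_path).decode("utf-8")
    # Re-encode the characters in the (assumed) native pathname encoding.
    return text.encode(native_encoding)


def native_bytes_to_file_url_path(native_path: bytes, native_encoding: str) -> str:
    """Hypothetical helper: native pathname bytes -> file URL path."""
    # Decode with the native encoding first, then percent-encode
    # the UTF-8 octets of the resulting text.
    text = native_path.decode(native_encoding)
    return quote(text.encode("utf-8"), safe="/")
```

For example, with an assumed native encoding of ISO-8859-1, the URL path /tmp/caf%C3%A9.odt would map to the native bytes b"/tmp/caf\xe9.odt", and back again. A pathname containing octets that are not valid in the native encoding, or characters the native encoding cannot represent, would raise an error in this sketch.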

