Re: Possible extensions to OUString class

Matteo Casalin <matteo.casalin -AT- libreoffice.it>
Thu, 31 Jan 2019 08:04:49 +0100

Hi Stephan,

On 1/30/19 10:40 PM, Stephan Bergmann wrote:

> On 30/01/2019 22:17, Matteo Casalin wrote:
>> I'm working on improving code that calls getToken (e.g. using its version with index, or using other OUString functions in its place when possible). One thing that I noticed is that there are a lot of calls in the form getToken().toInt# which require memory management just to obtain a value that could be generated by the original OUString. Similarly (but less frequently), some tokens are extracted just to compare them against a string, which again requires memory management that is really not needed.
>> I was wondering if extending O(U)String with functions like:
>>
>> * getTokenAs[U]Int#(token, sep, index)
>> * matchToken(token, sep, index, string)
>>
>> would be accepted/appreciated or not. At the moment I already submitted to gerrit a patch [1] which adds comphelper::string::matchToken but I think that adding such functionality to OUString directly would be nicer. Also, introducing getTokenAsInt in OUString would likely allow to reuse its toInt code.
>
> Sounds a bit too special-purpose to be worth adding, IMO. Would those optimizations really make a measurable difference?

I don't have real numbers to provide, but a very rough check on getToken provides the following numbers:


git grep -w getToken > getToken.txt
grep -wc getToken getToken.txt ==> 1646
grep -wc toInt32 getToken.txt ==> 218
grep -wc toInt64 getToken.txt ==> 8
grep -wc toUInt32 getToken.txt ==> 0
grep -wc toUInt64 getToken.txt ==> 8

The number of getToken occurrences is higher than the number of real OUString::getToken calls (it includes comments, header files, definitions and also non-OUString getToken functions), and I am missing places in which the conversion to integer is done on a following line. As a result, this pattern accounts for more than 14.2% of all getToken occurrences (234 of 1646). I cannot say whether this is frequently called code or not.

About matchToken, this seems to be a much less frequent pattern, and at the moment the comphelper function is a viable solution, so I would go this way (and will take care of reviewing some older getToken optimizations that I implemented).
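
To make the comparison case concrete, here is a minimal sketch of matching a token in place, without extracting it into a new string first. It is written against std::string_view rather than OUString so that it is self-contained, and the name tokenEquals and its signature are hypothetical: this is not the comphelper::string::matchToken from the patch. Note one assumed difference from getToken(...) == string: a missing token does not match here, not even against an empty string.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (illustration only, not LibreOffice API):
// returns true if the token-th (0-based) field of s, split on sep,
// equals expected. No temporary string is created.
inline bool tokenEquals(std::string_view s, int token, char sep,
                        std::string_view expected)
{
    assert(token >= 0);
    // Skip the fields that precede the requested one.
    for (; token > 0; --token)
    {
        const std::size_t end = s.find(sep);
        if (end == std::string_view::npos)
            return false; // fewer than token + 1 fields
        s.remove_prefix(end + 1);
    }
    // s now starts at the requested field; compare up to the next separator.
    return s.substr(0, s.find(sep)) == expected;
}
```

A call such as tokenEquals("a;b;c", 1, ';', "b") would replace the pattern of extracting token 1 and then comparing it.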

Also, a better approach overall would probably be some string_view-based getToken functionality (converting from an OUString to a string_view is cheap), and then string_view-based toInt etc. functions.
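
A rough sketch of what such a string_view-based pair could look like follows. It uses plain std::string_view and std::from_chars (C++17) instead of OUString, and both function names are hypothetical, so take it as an illustration of the idea rather than a proposed API. The parsing is also stricter than OUString::toInt32 is assumed to be: std::from_chars accepts neither leading whitespace nor a leading '+'.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper (illustration only, not LibreOffice API):
// returns a view of the token-th (0-based) field of s, split on sep,
// without copying. Returns an empty view if there are too few fields.
inline std::string_view getTokenView(std::string_view s, int token, char sep)
{
    assert(token >= 0);
    for (;;)
    {
        const std::size_t end = s.find(sep);
        if (token == 0)
            return s.substr(0, end); // npos means "up to the end of s"
        if (end == std::string_view::npos)
            return {};
        s.remove_prefix(end + 1);
        --token;
    }
}

// Hypothetical helper: parses the selected token as a decimal integer.
// Returns 0 if the token is missing, does not start with a number,
// or is out of range for a 32-bit signed integer.
inline std::int32_t getTokenAsInt32(std::string_view s, int token, char sep)
{
    const std::string_view t = getTokenView(s, token, sep);
    std::int32_t value = 0;
    const auto res = std::from_chars(t.data(), t.data() + t.size(), value);
    return res.ec == std::errc() ? value : 0;
}
```

With these, a call of the form getToken(2, ';').toInt32() would become getTokenAsInt32(s, 2, ';'), and no intermediate string is allocated.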

At the moment I plan to just go through all of getToken uses and do some minor local optimizations, then I might have a look at the string_view approach (unless previous numbers make the OUString one look not too specialised).


Many thanks for your comments
Kind regards
Matteo
