Hi Stephan,
On 1/30/19 10:40 PM, Stephan Bergmann wrote:
> On 30/01/2019 22:17, Matteo Casalin wrote:
>> I'm working on improving code that calls getToken (e.g. using its
>> version with index, or using other OUString functions in its place
>> when possible).
>> One thing that I noticed is that there are a lot of calls in the form
>> getToken().toInt# which require memory management just to obtain a
>> value that could be generated by the original OUString. Similarly (but
>> less frequently), some tokens are extracted just to compare them
>> against a string, which again requires memory management that is
>> really not needed.
>> I was wondering if extending O(U)String with functions like:
>> * getTokenAs[U]Int#(token, sep, index)
>> * matchToken(token, sep, index, string)
>> would be accepted/appreciated or not. At the moment I already
>> submitted to gerrit a patch [1] which adds
>> comphelper::string::matchToken but I think that adding such
>> functionality to OUString directly would be nicer. Also, introducing
>> getTokenAsInt in OUString would likely allow to reuse its toInt code.
>
> Sounds a bit too special-purpose to be worth adding, IMO.  Would those
> optimizations really make a measurable difference?
I don't have real numbers to provide, but a very rough check on getToken 
provides the following numbers:
git grep -w getToken > getToken.txt
grep -wc getToken getToken.txt ==> 1646
grep -wc toInt32 getToken.txt ==> 218
grep -wc toInt64 getToken.txt ==> 8
grep -wc toUInt32 getToken.txt ==> 0
grep -wc toUInt64 getToken.txt ==> 8
The number of getToken occurrences is higher than the number of real
OUString::getToken calls (it also counts comments, header files,
definitions and getToken functions that are not OUString's), and I am
missing places in which the conversion to integer is done on a following
line. Taken together, this pattern accounts for at least 14.2% (234 of
1646) of all getToken occurrences. I cannot say whether this is
frequently called code or not.
About matchToken, this seems to be a much less frequent pattern, and at
the moment the comphelper approach is viable, so I would go this way
(and will take care of reviewing some older getToken optimizations that
I implemented).
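As a rough illustration of the matchToken idea (comparing a token in place instead of extracting it first), here is a minimal standalone sketch. It is written against std::string_view (C++17) rather than OUString, and the signature, the 0-based token numbering and the behaviour for a missing token are all assumptions made for the example; it is not the code of the gerrit patch.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the proposed matchToken (not LibreOffice API):
// true if field number `token` (0-based) of `text`, split on `sep`,
// equals `expected`. No temporary string is created.
// Assumption: a token index past the last field never matches.
constexpr bool matchToken(std::string_view text, std::size_t token, char sep,
                          std::string_view expected)
{
    std::size_t start = 0;
    for (; token > 0; --token)
    {
        const std::size_t next = text.find(sep, start);
        if (next == std::string_view::npos)
            return false; // fewer fields than requested
        start = next + 1;
    }
    const std::size_t end = text.find(sep, start);
    const std::size_t len
        = (end == std::string_view::npos ? text.size() : end) - start;
    // Compare the field in place against the expected string.
    return text.compare(start, len, expected) == 0;
}

// Usage, checked at compile time.
static_assert(matchToken("a;bb;c", 1, ';', "bb"));
static_assert(!matchToken("a;bb;c", 1, ';', "b"));
static_assert(!matchToken("a;bb;c", 3, ';', ""));
```

Note that getToken returns an empty string for an out-of-range token, so a real implementation would have to decide whether a missing token should match an empty string; the sketch above chooses not to.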
> Also, a better approach overall would probably be some string_view-based
> getToken functionality (converting from an OUString to a string_view is
> cheap), and then string_view-based toInt etc. functions.
At the moment I plan to just go through all of the getToken uses and do
some minor local optimizations; then I might have a look at the
string_view approach (unless the numbers above make the OUString one
look less special-purpose).
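For reference, the string_view-based approach mentioned above could look roughly like the following sketch. It uses only the C++17 standard library (std::string_view and std::from_chars) on narrow strings; getTokenView, toInt32 and usageExample are hypothetical names chosen for the example, and the parsing rules are those of std::from_chars (no leading whitespace or '+' accepted), which may differ from OUString::toInt32.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper (not LibreOffice API): field number `token` (0-based)
// of `text` split on `sep`, returned as a view into `text`.
// Returns an empty view when `text` has too few fields.
constexpr std::string_view getTokenView(std::string_view text,
                                        std::size_t token, char sep)
{
    std::size_t start = 0;
    for (; token > 0; --token)
    {
        const std::size_t next = text.find(sep, start);
        if (next == std::string_view::npos)
            return {};
        start = next + 1;
    }
    const std::size_t end = text.find(sep, start);
    // substr clamps the count, so npos simply means "up to the end".
    return text.substr(start,
                       end == std::string_view::npos ? end : end - start);
}

// Hypothetical helper: parses a decimal integer from the start of `s`,
// returning 0 when no number can be parsed or it does not fit.
inline std::int32_t toInt32(std::string_view s)
{
    std::int32_t value = 0;
    const auto result = std::from_chars(s.data(), s.data() + s.size(), value);
    return result.ec == std::errc() ? value : 0;
}

// Usage: the equivalent of getToken(n, ';').toInt32() without a
// temporary string.
inline void usageExample()
{
    const std::string_view list = "10;-20;abc";
    assert(toInt32(getTokenView(list, 1, ';')) == -20);
    assert(toInt32(getTokenView(list, 2, ';')) == 0); // not a number
    assert(toInt32(getTokenView(list, 5, ';')) == 0); // no such field
}
```

Both helpers only look at the original buffer, so neither the token extraction nor the conversion allocates memory.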
Many thanks for your comments
Kind regards
Matteo