Re: [Libreoffice] [Crazy Ideas] Discuss "Replace regexp parser with std library"

Thorsten Behrens <thb -AT- documentfoundation.org>
Mon, 29 Nov 2010 14:22:55 +0100

Joe Smith wrote:

I've looked at the code a bit, and it seems like there is indeed only one point
of contact with the rest of the suite, textsearch.cxx, which handles all types
of text searches (normal, regexp & fuzzy), and calls Regexpr::re_search(), which
calls re_match2() to run the actual regexp match.

So the structure makes it easy to replace the regexp code in one place.

Unfortunately, the way the functions work does not match well with the Boost RE
classes, although I'm sure it would be possible with an interface layer.

For example, the Boost engine handles locale-specific issues internally, whereas
OOo's engine knows almost nothing about character case or multi-character
sequences. Instead, it preps the text to be searched by running it through a
filter. I don't understand the i18n & character encoding issues well enough to
guess what that filter is actually doing or how it should be handled.

Hi Joe,

hm - then I think a combination of those two approaches might be a
winning strategy - LibO uses ICU for all that nifty transliteration
stuff and whatnot.

I notice that newer Boost versions also optionally support ICU;
maybe that already gives us good enough coverage. I'd be tempted to
just give it a whirl and add it as an optional, experimental
feature for people to play with.
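
To make the "combination" idea a bit more concrete, here is a rough sketch of what such an interface layer could look like: prep the text with a filter, then hand it to a library regex engine. Everything in it is an assumption for illustration only. std::regex stands in for Boost.Regex, prepText() is a plain ASCII lower-casing placeholder rather than the real transliteration filter, and the names prepText and searchFiltered are made up; none of this is taken from textsearch.cxx.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical stand-in for the transliteration filter: ASCII-only
// lower-casing. A real filter would go through ICU and could change
// the length of the text, which this sketch does not handle.
std::string prepText(const std::string& text)
{
    std::string out(text);
    for (char& c : out)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return out;
}

// Hypothetical interface layer: filter the text, run the library engine,
// and return the [start, end) offsets of the first match, or nullopt.
// The pattern is used as given (assumed to be written in lower case),
// because blindly lower-casing it would corrupt escapes such as \W.
std::optional<std::pair<std::size_t, std::size_t>>
searchFiltered(const std::string& text, const std::string& pattern)
{
    const std::string filtered = prepText(text);

    // Offsets in the filtered text are only valid for the original text
    // because this placeholder filter preserves length.
    assert(filtered.size() == text.size());

    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(filtered, m, re))
        return std::nullopt;

    const auto start = static_cast<std::size_t>(m.position(0));
    return std::make_pair(start, start + static_cast<std::size_t>(m.length(0)));
}
```

Calling searchFiltered("Hello WORLD", "wor[a-z]+") would match the upper-case word through the filter. The hard part this skips is mapping offsets back to the original text once the filter is allowed to change the text length.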

Cheers,

-- Thorsten


