Anyone interested in discussing this "crazy idea"?
Replace home-grown regexp parser with some std library
http://wiki.documentfoundation.org/Development/Crazy_Ideas#Replace_home-grown_regexp_parser_with_some_std_library
I've been thinking about this since I found my first bug in OOo's oddball regex
engine two years ago. Even though it's a feature for die-hard geeks, I would
love to see OOo's quirky, complicated and non-standard regex engine replaced
with something solid, standard and externally supported.
I've looked at the code a bit, and there does seem to be only one point of
contact with the rest of the suite: textsearch.cxx, which handles all types of
text search (normal, regexp and fuzzy). It calls Regexpr::re_search(), which in
turn calls re_match2() to run the actual regexp match.
So the structure makes it easy to replace the regexp code in one place.
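As a rough illustration of what a single entry point could look like, here is a
minimal sketch. It is not OOo code: searchForward and SearchResult are made-up
names, the offset-based signature is my assumption about what a caller would
want, and std::regex is only a stand-in for whichever library gets chosen.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: offsets of the match within the searched text.
struct SearchResult {
    bool found;
    std::size_t start;  // offset of the first matched character
    std::size_t end;    // one past the last matched character
};

// Hypothetical adapter: search text[from, to) for the first match of pattern.
// An invalid range or an invalid pattern is reported as "not found".
SearchResult searchForward(const std::string& text,
                           const std::string& pattern,
                           std::size_t from,
                           std::size_t to)
{
    const SearchResult none = {false, 0, 0};
    if (from > to || to > text.size())
        return none;
    try {
        const std::regex re(pattern);
        std::smatch m;
        const std::string::const_iterator first = text.begin() + from;
        const std::string::const_iterator last = text.begin() + to;
        if (!std::regex_search(first, last, m, re))
            return none;
        // m.position(0) is relative to `first`, so shift it back by `from`.
        const std::size_t start = from + static_cast<std::size_t>(m.position(0));
        const SearchResult hit = {true, start,
                                  start + static_cast<std::size_t>(m.length(0))};
        return hit;
    } catch (const std::regex_error&) {
        return none;
    }
}
```

The point of the sketch is only that the caller sees offsets in and offsets out,
so the engine behind the function could be swapped without touching the caller.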
Unfortunately, the way re_search() and re_match2() work does not map well onto
the Boost regex classes, although I'm sure an interface layer could bridge the
gap.
For example, the Boost engine handles locale-specific issues internally, whereas
OOo's engine knows almost nothing about character case or multi-character
sequences. Instead, it preps the text to be searched by running it through a
filter. I don't understand the i18n & character encoding issues well enough to
guess what that filter is actually doing or how it should be handled.
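To make the difference concrete, here is a small sketch of the two approaches
for the simplest case, ignoring letter case. It is an assumption on my part
that the filter does something along these lines; foldAscii, matchPreFiltered
and matchEngineIcase are made-up names, the folding is ASCII-only, and
std::regex again stands in for the real library.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>

// Hypothetical stand-in for a pre-filter: lower-case ASCII letters only.
std::string foldAscii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Approach A: filter the text first, then run a case-sensitive match.
// The pattern is assumed to be written in lower case already.
bool matchPreFiltered(const std::string& text, const std::string& pattern)
{
    const std::string folded = foldAscii(text);
    const std::regex re(pattern);
    return std::regex_search(folded, re);
}

// Approach B: leave the text alone and let the engine ignore case.
bool matchEngineIcase(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern, std::regex::icase);
    return std::regex_search(text, re);
}
```

With approach B the filter step disappears for this simple case, but whether a
library's own case handling covers everything the existing filter does is
exactly the part I can't judge.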
That's as far as I've gotten, although I have some ideas for some prototype
code. I'd love to get some input from someone more experienced with OOo's code,
or even to discuss how the regexp support fits at the application level.
Joe
Context
- [Libreoffice] [Crazy Ideas] Discuss "Replace regexp parser with std library" · Joe Smith