Hi,
I have been loosely looking at LibreOffice, and last night I thought I'd
take a look at an easy hack. I chose bug 41738.
The problem seems to be in the regular expression compilation code,
specifically the handling of '[:' character classes in combination with
escapes.
In gdb, I can see that with the string described in the bug ( foo[^
\[:alpha:\]] ) execution never leaves the for(;;) loop at
reclass.cxx:1148.
We get into
1189 else if (c == (sal_Unicode)':' && p[-2] == (sal_Unicode)'[') {
and as the closing ']' is escaped (it should be, shouldn't it? What
does the standard say about this?), the ':]' doesn't match and p is
reset to p1 on line 1257.
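To make the failed match concrete, here is a minimal sketch of a scan for the closing ':]' of a class name. It is not the code from reclass.cxx: scan_class_name is a hypothetical helper, and it uses std::string rather than sal_Unicode buffers. It only shows why an escaped ']' stops the terminator from being recognised.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper, not taken from reclass.cxx: reads a class name that
// starts at `pos` (right after "[:") and must be terminated by ":]".
// Returns the name on success, or std::nullopt if the terminator is missing.
std::optional<std::string> scan_class_name(const std::string& pat, std::size_t pos) {
    std::string name;
    // Collect characters up to the first ':' or ']'.
    while (pos < pat.size() && pat[pos] != ':' && pat[pos] != ']') {
        name.push_back(pat[pos++]);
    }
    // The terminator must be exactly ':' immediately followed by ']'.
    if (pos + 1 < pat.size() && pat[pos] == ':' && pat[pos + 1] == ']') {
        return name;
    }
    return std::nullopt;
}
```

With the input "alpha:]]" the helper returns "alpha". With "alpha:\]]" the character after ':' is a backslash, so it returns no name, which corresponds to the case where p is reset.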
If it is legal to have an escape inside the character class, then it
seems one would have to do something about the loop around
reclass.cxx:1202.
In any case, it would seem wise to not allow the infinite loop for a
malformed pattern; this could be done by incrementing p1 after p has
been reset to it the first time, as we will not reset to it again in a
valid pattern.
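The forward-progress idea can be sketched with a toy scanner. This is an illustration under my own assumptions, not the LibreOffice implementation: count_literals, the advance_after_rewind flag, and the step budget are all made up for the demo. The point is only that a rewind which does not also move past the rewind point leaves the loop re-reading the same '[:' forever.

```cpp
#include <cstddef>
#include <string>

// Toy scanner, not the reclass.cxx code: walks the body of a bracket
// expression and counts the characters it ends up treating as literals.
// Returns -1 if the step budget runs out, i.e. the loop stopped advancing.
int count_literals(const std::string& body, bool advance_after_rewind) {
    std::size_t p = 0;
    int literals = 0;
    int budget = 1000;  // safety cap so the demo itself cannot hang
    while (p < body.size()) {
        if (--budget < 0) return -1;
        if (body[p] == '[' && p + 1 < body.size() && body[p + 1] == ':') {
            const std::size_t p1 = p;  // rewind point
            std::size_t q = p + 2;
            while (q < body.size() && body[q] != ':' && body[q] != ']') ++q;
            if (q + 1 < body.size() && body[q] == ':' && body[q + 1] == ']') {
                p = q + 2;  // well-formed class such as [:alpha:], skip it
                continue;
            }
            p = p1;  // malformed: rewind to where the class seemed to start
            if (advance_after_rewind) {
                literals += 2;  // treat "[:" as two literal characters
                p += 2;         // and move past them so we never rewind here again
            }
            continue;
        }
        ++literals;
        ++p;
    }
    return literals;
}
```

Calling count_literals("[:alpha:\\]", false) exhausts the budget and returns -1, while the same input with advance_after_rewind set to true terminates and reports 10 literal characters.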
- Karl
--- a/regexp/source/reclass.cxx
+++ b/regexp/source/reclass.cxx
@@ -1255,6 +1255,7 @@ Regexpr::regex_compile()
break;
} else {
p = p1+1;
+ p1 ++;
last_char = (sal_Unicode)':';
set_list_bit(last_char, b);
}
Context
- [Libreoffice] Easy hack regex compile infinite loop ( bug 41738 ) · Karl Koehler