Hi guys,
Over in LibreOffice land, we're transitioning our whole build to use
gnumake - with the goal of having a single gnumake instance able to
re-build the (many thousands of) files we have, and to store and act on
all of the dependencies.
Anyhow - one problem we are seeing is that as we load and parse the
~50MB of dependencies that we need (for part of writer), we stat the
same files involved in dependencies sometimes a thousand times or so.
We do around 700k stats with lots of duplication.
Nearly all of these come from calling 'glob'; I append a patch that
tries to call glob only if needed - it could be done more prettily:
+ if (nlist != &name)
is not the nicest thing in the world, but I didn't want a big
indentation change. Timings for a make -sr with nothing to do are:
          before      after
real    0m5.795s   0m2.634s
user    0m3.513s   0m2.526s
sys     0m2.274s   0m0.101s
Which is a worthwhile saving, at least in our use case, though
naturally, being spectacularly incompetent, it is probably a
side-effect of me breaking everything ;-) Having said that, the
dependency rules (at least) appear to continue to work nicely when I
test with manual touching, and 'make check' passes ...
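For anyone who wants to play with the idea outside of make, here is a
minimal standalone sketch (not part of the patch). The helper names
has_glob_chars and count_matches are made up for illustration, and it
ignores make's archive handling, tilde expansion and GLOB_ALTDIRFUNC. It
only shows the shape of the shortcut: scan the name for '?', '*' or '['
first, and do at most one stat when there is nothing to expand.

```c
#include <assert.h>
#include <glob.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if NAME contains a glob metacharacter ('?', '*' or '['),
   0 otherwise.  Same test as need_to_glob in the patch, via strpbrk.  */
static int
has_glob_chars (const char *name)
{
  return strpbrk (name, "?*[") != NULL;
}

/* Hypothetical helper: return how many names NAME expands to.
   A name without metacharacters is taken literally and never reaches
   glob; it is only checked with stat when MUST_EXIST is non-zero.
   Simplification: a pattern that matches nothing counts as 0 here.  */
static int
count_matches (const char *name, int must_exist)
{
  glob_t gl;
  int n;

  assert (name != NULL);

  if (!has_glob_chars (name))
    {
      struct stat st;

      /* Fast path: at most one stat, and none if existence
         does not matter to the caller.  */
      if (must_exist && stat (name, &st) != 0)
        return 0;
      return 1;
    }

  /* Slow path: a real pattern, so let glob walk the directory.  */
  memset (&gl, 0, sizeof gl);
  n = (glob (name, GLOB_NOSORT, NULL, &gl) == 0) ? (int) gl.gl_pathc : 0;
  globfree (&gl);
  return n;
}
```

With this, count_matches ("read.c", 0) never touches the filesystem,
whereas count_matches ("*.c", 0) still goes through glob as before.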
Thoughts much appreciated,
Thanks,
Michael.
diff --git a/read.c b/read.c
index a3ad88e..48de4fe 100644
--- a/read.c
+++ b/read.c
@@ -2824,6 +2824,20 @@ tilde_expand (const char *name)
#endif /* !VMS */
return 0;
}
+
+
+static int
+need_to_glob (const char *name)
+{
+ int i;
+ for (i = 0; name[i] != '\0'; i++) {
+ if (name[i] == '?' || name[i] == '*' || name[i] == '[') {
+ return 1;
+ }
+ }
+ return 0;
+}
+
/* Parse a string into a sequence of filenames represented as a chain of
struct nameseq's and return that chain. Optionally expand the strings via
@@ -3112,6 +3126,14 @@ parse_file_seq (char **stringp, unsigned int size, int stopchar,
}
#endif /* !NO_ARCHIVES */
+ /* glob is expensive - it always stats; try to avoid it if possible */
+ if (!need_to_glob (name)) {
+ nlist = &name;
+ i = 1;
+ if (flags & PARSEFS_EXISTS && !file_exists_p (name))
+ i = 0;
+ }
+ else
switch (glob (name, GLOB_NOSORT|GLOB_ALTDIRFUNC, NULL, &gl))
{
case GLOB_NOSPACE:
@@ -3174,7 +3196,8 @@ parse_file_seq (char **stringp, unsigned int size, int stopchar,
#endif /* !NO_ARCHIVES */
NEWELT (concat (2, prefix, nlist[i]));
- globfree (&gl);
+ if (nlist != &name)
+ globfree (&gl);
#ifndef NO_ARCHIVES
if (arname)
--
michael.meeks@novell.com <><, Pseudo Engineer, itinerant idiot
Context
- [Libreoffice] substantial 'glob' speedup ... · Michael Meeks