Hi David & Fridrich,
Just doing some load-time profiling, and I notice that the libebook
filter chews just under 3% of the load time of a (quite large) XLSX
file ;-)
It seems the filter / sniffing / detection code there is particularly
problematic. I wonder if we need something like this:
git log -u -1 53138c9968e28a25a8cd6d2b5e3d31cbb3257852
to avoid thrashing the XStream read function? We do 52k 'read' calls
on the XStream, which is really not a fast interface to use for small
reads.
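To illustrate the idea of batching small reads, here is a minimal sketch in plain C++. It is not the code from the commit above and does not use the real XStream API: BufferedReader, SlowRead and the 4096-byte chunk size are all hypothetical names and values chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive stream read: fills 'dst' with up to
// 'n' bytes and returns how many were actually read (0 at end of stream).
using SlowRead = std::function<std::size_t(unsigned char* dst, std::size_t n)>;

// Wraps a slow reader and serves small reads from a larger local buffer.
class BufferedReader
{
public:
    explicit BufferedReader(SlowRead read, std::size_t chunk = 4096)
        : m_read(std::move(read)), m_buf(chunk), m_pos(0), m_len(0)
    {
        assert(chunk > 0);
    }

    // Copies up to 'n' bytes into 'dst'; returns the number of bytes copied.
    std::size_t read(unsigned char* dst, std::size_t n)
    {
        std::size_t done = 0;
        while (done < n)
        {
            if (m_pos == m_len)
            {
                // Buffer exhausted: one large underlying read replaces
                // what would otherwise be many small ones.
                m_len = m_read(m_buf.data(), m_buf.size());
                m_pos = 0;
                if (m_len == 0)
                    break; // end of stream
            }
            const std::size_t take = std::min(n - done, m_len - m_pos);
            std::memcpy(dst + done, m_buf.data() + m_pos, take);
            m_pos += take;
            done += take;
        }
        return done;
    }

private:
    SlowRead m_read;
    std::vector<unsigned char> m_buf; // local cache of the last chunk
    std::size_t m_pos;                // next unread byte in m_buf
    std::size_t m_len;                // number of valid bytes in m_buf
};
```

With a wrapper like this, a detector doing thousands of 4-byte reads would hit the underlying stream only once per chunk. The right chunk size for sniffing is a guess here and would need measuring.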
http://people.freedesktop.org/~michael/sheet-profile.txt
has the profile; compare EBookImportFilter::detect to
framework::LoadEnv::startLoading.
For thumbnailing we had a similar problem with reading strings, which
was improved but not fixed by:
commit d67cd21033877c9c09d9cc4f14c2c4658e973f57
Author: Mathieu Parent <mathieu.parent@nantesmetropole.fr>
Date:   Mon Oct 14 22:23:05 2013 +0100
    fdo#56007 - Read more bytes on Zip read (for thumbnails)
Particularly on remote file-systems we'd do many remote calls here -
which is really not ideal.
I've pushed a small patch to avoid some of the sillier calls to:
template< class E >
inline void Sequence< E >::realloc( sal_Int32 nSize )
{
    const Type & rType = ::cppu::getTypeFavourUnsigned( this );
    sal_Bool success =
        ::uno_type_sequence_realloc(
            &_pSequence, rType.getTypeLibType(), nSize,
            (uno_AcquireFunc)cpp_acquire, (uno_ReleaseFunc)cpp_release );
    if (!success)
        throw ::std::bad_alloc();
}
Calling that unconditionally, even when the sequence is already the
right length, seems particularly silly ;-) [ I assume that the
WPXSvInputStream, by keeping the sequence around, should save that
allocation & be quite efficient through a blizzard of identically
sized reads anyhow ;-].
It makes me wonder whether the above should have a fast-path for
pointless reallocs to the same size, though.
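As a rough sketch of what such a fast-path could look like, here is a standalone example. ByteBuffer is a hypothetical single-owner buffer, not the UNO Sequence class. A real Sequence is reference-counted, so a same-size shortcut there would presumably also have to check that the sequence is not shared; this sketch ignores sharing entirely.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical single-owner byte buffer, used only to illustrate the
// early return; it is NOT the UNO Sequence implementation.
class ByteBuffer
{
public:
    ByteBuffer() : m_data(nullptr), m_size(0), m_reallocs(0) {}
    ~ByteBuffer() { std::free(m_data); }
    ByteBuffer(const ByteBuffer&) = delete;
    ByteBuffer& operator=(const ByteBuffer&) = delete;

    void realloc(std::size_t nSize)
    {
        // Fast path: already the requested size, so skip the allocator.
        if (nSize == m_size)
            return;

        ++m_reallocs; // counts only the calls that do real work
        if (nSize == 0)
        {
            std::free(m_data);
            m_data = nullptr;
            m_size = 0;
            return;
        }
        void* p = std::realloc(m_data, nSize);
        if (!p)
            throw std::bad_alloc(); // old block is still owned and freed later
        m_data = static_cast<unsigned char*>(p);
        m_size = nSize;
    }

    std::size_t size() const { return m_size; }
    std::size_t reallocCount() const { return m_reallocs; }

private:
    unsigned char* m_data;
    std::size_t m_size;
    std::size_t m_reallocs;
};
```

A run of identically sized requests then costs one comparison each rather than a trip into the allocator.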
Thoughts appreciated; is there some ordering of sniffing such
that we can prioritize common formats over less common ones? And has
libebook perhaps got into that stack too high up?
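To make the ordering question concrete, here is a hedged sketch of priority-ordered sniffing. It does not reflect how the LibreOffice type detection actually orders its filters: Detector, detectFormat and the "lower value is tried first" convention are all assumptions made up for the example.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical detector entry; the fields are assumptions for this sketch.
struct Detector
{
    std::string name; // illustrative filter name
    int priority;     // lower value = tried earlier (assumed convention)
    std::function<bool(const std::string& header)> matches;
};

// Tries detectors in priority order and returns the name of the first one
// that matches, or an empty string if none does.
inline std::string detectFormat(std::vector<Detector> detectors,
                                const std::string& header)
{
    // stable_sort keeps registration order among equal priorities.
    std::stable_sort(detectors.begin(), detectors.end(),
                     [](const Detector& a, const Detector& b) {
                         return a.priority < b.priority;
                     });
    for (const Detector& d : detectors)
    {
        if (d.matches(header))
            return d.name; // stop early: later detectors never run
    }
    return std::string();
}
```

If common formats carry the low priority values, an expensive detector for a rare format only runs once the cheap, common ones have declined the file.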
ATB,
Michael.
--
michael.meeks@collabora.com <><, Pseudo Engineer, itinerant idiot