libebook filter sniffing cost ...

Michael Meeks <michael.meeks -AT- collabora.com>

Thu, 12 Dec 2013 10:02:57 +0000

Hi David & Fridrich,

	Just doing some load time profiling, and I notice that the libebook filter chews just under 3% of the load-time of (quite a large) XLSX file ;-) It seems the filter / sniffing / detection code there is particularly problematic. I wonder if we need something like this:

	git log -u -1 53138c9968e28a25a8cd6d2b5e3d31cbb3257852

	To avoid thrashing the XStream read function ? We do 52k 'read' calls on the XStream, which is really not a fast interface to use for small reads.

	http://people.freedesktop.org/~michael/sheet-profile.txt

	Has the profile there; compare EBookImportFilter::detect to framework::LoadEnv::startLoading.

	For thumbnailing we had a similar problem with reading strings, improved but not fixed by:

	commit d67cd21033877c9c09d9cc4f14c2c4658e973f57
	Author: Mathieu Parent <mathieu.parent@nantesmetropole.fr>
	Date:   Mon Oct 14 22:23:05 2013 +0100

	    fdo#56007 - Read more bytes on Zip read (for thumbnails)

	Particularly on remote file-systems we'd do many remote calls here - which is really not ideal.

	I've pushed a small patch to avoid some of the more silly realloc calls to:

	template< class E >
	inline void Sequence< E >::realloc( sal_Int32 nSize )
	{
	    const Type & rType = ::cppu::getTypeFavourUnsigned( this );
	    sal_Bool success =
	        ::uno_type_sequence_realloc(
	            &_pSequence, rType.getTypeLibType(), nSize,
	            (uno_AcquireFunc)cpp_acquire, (uno_ReleaseFunc)cpp_release );
	    if (!success)
	        throw ::std::bad_alloc();
	}

	Calling that un-conditionally, even when the sequence is already the requested length, seems particularly silly ;-) [ I assume that the WPXSvInputStream, by keeping the sequence around, should save that allocation & be quite efficient through a blizzard of identical sized reads anyhow ;-]. It makes me wonder whether the above should have a fast-path for pointless reallocs to the same size though.

	Thoughts appreciated though; is there some ordering of sniffing such that we can prioritize common formats over less common ones ? and has perhaps libebook got into that stack too high up ?

	ATB,

		Michael.

-- 
 michael.meeks@collabora.com  <><, Pseudo Engineer, itinerant idiot
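
To make the read-thrashing point concrete, here is a minimal sketch of the read-ahead idea: serve small reads from a local buffer and only touch the slow stream when that buffer runs dry. It is not the code from either commit mentioned above. `SlowSource` and `BufferedReader` are hypothetical names, and `SlowSource` is only a stand-in that counts calls, not the real XStream interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a slow stream (not the real UNO interface):
// each read() call is counted so the cost of many small reads is visible.
class SlowSource {
public:
    explicit SlowSource(std::string data) : data_(std::move(data)) {}

    // Copies up to n bytes into out and returns how many were copied.
    std::size_t read(char* out, std::size_t n) {
        ++calls_;
        const std::size_t got = std::min(n, data_.size() - pos_);
        std::memcpy(out, data_.data() + pos_, got);
        pos_ += got;
        return got;
    }

    std::size_t calls() const { return calls_; }

private:
    std::string data_;
    std::size_t pos_ = 0;
    std::size_t calls_ = 0;
};

// Read-ahead wrapper (hypothetical): small reads are served from buf_,
// and the underlying source is only hit when buf_ is exhausted.
class BufferedReader {
public:
    BufferedReader(SlowSource& src, std::size_t chunk)
        : src_(src), buf_(chunk) {
        assert(chunk > 0);
    }

    // Copies up to n bytes into out; returns fewer only at end of stream.
    std::size_t read(char* out, std::size_t n) {
        std::size_t done = 0;
        while (done < n) {
            if (begin_ == end_) {  // buffer empty: one large refill
                end_ = src_.read(buf_.data(), buf_.size());
                begin_ = 0;
                if (end_ == 0)
                    break;         // underlying stream is exhausted
            }
            const std::size_t take = std::min(n - done, end_ - begin_);
            std::memcpy(out + done, buf_.data() + begin_, take);
            begin_ += take;
            done += take;
        }
        return done;
    }

private:
    SlowSource& src_;
    std::vector<char> buf_;
    std::size_t begin_ = 0;
    std::size_t end_ = 0;
};
```

With a 4096-byte chunk, reading a 10000-byte source one byte at a time goes through `BufferedReader::read` ten thousand times but reaches the underlying source only a handful of times. The right chunk size for a detection pass is a separate question; reading far more than the sniffer needs would waste time on remote file-systems too.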
