Hi Kohei, hi all,
I would like to report huge memory consumption in Calc.
The issue occurs when computing subtotals on a spreadsheet containing a
very large number of cells.
My document is a spreadsheet with about 100 columns and 100,000 rows. I
am trying to compute a single subtotal on column AA (numeric, using the
SUM function), grouping on a single text column (let's say H).
After loading the document, LibreOffice uses about 1.7 GB of memory.
When running the subtotal function, LibreOffice allocates more than 20 GB
of extra RAM, and processing is extremely slow.
After analyzing the source code, it seems that the problem is located
in the MDDS template library. The FormulaCells are stored in
multi-type vectors which are resized during processing (their size
decreases as new cells are instantiated and some internal
vector blocks are split into several objects). The call comes from
ScFormulaCell* ScColumn::SetFormulaCell in column3.cxx.
Unfortunately, it relies on std::vector from the STL, which does not free
its memory when resized to a smaller size (presumably an
optimization so that it can grow again quickly without requesting memory
from the OS).
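
A minimal sketch of this std::vector behaviour, outside of MDDS. The helper names capacity_after_downsize and capacity_after_shrink are hypothetical and only serve the illustration. Note that shrink_to_fit is a non-binding request, so the second result depends on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build a vector of `rows` doubles, resize it to one
// element, and report how much capacity is still allocated afterwards.
std::size_t capacity_after_downsize(std::size_t rows)
{
    assert(rows >= 1);
    std::vector<double> v(rows);
    v.resize(1);          // size is now 1, the buffer is not reallocated
    return v.capacity();  // still at least `rows` elements
}

// Same scenario, but asking the vector to give the unused memory back.
std::size_t capacity_after_shrink(std::size_t rows)
{
    assert(rows >= 1);
    std::vector<double> v(rows);
    v.resize(1);
    v.shrink_to_fit();    // non-binding request to reduce capacity to size
    return v.capacity();
}
```

With 100,000 rows, the first helper still reports a capacity of at least 100,000 elements, while the second typically reports a capacity close to 1 on mainstream standard libraries.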
This could be fine, but the problem is that these objects are
initialized to a size equal to the number of rows (a bit more than
100,000 in my example) and then resized to 1. Since the memory is not
freed, each one holds about 800,000 bytes (100,000 elements *
sizeof(double), i.e. 8 bytes per element).
For this kind of algorithm this is really inefficient, since each vector
resize leaves something like 800 KB of extra memory allocated, which is not
released until the document is closed. Multiply this by the number of times
the processing loop iterates, and it reaches gigabytes of RAM pretty fast :)
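
The arithmetic above can be sketched as follows. The function names are hypothetical, and the figure of 25,000 stranded blocks is only a back-of-the-envelope estimate derived from the 20 GB observation, not a measured value. The sketch assumes sizeof(double) is 8, as on mainstream platforms.

```cpp
#include <cassert>
#include <cstdint>

// Bytes kept allocated when `stranded_blocks` vectors, each originally sized
// for `rows` doubles, are resized down without releasing their buffers.
std::uint64_t retained_bytes(std::uint64_t rows, std::uint64_t stranded_blocks)
{
    return rows * sizeof(double) * stranded_blocks;
}

// Number of such blocks needed to keep `target_bytes` allocated.
std::uint64_t blocks_to_retain(std::uint64_t target_bytes, std::uint64_t rows)
{
    assert(rows > 0);
    return target_bytes / (rows * sizeof(double));
}
```

For 100,000 rows, retained_bytes(100000, 1) gives 800,000 bytes per block, and blocks_to_retain(20000000000, 100000) gives 25,000, so a few tens of thousands of block splits would be enough to account for the extra 20 GB.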
Even if it may look like a memory leak, it is not really one, since the
memory is released once the document is closed. The problem exists
in recent versions of LO, including master.
I attach to this bug entry a proposed patch which solves this
problem. A call to shrink_to_fit has been added to the resize_block
method. To limit the number of calls to this method, and to avoid wasting
too much time releasing memory, I only call it when the vector's size is
less than half of its capacity (actual number of elements vs. number of
elements allocated).
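
The heuristic can be sketched on a plain std::vector, independently of MDDS. The function name resize_and_trim is hypothetical. The attached patch applies the same test to the block's m_array member.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Resize `v`, then release unused memory only when more than half of the
// allocated capacity would otherwise sit idle.
template <typename T>
void resize_and_trim(std::vector<T>& v, std::size_t new_size)
{
    v.resize(new_size);
    assert(v.size() == new_size);
    if (v.capacity() > 2 * new_size)
    {
        // Non-binding request: the implementation may reallocate to a
        // buffer that fits the current size.
        v.shrink_to_fit();
    }
}
```

The factor of 2 is a trade-off: small downsizes keep their buffer and avoid a reallocation, while large ones (such as 100,000 elements down to 1) return the memory.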
Cheers
W.
--
William BONNET
Directeur Technique / CTO LINAGORA
Linagora 80 rue Roque de Fillol / Puteaux 92800 F
Tél. +33 (0)810 251 251
GSM +33 (0)689 376 977
Twitter @wbonnet
http://www.linagora.com/ | http://www.08000linux.com/
--- multi_type_vector_types.hpp.orig 2014-05-24 13:53:09.482797120 +0200
+++ multi_type_vector_types.hpp 2014-05-24 13:55:39.049154637 +0200
@@ -249,6 +249,12 @@
     static void resize_block(base_element_block& blk, size_t new_size)
     {
         static_cast<_Self&>(blk).m_array.resize(new_size);
+        // Test whether the vector's allocated capacity (and thus memory) exceeds
+        // twice its new size. If so, shrink its memory footprint, because
+        // std::vector does not free its memory when it is downsized.
+        if (static_cast<_Self&>(blk).m_array.capacity() > (2 * new_size)) {
+            static_cast<_Self&>(blk).m_array.shrink_to_fit();
+        }
     }
#ifdef MDDS_UNIT_TEST