Hi Kohei, hi all,

I would like to report very high memory consumption in Calc.

It occurs when computing subtotals on a spreadsheet that contains a very large number of cells.

My document is a spreadsheet with about 100 columns and 100,000 rows. I am trying to compute a single subtotal on column AA (numeric, using the sum function), grouped on a single text column (let's say H).

After loading the document, LibreOffice uses about 1.7 GB of memory. When I run the subtotal function, LibreOffice allocates more than 20 GB of additional RAM, and processing is extremely slow.

After analyzing the source code, the problem appears to lie in the MDDS template library. The formula cells are stored in multi-type vectors whose internal blocks are resized during processing: as new cells are instantiated, existing blocks are split into several objects and each resulting block shrinks. (The call comes from ScFormulaCell* ScColumn::SetFormulaCell in column3.cxx.)

Unfortunately, each block relies on a std::vector, which does not free its memory when resized to a smaller size (this looks like an optimization that lets it grow again quickly without allocating new memory).
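The behaviour described above can be observed in isolation. The following is only a minimal sketch, not code from MDDS or LibreOffice; the helper name capacity_after_shrinking_resize is made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MDDS): returns the capacity, in elements,
// that a std::vector<double> still holds after being created with `initial`
// elements and then resized down to `final_size`.
std::size_t capacity_after_shrinking_resize(std::size_t initial, std::size_t final_size)
{
    std::vector<double> v(initial);
    v.resize(final_size); // size() shrinks, but the allocation is left untouched
    return v.capacity();
}
```

Calling capacity_after_shrinking_resize(100000, 1) should report a capacity of at least 100,000 elements, even though the vector then contains a single element.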

That would be fine in itself, but these vectors are initialized with as many elements as there are rows (a bit more than 100,000 in my example) and then resized to 1. Since the memory is not freed, each one keeps holding about 800,000 bytes (100,000 elements * sizeof(double), i.e. 8 bytes each).

This is really inefficient for this kind of algorithm, since each vector resize leaves something like 800 KB of extra memory allocated, and that memory is not released until the document is closed. Multiply this by the number of times the processing loop iterates, and it reaches gigabytes of RAM pretty fast :)
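As a back-of-the-envelope check of these figures, here is a small sketch. The function names are invented for illustration, and it assumes that "20 gigs" means 20 * 10^9 bytes and that every retained allocation has the same size.

```cpp
#include <cstdint>

// Bytes kept allocated by one vector that once held `rows` elements of
// `element_size` bytes each, even after it has been resized down.
constexpr std::uint64_t retained_bytes(std::uint64_t rows, std::uint64_t element_size)
{
    return rows * element_size;
}

// Number of such retained allocations that add up to `total_bytes`
// (integer division, so the result is rounded down).
constexpr std::uint64_t allocations_for(std::uint64_t total_bytes, std::uint64_t bytes_each)
{
    return total_bytes / bytes_each;
}

// 100,000 rows of 8-byte elements: about 800 KB retained per vector.
static_assert(retained_bytes(100000, 8) == 800000, "about 800 KB per vector");
```

Under those assumptions, allocations_for(20000000000, 800000) gives 25,000, so a few tens of thousands of shrunken vectors would be enough to account for the extra memory observed.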

Although it may look like a memory leak, it is not really one, since the memory is released once the document is closed. The problem exists in recent versions of LO, including master.

I am attaching a proposed patch that solves this problem. It adds a call to shrink_to_fit in the resize_block method. To limit the number of calls to this method, and to avoid wasting too much time releasing memory, I only call it when the vector's capacity is more than twice its new size (capacity being the number of elements allocated, size the number of elements actually in use).
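The heuristic can be sketched on a plain std::vector, outside of MDDS. The function name resize_and_trim is hypothetical, and note that shrink_to_fit (C++11) is a non-binding request, so an implementation is allowed to leave the capacity unchanged.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-alone version of the idea in the patch: resize `v`, then
// ask for the unused capacity to be released only when the capacity is more
// than twice the new size.
template <typename T>
void resize_and_trim(std::vector<T>& v, std::size_t new_size)
{
    v.resize(new_size);
    if (v.capacity() > 2 * new_size)
        v.shrink_to_fit(); // non-binding request; usually reallocates to fit
}
```

With this threshold, a vector that goes from 100,000 elements down to 1 is trimmed, while one that goes from 100 down to 60 keeps its allocation and can grow back without reallocating.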

Cheers
W.

--
William BONNET
Technical Director / CTO, LINAGORA
Linagora 80 rue Roque de Fillol / Puteaux 92800 F
Tel. +33 (0)810 251 251
GSM +33 (0)689 376 977
Twitter @wbonnet

http://www.linagora.com/ | http://www.08000linux.com/

Discover OBM, the free (libre) messaging system: http://www.obm.org/

The present transmission contains privileged and confidential information belonging to Linagora, 
exclusively intended for the recipient(s) thereabove identified. If you are not one of these 
aforementioned recipients, any reproduction, distribution, disclosure of said information in whole 
or in part, as well as any action undertaken on the basis of said information are strictly 
prohibited. If you received the present transmission by mistake, please inform us and destroy it 
from your messaging and information systems.

--- multi_type_vector_types.hpp.orig    2014-05-24 13:53:09.482797120 +0200
+++ multi_type_vector_types.hpp 2014-05-24 13:55:39.049154637 +0200
@@ -249,6 +249,12 @@
     static void resize_block(base_element_block& blk, size_t new_size)
     {
         static_cast<_Self&>(blk).m_array.resize(new_size);
+        // Check whether the vector's allocated capacity (and thus memory) is
+        // more than twice its new size. If so, shrink its memory footprint,
+        // because std::vector does not free its memory when it is downsized.
+        if (static_cast<_Self&>(blk).m_array.capacity() > (2 * new_size)) {
+            static_cast<_Self&>(blk).m_array.shrink_to_fit();
+        }
     }
 
 #ifdef MDDS_UNIT_TEST
