Re: [libreoffice-l10n] Pootle migration

Dwayne Bailey <dwayne -AT- translate.org.za>
Wed, 06 Apr 2011 09:58:27 +0200

Sent to the list on behalf of Friedel:

Hi Christian, everybody

Please CC me on any replies as I'm not on the list.

I'm one of the developers in the Translate project, and have been trying
to help Rimas a bit with this deployment of Pootle. Thanks for your work
on the server for Pootle!  Please allow me a few comments:

Background:
Pootle simply isn't coded to run in small 16MB processes. It is a full
featured web application written on a heavy framework (Django) in a
programming language that isn't very frugal with memory use (Python). We
didn't specifically optimise for memory use over other things when we
programmed Pootle, and LibreOffice is running a system with over 5000
files for the libo34x_ui project alone. When looking at help, the files
are very big by any standard in the world of FOSS. Of course, the help
has many of these big files per language. This is all fine. Pootle runs
fine with loads like this, as (obviously) visible from the OOo project.

The rest of my comments are inline...

> -------- Original Message --------
> Subject: Re: [libreoffice-l10n] Pootle migration
> Date: Mon, 4 Apr 2011 22:28:43 +0200
> From: Christian Lohmaier
>
> Hi *,
>
> On Mon, Apr 4, 2011 at 2:58 PM, Rimas Kudelis <rq@akl.lt> wrote:
> >
> > Stuff to consider: Pootle may run slower on this machine, and we may
> > experience other problems, at least for now. I wasn't able to convince the
> > admin that Pootle needs more resources, so we have what we have.
>
> Again, as you apparently still don't understand what I already wrote
> many times:
> * Adding more resources will /NOT/ make pootle run faster than it
> does now.
> The VM already has way more resources assigned than necessary. It is
> /idle/ almost all the time.

As far as I know the server hasn't really been used yet, so I guess
we'll be collecting data from now on to see how things go. During the
setup of the server, we make tradeoffs between performance and memory
use. If there is no memory available, we'll obviously try to optimise at
all cost for minimising memory use, and that is what I understand that
Rimas said: things might be slower than necessary, since we are not
optimising for performance, but for memory use.



> * The only thing that is slow (when executed the first time) is
> generation of the zips. So when you as translator request a zip: Don't
> click the link multiple times because you don't immediately get the
> zip. It can take 10 seconds for the files to be generated. Again:
> * Adding more resources will /not/ make that time shorter. It is a
> single-threaded process that can only use one single CPU, thus
> assigning more CPUs won't help at all (the VM has 4CPUs assigned
> already)
> Requesting that same zip another time (or different zips of the
> project belonging to the same language) is fast/instant, but requesting
> the zip for another language again may take some seconds for the first
> request (or again after the files did change in between).
> * Pootle has a memory leak when creating the zips. It won't release
> memory after processing the files.
> This would be the only time where the assigned resources may run out
> (the VM has 1GB of RAM assigned): Multiple different languages request
> the zip at the same time. Then memory usage increases, memory runs out
> and either it is crawling along or the process gets killed.

Some stuff that is slow to load is cached for later use. This is done
for performance optimisation. This is one of the reasons you won't see
the memory use go down immediately after generating a ZIP file. Another
reason is the way the garbage collector works in Python.

Deciding to cache something is a tradeoff. So we can disable or minimize
some of the caching, which will simply make a few things slower,
hopefully not by much, but we're guessing while the server hasn't been
used much yet.

I suggested some customisations to the parse pool (to do exactly this).
That affects the number of cached files and search indexes, both of
which are very large on your server.
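
A minimal sketch of the tradeoff described above, not Pootle's actual
parse pool: the class name, the `max_size` parameter and the toy loader
are assumptions made up for illustration. It shows how shrinking a
bounded cache saves memory but can make the expensive load step run
more often.

```python
from collections import OrderedDict


class BoundedPool:
    """Keep at most max_size loaded items; evict the least recently used."""

    def __init__(self, loader, max_size):
        self.loader = loader      # expensive function, e.g. parsing a file
        self.max_size = max_size  # hypothetical tuning knob
        self.items = OrderedDict()
        self.loads = 0            # how often the expensive loader ran

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as recently used
            return self.items[key]
        self.loads += 1
        value = self.loader(key)
        self.items[key] = value
        if len(self.items) > self.max_size:
            self.items.popitem(last=False)  # drop the oldest entry
        return value


def count_loads(max_size, requests):
    """Return how many expensive loads a request sequence triggers."""
    pool = BoundedPool(loader=lambda name: name.upper(), max_size=max_size)
    for name in requests:
        pool.get(name)
    return pool.loads
```

With three files requested in rotation, a pool of size 3 loads each
file once, while a pool of size 2 has to reload on every request. How
much that costs in practice depends on the real access pattern, which
is what we still need data for.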



> * I will NOT assign RAM to a VM (and thus block that ram for other
> use) to satisfy a memory leak, when that RAM is unused 99% of the
> time.

I believe what you are seeing is the caching, not a memory leak.

We haven't seen the server used much yet. My educated guess from having
worked on a few Pootle installations is that the RAM isn't enough, but
let's keep an eye on things and see how it goes. I assumed we'll want a
nice fast server supporting several concurrent users during the build-up
to a release, but we can still tune things down a bit more, I guess.


> * The effects of the memory leak can be nullified by just restarting
> the worker processes more frequently. Thus again:

...at the cost of making things slower, since more stuff needs to be
loaded in memory afresh every time you restart a process.
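
As a rough illustration of that cost (a toy model, not a measurement
of Pootle; the function name and the request sequence are made up),
the following counts how often data must be loaded afresh when a
worker process is recycled after a fixed number of requests:

```python
def loads_needed(requests, max_requests_per_worker):
    """Count expensive loads when a worker is recycled every N requests."""
    cache = set()   # stands in for whatever the worker keeps in memory
    served = 0
    loads = 0
    for key in requests:
        if served == max_requests_per_worker:
            cache = set()  # worker restarted: in-memory cache is gone
            served = 0
        if key not in cache:
            loads += 1     # cold cache, pay the loading cost again
            cache.add(key)
        served += 1
    return loads
```

For twelve requests alternating between two languages, a worker that
lives for all twelve requests loads each language once, while a worker
recycled every two requests loads something on every request.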


> * Adding more resources will /NOT/ make the VM run faster

It most probably will, since we are sacrificing performance to minimise
memory use. For example: we opted for more threads rather than
processes, an approach that is known not to perform as well in Python,
especially for CPU-intensive tasks.
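
A small sketch of the threads-versus-processes point, assuming
standard CPython with its global interpreter lock and a Python version
that ships `concurrent.futures`. The function names and job sizes are
illustrative only and have nothing to do with Pootle's code.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def busy_sum(n):
    """Pure-Python CPU-bound work; holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(executor_cls, jobs, workers=4):
    """Run busy_sum over jobs; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(busy_sum, jobs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [2_000_000] * 4
    _, thread_secs = timed_run(ThreadPoolExecutor, jobs)
    _, process_secs = timed_run(ProcessPoolExecutor, jobs)
    print(f"threads:   {thread_secs:.2f}s")
    print(f"processes: {process_secs:.2f}s")
```

On a multi-core machine the process pool would normally be expected to
finish this kind of work sooner, at the price of each process holding
its own copy of the data in memory. That memory cost is exactly why we
went the other way here.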


>  it will
> /NOT/ allow it to handle more requests

It most probably will, since we reduced the number of processes to
minimise memory use, and slower serving of requests necessarily affects
the number of requests you can serve in any given time.


> Pootle is idling almost all of the time. There is less than one apache
> request per second on average (and for regular requests (i.e.
> non-"generate-a-zip" actions) it can easily serve >>50 simultaneous
> requests per second.)

Let's keep an eye on things when people actually start to use the
server.


> > Maybe if we
> > manage to give enough load to the server, he'll change his mind (or we'll
> > find other ways to deal with the problem).
>
> No, I won't change my mind, but depending on the load/effects of the
> memory leak I'll reduce lifetime of the server-processes further.

I hope you will be reasonable and look at the data as it becomes
available, and at least consider changing your mind. As mentioned, I'm
pretty sure there is no memory leak. If you reduce the lifetime of the
server processes, you are just making performance worse, which is all
that Rimas warned the users about.


> Again:
> The only time where pootle is "slow" is:
> * Creation of zips for the first time / after files have changed.
> This is CPU intensive process, and the CPU cannot be made faster by
> assigning more resources. Live with it. Redesign pootle to use another
> method (i.e. a multithreaded one that can use multiple CPUs at once)
> or whatever, but this one problem is not solvable by assigning more
> resources.

I agree. Generating the ZIP files is slow. Doing it multithreaded would
reduce performance for more users while a ZIP is being generated.


> * This also is only noteworthy when the files are big/numerous.

Which is the norm on this server, unfortunately.


> * Requesting the other zips in the same project & language is fast.
>
> So there is no point in clicking all zip-URLs on a page at once (on
> the contrary, then the requests will all cause CPU to be burnt, while
> all that is only needed one single time). If you want to download more
> than one zip of a page, click the first one, wait until it is
> generated and handed over to your browser, then click as many others
> on the same page and get all of them quickly.

Yes, we have optimised for several cases here that are likely, and I
suggested some workarounds for some of the issues we're likely to hit
with the little bit of RAM. As far as I know, Rimas has already started
applying them.


> * Any other time, doing in-place-translation, just browsing along is
> supposed to be fast.

... as long as we're not hitting the currently imposed limits of
concurrency.


Let's see how well we can make this work.

Keep well
Friedel



--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/better-lies-about-gnome-localisation

--
Unsubscribe instructions: E-mail to l10n+help@libreoffice.org
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/www/l10n/
All messages sent to this list will be publicly archived and cannot be deleted
