
On 11/08/2012 03:43 AM, Italo Vignoli wrote:
On 08/11/2012 07:08, Alex Thurgood wrote:

In other words, at least for download/user stats, the answer is "no",
and for the other points Rob mentions, obtaining raw data of any
significance is for the git expert.

Downloads are extracted from the mirrors, and there is a script for that. Git is for development-related figures.

Thanks, I'll check it out, but basically what you are saying, if I
understand correctly, is that the data in question is provided in a
format which is not necessarily comparable to that which Rob has used
for AOO, and thus a certain amount of internal interpretations,
assumptions, etc are made by the LO project to arrive at its view of the
data. Are these methods/assumptions, as used by the LO project,
publicly documented on the LO wiki ?

Our data are in a simple format (sum of units), while Mr Rob Weir is using complicated interpretations to hide the truth, which is that the developers and the community are with LO and not with Apache OO.

There are no interpretations or assumptions in our data: the number of developers is a sum of individual developers, the number of commits is a sum of single commits, and so on.

The number of community members has never been calculated using wiki subscribers (in this case we estimate around 1,000 contributors), and Mr Rob Weir picked that number only because it could be disputed.

The number of community members is estimated using global + local mailing lists (many people are subscribed only to mailing lists in their native language) + wiki contributors + developer numbers, etc.

So, since the method we use is a simple sum of data (and this should be easy to understand by looking at the charts published on a monthly basis), I do not think that we have to document such a methodology.

The number of users is estimated (and the term "estimated" has always been associated with it). Of course, any estimate might be right or wrong, depending on the point of view.

Apache OO has a higher number of downloads, of course, but I wonder, for instance, whether users who were previously used to getting the software in their native language are as happy as in the past when they discover, after having downloaded it, that the software is not available in their language.

Using this metric, for instance, it would be possible to reduce Apache OO download numbers by at least one third (but maybe even more), because you could easily cut downloads in countries where the software is not available in the native language (version 3.4 was not even available in British English).

But, as we are not Mr Rob Weir (and having him as an opponent is a blessing, please ask Microsoft), we are not going to embark on such a useless calculation.

Apache OO is available in 20 languages, and they are currently adding Danish and Norwegian (but many major languages are missing).

LibreOffice is available in over 100 languages (over 95% of the world population), and the community is now working on Filipino/Tagalog and other minor languages.

The number of languages available is a simple measure of community numbers (although estimated, because many people involved in localization do not show up in mailing lists), but of course Mr Rob Weir is not looking for simple measures, because they can be understood by everyone, and by using obscure measurements he tries to obfuscate reality.

Best regards, Italo


If I understand the problem, we can reliably count certain items (code commits, number of downloads, etc.), while other statistics are estimates based on the data. The statistics, while publicly interesting for many reasons, are more useful internally, and what is probably more important than the raw numbers is their growth or decrease over time, which helps determine the health of LO. If downloads, subscribers, developers, and commits are increasing with time, that indicates LO is healthy. Decreases in specific areas would indicate where the community or TDF needs to improve.

IMHO, Rob Weir is making a common mistake with statistics and marketing data: the data is often imprecise, but when looked at over time it can tell a very interesting story. For example, each download does not represent a new user, but monthly increases would indicate that the user base is growing. Also, every marketing expert I have ever known says it is easier and cheaper to keep an established customer/user than to get a new one, so growing monthly downloads would indicate to me that our marketing efforts are having some positive effect.
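
To make the "look at the trend, not the raw number" idea concrete, here is a minimal Python sketch. It is not the script the project uses for mirror statistics; the function name `monthly_growth` and the sample figures are invented purely for illustration.

```python
def monthly_growth(counts):
    """Return the month-over-month percentage change for a list of monthly totals.

    A month that follows a zero total yields None, since the percentage
    change is undefined in that case.
    """
    changes = []
    for prev, curr in zip(counts, counts[1:]):
        if prev == 0:
            changes.append(None)
        else:
            changes.append((curr - prev) / prev * 100.0)
    return changes


# Hypothetical monthly download totals (made-up numbers, not real LO data).
downloads = [1000, 1100, 1210, 1150]

for month, change in enumerate(monthly_growth(downloads), start=2):
    label = "n/a" if change is None else f"{change:+.1f}%"
    print(f"month {month}: {label}")
```

Positive values month after month would suggest a growing user base, even though no single total says how many users there really are.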

Jay Lozier
