

Hey Jaskaran,

On Thu, Mar 9, 2017 at 10:10 AM, Jaskaran Singh <jvsg1303@gmail.com> wrote:

Hi,

Currently we collect user stats when someone downloads LO from our
website. These may not be very useful, since this method yields only
very limited information. Also, not everyone is counted, because not
everyone downloads LO. Some get it preinstalled with their OS, while
others get a copy from friends.

I believe it's important for us to know about our users as deeply as
possible so as to make informed choices. The information which we should
be looking for is:

1. Operating System, word size and kernel version
2. RAM and Cache amount
3. CPU and GPU specs
4. OpenCL driver
5. Display specs
6. Country
7. Default Language
8. <anything_else?>

Now, obviously this is sensitive information and most users would
not agree to share it. So we could introduce a way to share this data
anonymously. We could let the client send it through a proxy, or let
it be sent over Tor (The Onion Router). But again, most users
wouldn't want that.

So I've found another way of doing this. Have a look at RAPPOR[1]. It
introduces some random noise so that we are never sure of the data that
a client sends us. The statistics that we would get would be in terms of
probability. For example, if a system has an i3 processor, it will roll a
die to determine whether it should tell the truth or not. By
default we could have an 80% (?) chance of telling the truth. So if we get
a report that a user is running an i3 processor, there is an 80% chance
that the report is truthful and a 20% chance that it is wrong. Aggregate
that over a large number of users and we would get a rough trend.
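
A minimal sketch of the coin-flip idea described above, in Python. It
uses plain binary randomized response, which is simpler than the full
RAPPOR scheme in [1]. The function names, the 30% true share, and the
seed are assumptions made for illustration only. It suggests how single
reports can be noisy while the aggregate share is still recoverable by
inverting the known noise rate:

```python
import random


def randomized_report(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, otherwise the opposite bit."""
    return truth if rng.random() < p_truth else not truth


def estimate_true_fraction(reports: list[bool], p_truth: float) -> float:
    """Recover the true share from noisy reports.

    The expected observed share q satisfies
        q = p_truth * pi + (1 - p_truth) * (1 - pi)
    where pi is the true share, so pi = (q - (1 - p_truth)) / (2 * p_truth - 1).
    """
    if not reports:
        raise ValueError("need at least one report")
    if p_truth == 0.5:
        raise ValueError("p_truth = 0.5 makes reports pure noise")
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


def simulate(n_users: int, true_fraction: float, p_truth: float, seed: int = 0) -> float:
    """Simulate n_users clients (hypothetical data) and return the estimated share."""
    rng = random.Random(seed)
    truths = [rng.random() < true_fraction for _ in range(n_users)]
    reports = [randomized_report(t, p_truth, rng) for t in truths]
    return estimate_true_fraction(reports, p_truth)


if __name__ == "__main__":
    # Hypothetical scenario: 30% of users really have an i3, 80% truth-telling.
    print(simulate(n_users=100_000, true_fraction=0.3, p_truth=0.8))
```

With 80% truth-telling, an observed share q maps back to an estimated
true share of (q - 0.2) / 0.6, and the estimate tightens as the number
of reports grows.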

We could also share this data in the form of numbers and graphs (and
other representations) on our website.

It would work like this: whenever someone installs or upgrades LO
and starts it for the first time, a dialog box appears asking for
permission to share some data, while also explaining how this would not
compromise their privacy.

I'd like to know your views on this. And I'd like to implement this if
none of you want to. I may apply for this as a project in GSoC. So
please inform me if you can be a mentor for this project.



So basically this requires an opt-in scheme instead of the opt-out that you
have in mind. Users are very sensitive when it comes to collecting
information that is perceived as personal. Based on that, I think the value
might not be as big as you hoped. Currently the plan is to collect info
about the number of active users as part of the automatic update, but not
much more. Similar to Tor, I'm not so sure I see the value in having a
huge collection of statistics that we are not planning to use. Besides the
obvious problem of privacy, the bigger your data set, the more work you
need to invest in processing the data.

Given that, it would help if you could provide some cases where having
such detailed statistics would help us improve LibreOffice.

Regards,
Markus
