Hi moggi,

Originally, my vision was for the collected data to help our
marketing team. They would learn which categories of users they
should target, and they could apply all kinds of data mining to
extract what's useful for them.

The l10n team could also use this data to see where LO is gaining
popularity and where they should focus their efforts. The dev team
could make use of it too, but I can't think of a specific use case at
the moment.

And yes, users could be very conscious of the data they share with us.
Maybe a polite dialog box explaining how their privacy isn't
compromised would work. Still, quite a few of them would opt out.

On Tuesday 14 March 2017 08:21 PM, Markus Mohrhard wrote:
Hey Jaskaran,

On Thu, Mar 9, 2017 at 10:10 AM, Jaskaran Singh <jvsg1303@gmail.com> wrote:

    Hi,

    Currently we collect user stats when someone downloads LO from our
    website. These may not be very useful, since only very limited
    information is obtained this way. Also, not everyone gets to
    participate, because not everyone downloads LO. Some get it
    preinstalled with their OS, while others get a copy from friends.

    I believe it's important for us to know about our users as deeply as
    possible so as to make informed choices. The information which we should
    be looking for is:

    1. Operating System, word size and kernel version
    2. RAM and cache size
    3. CPU and GPU specs
    4. OpenCL driver
    5. Display specs
    6. Country
    7. Default Language
    8. <anything_else?>

    Now, obviously this is sensitive information and most users would
    decline to share it. So we could introduce a way to share this data
    anonymously: the client could send it through a proxy, or over
    Tor (The Onion Router). But again, most users wouldn't want that.

    So I've found another way of doing this. Have a look at RAPPOR[1]. It
    introduces random noise so that we are never sure of the data that
    an individual client sends us. The statistics we get are
    probabilistic. For example, if a system has an i3 processor, the
    client rolls a die to decide whether to report the truth or not. By
    default we could have an 80% (?) chance of reporting the truth. So if
    a client reports an i3 processor, there is an 80% chance the report
    is truthful and a 20% chance it is wrong. Aggregate that over a
    large number of users and we get a rough trend.
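
The coin-flip idea in the paragraph above can be sketched as simple
randomized response. This is not RAPPOR itself, which additionally uses
Bloom filters and two levels of randomization; it only illustrates how
noisy individual reports can still yield usable aggregate shares. The
category list and the function names `randomize` and `estimate_shares`
are hypothetical, the 80% truth probability is the value floated in the
message, and the sketch assumes that a client which does not report the
truth reports a uniformly random category instead.

```python
import random
from collections import Counter

# Hypothetical category list and truth probability, for illustration only.
CATEGORIES = ["i3", "i5", "i7", "other"]
P_TRUTH = 0.8


def randomize(true_value, rng, categories=CATEGORIES, p_truth=P_TRUTH):
    """Client side: report the truth with probability p_truth,
    otherwise report a uniformly random category."""
    if rng.random() < p_truth:
        return true_value
    return rng.choice(categories)


def estimate_shares(reports, categories=CATEGORIES, p_truth=P_TRUTH):
    """Server side: undo the expected effect of the noise.

    With k categories, the expected observed share of category c is
        observed_c = p_truth * true_c + (1 - p_truth) / k
    so an unbiased estimate is
        true_c = (observed_c - (1 - p_truth) / k) / p_truth
    """
    n = len(reports)
    k = len(categories)
    counts = Counter(reports)
    noise_floor = (1 - p_truth) / k
    return {c: (counts[c] / n - noise_floor) / p_truth for c in categories}


if __name__ == "__main__":
    rng = random.Random(42)
    # Simulated population; these shares are made up for the demo.
    truth = ["i3"] * 5000 + ["i5"] * 3000 + ["i7"] * 1500 + ["other"] * 500
    reports = [randomize(v, rng) for v in truth]
    for cpu, share in estimate_shares(reports).items():
        print(f"{cpu}: {share:.3f}")
```

No single report can be trusted, but the estimated shares converge on
the real ones as the number of reports grows. An estimate for a rare
category can come out slightly negative on small samples; that is
expected from the subtraction and is usually clamped to zero.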

    We could also share this data in the form of numbers and graphs (and
    other representations) on our website.

    It would work like this: whenever someone installs or upgrades LO
    and starts it for the first time, a dialog box appears asking for
    permission to share some data, while also explaining how this would
    not compromise their privacy.

    I'd like to know your views on this. And I'd like to implement this if
    none of you want to. I may apply for this as a project in GSoC. So
    please inform me if you can be a mentor for this project.



So basically this requires an opt-in scheme instead of the opt-out that
you have in mind. Users are very sensitive when it comes to collecting
information that is perceived as personal. Based on that, I think the
value might not be as big as you hoped. Currently the plan is to collect
info about the number of active users as part of the automatic update,
but not much more. Similar to Tor, I'm not so sure I see the value in
having a huge collection of statistics that we are not planning to use.
Besides the obvious problem of privacy, the bigger your data set, the
more work you need to invest in processing the data.

Based on that, it would help if you could provide some cases where
having such detailed statistics would help us improve LibreOffice.

Regards,
Markus

Regards,
Jaskaran

