

Thanks Norbert for the detailed reply. Some comments below.

On Mon, Oct 26, 2015 at 5:28 PM, Norbert Thiebaud <nthiebaud@gmail.com>
wrote:

On Mon, Oct 26, 2015 at 2:56 PM, Ashod Nakashian <ashnakash@gmail.com>
wrote:
On Mon, Oct 26, 2015 at 2:21 PM, Norbert Thiebaud <nthiebaud@gmail.com>
wrote:

On Mon, Oct 26, 2015 at 1:00 PM, Ashod Nakashian <ashnakash@gmail.com>
wrote:
On Mon, Oct 26, 2015 at 1:35 PM, Norbert Thiebaud <
nthiebaud@gmail.com>
wrote:

On Mon, Oct 26, 2015 at 12:14 PM, Ashod Nakashian <
ashnakash@gmail.com>
wrote:
OSL provides atomic helpers (osl_atomic_xxx) in the form of a GNU builtin
(where available) or a platform-specific implementation.

Any reason for not using modern std::atomic (besides possible lack of
volunteers)?


As a transitional phase, we can maintain the same interface but with
std::atomic as the implementation.
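
A minimal sketch of what "same interface, std::atomic underneath" could look like. The names `example_count_t`, `example_atomic_increment` and `example_atomic_decrement` are hypothetical and not part of sal/osl. Note the assumption baked in here: the counter type itself becomes a `std::atomic`, whereas the existing osl entry points take a pointer to a plain integer, so a real transition would need more than this.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counter type; it wraps std::atomic instead of a plain integer.
using example_count_t = std::atomic<std::int32_t>;

// Increments the counter and returns the new value
// (the same contract as __sync_add_and_fetch(p, 1)).
inline std::int32_t example_atomic_increment(example_count_t* p) {
    return p->fetch_add(1) + 1;
}

// Decrements the counter and returns the new value.
inline std::int32_t example_atomic_decrement(example_count_t* p) {
    return p->fetch_sub(1) - 1;
}
```

Because `fetch_add` returns the previous value, the wrappers adjust by one so that callers still receive the updated count.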

Thoughts?

The osl atomics are a C interface, used in C sources...

Thanks. Is there an equivalent used in C++? (The osl atomics only work for
sal_Int32 values, which is another potential issue for 64-bit
portability.)

The C++ code uses these too.


Would there be support for using std::atomic in C++ code?

There is a case to be made in terms of performance if nothing else (in some
scenarios they are hotspots, according to my profiler).

I seriously doubt that std:: will improve the performance over
__sync_add_and_fetch((p), 1)
and
__sync_sub_and_fetch((p), 1)

Agreed, it will not. But on non-gcc it will.


and for Windows, the only real gain would be to move the
implementation into include/osl
with
#define osl_atomic_increment(p) InterlockedIncrement(p)
and
#define osl_atomic_decrement(p) InterlockedDecrement(p)

that will give you most if not all of the gain.
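
For illustration, a sketch of how such header-level macros could select the platform primitive at compile time. The `EXAMPLE_` macro names are hypothetical, and the cast on Windows assumes the counter is a 32-bit integer with the same layout as `LONG`; this is not the actual sal/osl header.

```cpp
#include <cstdint>

#if defined(_WIN32)
#include <windows.h>
// InterlockedIncrement/InterlockedDecrement take a LONG volatile* and
// return the new value. The cast assumes a 32-bit counter.
#define EXAMPLE_ATOMIC_INCREMENT(p) \
    InterlockedIncrement(reinterpret_cast<LONG volatile*>(p))
#define EXAMPLE_ATOMIC_DECREMENT(p) \
    InterlockedDecrement(reinterpret_cast<LONG volatile*>(p))
#else
// GCC/Clang builtins; both return the updated value.
#define EXAMPLE_ATOMIC_INCREMENT(p) __sync_add_and_fetch((p), 1)
#define EXAMPLE_ATOMIC_DECREMENT(p) __sync_sub_and_fetch((p), 1)
#endif
```

Either branch expands at the call site, so the compiler can emit the atomic instruction directly instead of calling through an exported function.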


Unfortunately, on Windows, and unlike gcc, the overhead is significant.
Ideally, the code would generate a single `lock xadd` instruction.
Currently the overhead of dispatching via function calls is very large
compared to this single instruction at the heart of the call.

Your suggestion is a good starting point and reduces the need for
std::atomic.


(Note: I did not mess with Windows back then when I did that for gcc,
as I was not in a position to test it properly,
nor did I have the inclination to mess with Windows in general, as
long as I can avoid it.
but really that should be fairly easy to do)




Relying on 64-bit atomics is going to be a problem as long as we
support 32-bit OSes.


I believe most modern hardware supports atomic operations on wide words
(i.e. 64-bit even when running in 32-bit mode).

Yes, but bear in mind that we had to roll back patches that required
SSE2 on Windows...
The hardware baseline is quite old...
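
One way to check the 64-bit question on a given target, sketched with standard C++ only. The helper names are hypothetical. `is_lock_free()` reports whether the library implements the type without falling back to a lock; the result depends on the compiler, flags and CPU baseline, so no particular answer is assumed here.

```cpp
#include <atomic>
#include <cstdint>

// Returns true when 64-bit atomics are lock-free on this build target.
inline bool example_has_lock_free_64bit_atomics() {
    std::atomic<std::int64_t> probe{0};
    return probe.is_lock_free();
}

// Atomically adds delta and returns the new value. This stays correct
// even when the implementation has to use a lock internally.
inline std::int64_t example_add64(std::atomic<std::int64_t>& counter,
                                  std::int64_t delta) {
    return counter.fetch_add(delta) + delta;
}
```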


I understand the concern.


In any case, see below: osl only provides atomic increment/decrement for
sal_Int32 explicitly.
If there is a need for atomics over another type of data, that will
involve something other than sal/osl.

This is less of a concern. Agreed.



Note also that the osl API is a published API... that is why, although
internally on gcc/clang platforms we use
the built-in directly via a macro, the entry point in osl is maintained
for API compatibility purposes.


Understood.




And mostly these atomics are used for ref-counting... and there is really
no reasonable need for a 64-bit ref-count, is there?
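
As an illustration of the ref-counting use case, here is a small sketch with a 32-bit counter. `ExampleRefCounted` is a hypothetical class, not taken from the LibreOffice code base, and the memory orderings are one common choice rather than a statement about what the existing code does.

```cpp
#include <atomic>
#include <cstdint>

class ExampleRefCounted {
public:
    // Taking a reference only needs atomicity, not ordering.
    void acquire() { m_refs.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when this call dropped the last reference,
    // i.e. the previous count was 1.
    bool release() {
        return m_refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    std::int32_t use_count() const {
        return m_refs.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::int32_t> m_refs{1};  // the creator holds one reference
};
```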


True for ref-counting. Not so for compare-exchange obviously (but I don't
know if these are used and how much).
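
For readers unfamiliar with compare-exchange, a sketch of the kind of operation meant here, using std::atomic on a 64-bit value. `example_store_max` is a hypothetical helper; nothing in this thread says such code exists in the code base.

```cpp
#include <atomic>
#include <cstdint>

// Raises target to candidate if candidate is larger.
// Returns true when this call stored the new value.
inline bool example_store_max(std::atomic<std::int64_t>& target,
                              std::int64_t candidate) {
    std::int64_t seen = target.load();
    while (seen < candidate) {
        // On failure, compare_exchange_weak reloads `seen` with the
        // current value, so the loop re-checks against fresh data.
        if (target.compare_exchange_weak(seen, candidate)) {
            return true;
        }
    }
    return false;
}
```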

osl does not implement/expose any compare-and-swap API AFAIK.
And honestly, considering the horror of your locking 'model', having
such a CAS API would be pretty silly.


Not sure which horror you are referring to (surely you meant 'our', for the
collective codebase).
I'm only suggesting to improve the locking API, not to change any specific
thread synchronization code at all.

I'll run some tests with
#define osl_atomic_increment(p) InterlockedIncrement(p)
and
#define osl_atomic_decrement(p) InterlockedDecrement(p)

on Windows and if it improves things in the profiler, I'll submit a patch
for consideration.
I think it's a trivial change that can have some improvement in performance
(however small) without much risk.

Thanks again, appreciate the exchange.
