On 13.12.2019 05:43, Luboš Luňák wrote:
On Friday 13 of December 2019, Kohei Yoshida wrote:
I just finished my benchmark testing on mdds::multi_type_vector, and
summarized my results in this blog post:

http://kohei.us/2019/12/12/benchmark-results-on-mdds-multi_type_vector/

Hopefully my findings and interpretations make sense.  In short, the
numbers look great.  The overhead of block shifting is a concern, but
I'm optimistic that this is going to be a non-issue for the most part.

I'd really like to see benchmarks of Calc with this new mdds, especially to see how many regressions there will be, as I'm concerned whether it really
would be worth it in reality.

Sure, I do share your concern, which is why I spent time designing and implementing this benchmark: to get some answers to that very concern.

 You say that the vast majority of Calc
performance problems are with updating cell values without shifting, but that
makes sense because that's where the current bottleneck is. Once the
bottleneck moves to shifting of cells, we may get a whole new slew of
bugreports about that.

Sure, but that's just as much speculation as my own interpretation. To be fair, it is possible that you are right and I am wrong. But I did provide my own interpretation of those numbers based on my own experience and educated guesses. I'm not claiming that I'm right; I'm claiming that what I concluded in my post is my honest and, I hope, reasonably researched opinion.

E.g. copy&paste of a column is very likely to hit a
problem there, IIRC it internally results in a lot of shifting of cells.

Yes, which is why I ran the benchmarks: to get some numbers and more clarity.


One interpretation of the graphs may be that the change helps a lot at the cost of a regression in one place, but another possible interpretation is that the change brings an improvement that can already be mostly achieved using hints, at the expense of a cost that cannot be alleviated. Moreover, we did go over all the reported performance problems related to mdds some months back and fixed all of them (at least I'm not aware of any pending ones). So the real question for me is how many real-world cases will be improved and how many worsened by this, which is why I'd like to see non-artificial benchmarks.
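The trade-off under discussion (lookups that need a hint to be fast versus a shifting cost on insertion) can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not mdds code: `SizeOnlyBlocks`, `PositionedBlocks`, `find` and `grow` are hypothetical names, and the sketch assumes one layout stores only block sizes while the other also stores each block's start row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout A: only block sizes are stored, so locating the block
// that holds a logical row means walking from the front or from a hint.
struct SizeOnlyBlocks {
    std::vector<std::size_t> sizes;

    // Returns the index of the block containing `row`, scanning from block
    // `hint`, whose first row is `hint_start`. Returns sizes.size() if the
    // row lies past the last block.
    std::size_t find(std::size_t row, std::size_t hint = 0,
                     std::size_t hint_start = 0) const {
        assert(row >= hint_start);
        std::size_t start = hint_start;
        for (std::size_t i = hint; i < sizes.size(); ++i) {
            if (row < start + sizes[i])
                return i;
            start += sizes[i];
        }
        return sizes.size();
    }
};

// Hypothetical layout B: each block's start row is stored as well, so lookup
// is a binary search, but growing a block must shift every later start row.
struct PositionedBlocks {
    std::vector<std::size_t> positions; // first row of each block, ascending
    std::vector<std::size_t> sizes;

    // Returns the index of the block containing `row`, or sizes.size() if
    // the row lies outside every block.
    std::size_t find(std::size_t row) const {
        // upper_bound yields the first start row greater than `row`;
        // the block holding `row`, if any, is the one just before it.
        auto it = std::upper_bound(positions.begin(), positions.end(), row);
        if (it == positions.begin())
            return sizes.size();
        const auto i = static_cast<std::size_t>(it - positions.begin()) - 1;
        return row < positions[i] + sizes[i] ? i : sizes.size();
    }

    // Grows block `block` by `count` rows and shifts all later blocks.
    // Returns how many position entries had to be rewritten.
    std::size_t grow(std::size_t block, std::size_t count) {
        assert(block < sizes.size());
        sizes[block] += count;
        for (std::size_t i = block + 1; i < positions.size(); ++i)
            positions[i] += count;
        return positions.size() - block - 1;
    }
};
```

In this sketch a hint removes most of the scan in layout A, while the rewrite loop in `grow` for layout B runs over every later block no matter how the caller behaves. How much either cost matters in Calc is exactly what the sketch cannot tell you.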

So, I'm a bit concerned about your use of the word "artificial" to describe my benchmark, because that word implies that I somehow made those numbers up. Those are real numbers. Now, the numbers will of course be quite different if you measure entire Calc operations, which include a whole bunch of other work, and I believe this is what you are alluding to. I do share your concern there. But I thought it was reasonable to draw the conclusions that I did, given that I/O with mdds::multi_type_vector does constitute a large part of Calc's cell I/O. Also, keep in mind that the rest of the Calc operations are constant, and the only variable is the mdds portion. On this point, I believe it's not unreasonable to draw *some* conclusions based on the numbers on mdds alone.

Having said that, you are of course free to draw your own, different conclusions.

BTW, have you considered using vector operations like SSE for the updates
(either checking whether the compiler can employ them automatically or
hand-writing them)?

Yes. For one, I did look into e.g. OpenMP's auto SIMD support. But its support appeared to be very limited, and MSVC did not seem to support it. I also thought about hand-writing SIMD directly, and I am still considering that as one of my future possibilities (note that I'm not entirely done with this work). But I couldn't think of a good one to use, especially since multi_type_vector uses an array of structures (AoS). The SIMD intrinsics I know of are mostly not suitable for AoS. If you know of good SIMD intrinsics that may work for multi_type_vector, I would be interested.

I've done some SIMD coding in orcus to speed up XML and JSON parsing, but I can't say I'm an expert at it, and I did not always manage to get the code to run faster with SIMD.
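To illustrate why an AoS layout is awkward for SIMD, here is a minimal sketch of the same position-shifting loop over an array of structures and over a structure of arrays (SoA). The `Block` and `BlocksSoA` types and their field names are assumptions made for illustration, not the actual multi_type_vector internals, and whether a compiler vectorizes either loop depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical AoS block record; the field names are illustrative only.
struct Block {
    std::size_t position;
    std::size_t size;
    void* data;
};

// AoS: each position is interleaved with the other fields, so the loop
// touches memory with a stride of sizeof(Block) rather than contiguously.
void shift_aos(std::vector<Block>& blocks, std::size_t first, std::size_t delta) {
    assert(first <= blocks.size());
    for (std::size_t i = first; i < blocks.size(); ++i)
        blocks[i].position += delta;
}

// Hypothetical SoA layout: one contiguous array per field.
struct BlocksSoA {
    std::vector<std::size_t> positions;
    std::vector<std::size_t> sizes;
    std::vector<void*> data;
};

// SoA: the positions are contiguous, so this is a plain "add a constant to
// every element" loop, the shape that packed SIMD adds operate on.
void shift_soa(BlocksSoA& blocks, std::size_t first, std::size_t delta) {
    assert(first <= blocks.positions.size());
    std::size_t* p = blocks.positions.data();
    const std::size_t n = blocks.positions.size();
    for (std::size_t i = first; i < n; ++i)
        p[i] += delta;
}
```

Both functions compute the same result; only the memory layout differs. The sketch makes no claim about measured speed, which would have to be benchmarked on the target compilers.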

Alright, since one person is now raising an objection to hastily integrating this piece, I should hold off on integrating it for now and let the discussion continue.

And, while I would love to craft another benchmark test involving the entire Calc piece, I'm afraid I won't have enough bandwidth to do that. Even running this benchmark on mdds alone took me a month end-to-end. It would be nice to have someone else chip in and conduct another, more thorough and satisfactory benchmark test, if anybody is interested.

Thanks,

Kohei

--
Kohei Yoshida, LibreOffice Calc volunteer hacker
