

On 09/12/19 19:14, Aditya Parameswaran wrote:
    The idea of converting to SQL queries is an interesting one, but I
    find it very hard to believe it would provide any performance
    advantage at the same memory footprint. Furthermore, I'd be interested
    to know how you do other spreadsheet operations (row & column
    insertion, addressing, and dependency work) on top of a SQL database
    with any efficiency.


We started by having the relational database be a simple persistent
storage layer which, when coupled with an index to retrieve data by
position, allows us to scroll through large datasets of billions of rows
with ease. We developed a new positional index to handle insertions and
deletions in O(log(n)) (https://arxiv.org/pdf/1708.06712.pdf). I agree
that pushing the computation to the relational database does have
overheads; but at the same time, it allows for scaling to arbitrarily
large datasets.
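
For readers who have not met the idea, a positional index maps "row number" to a stored record and keeps that mapping cheap to update when rows are inserted or deleted above it. The sketch below is not the structure from the linked paper; it swaps in an implicit treap (a balanced tree that stores subtree sizes) purely to illustrate how O(log(n)) expected-time positional insert, delete, and lookup can work. The class name `PositionalIndex` and the idea that the stored values would be database row identifiers are assumptions made for illustration.

```python
import random


class _Node:
    __slots__ = ("value", "priority", "size", "left", "right")

    def __init__(self, value):
        self.value = value                # e.g. a database row id (assumption)
        self.priority = random.random()   # random heap priority keeps the tree balanced
        self.size = 1                     # number of nodes in this subtree
        self.left = None
        self.right = None


def _size(node):
    return node.size if node else 0


def _update(node):
    node.size = 1 + _size(node.left) + _size(node.right)
    return node


def _split(node, k):
    """Split a tree into (first k items, remaining items)."""
    if node is None:
        return None, None
    if _size(node.left) >= k:
        left, right = _split(node.left, k)
        node.left = right
        return left, _update(node)
    left, right = _split(node.right, k - _size(node.left) - 1)
    node.right = left
    return _update(node), right


def _merge(a, b):
    """Concatenate two trees, keeping every item of a before every item of b."""
    if a is None:
        return b
    if b is None:
        return a
    if a.priority > b.priority:
        a.right = _merge(a.right, b)
        return _update(a)
    b.left = _merge(a, b.left)
    return _update(b)


class PositionalIndex:
    """Hypothetical sketch: position -> value with O(log n) expected updates."""

    def __init__(self):
        self._root = None

    def __len__(self):
        return _size(self._root)

    def insert(self, pos, value):
        """Insert value so that it ends up at position pos (0-based)."""
        left, right = _split(self._root, pos)
        self._root = _merge(_merge(left, _Node(value)), right)

    def delete(self, pos):
        """Remove and return the value at position pos."""
        if not 0 <= pos < len(self):
            raise IndexError(pos)
        left, rest = _split(self._root, pos)
        target, right = _split(rest, 1)
        self._root = _merge(left, right)
        return target.value

    def get(self, pos):
        """Return the value at position pos by walking down subtree sizes."""
        node = self._root
        while node is not None:
            left_size = _size(node.left)
            if pos < left_size:
                node = node.left
            elif pos == left_size:
                return node.value
            else:
                pos -= left_size + 1
                node = node.right
        raise IndexError(pos)


index = PositionalIndex()
for i, row_id in enumerate(["r0", "r1", "r2"]):
    index.insert(i, row_id)
index.insert(1, "new")   # every later row shifts down without being renumbered
```

The point of the design is that no row stores its own position, so inserting a row near the top does not require rewriting the positions of the rows below it.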

"the quickest way to optimise database access is to ditch first normal
form".

A provocative statement I know, but I'm very much in the NoSQL camp. I
can hunt up the details of a face-off between Oracle and Cache, where
Oracle had to "cheat" (things like deferring index updates) to achieve
100K tpm, whereas Cache blasted through 250K barely breaking a sweat ...
(the unit might well have been tps rather than tpm)

The maths supports this ...

That said, a spreadsheet is inherently first normal form, so tying a
spreadsheet and a relational database together MAY make sense.

In general though, Einstein said "make things as simple as possible BUT
NO SIMPLER". Relational oversimplifies the database side, which means
the application side is over-complex in order to compensate. (Which is
why Cache blew Oracle out of the water.)
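
As a rough illustration of what "ditching first normal form" means in practice (it says nothing about the benchmark figures above): first normal form requires every field to hold a single atomic value, so a multi-valued field has to move into its own table and be reassembled by the application. The customer data, field names, and the `load_customer` helper below are all invented for the example.

```python
# Non-1NF: the multi-valued "phones" field lives inside the record itself.
nested = {"id": 1, "name": "Ann", "phones": ["555-0100", "555-0101"]}

# 1NF: only atomic values per field, so phone numbers get their own table.
customers = [(1, "Ann")]
phones = [(1, "555-0100"), (1, "555-0101")]


def load_customer(customer_id, customers, phones):
    """Rebuild the nested view from 1NF rows (the application-side join)."""
    name = next(n for cid, n in customers if cid == customer_id)
    numbers = [p for cid, p in phones if cid == customer_id]
    return {"id": customer_id, "name": name, "phones": numbers}


rebuilt = load_customer(1, customers, phones)
```

Here `rebuilt` equals `nested`; the extra work in `load_customer` is the kind of compensation on the application side that the paragraph above is describing.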

I'm quite happy to wax lyrical, but I'd rather not preach to an audience
who aren't at least interested. Feel free to ask me to continue on list,
or contact me privately, and I'll try to prove everything as
mathematically as I can :-)

but at the same time, it allows for scaling to arbitrarily
large datasets.

At the price of massive unnecessary complexity.

Cheers,
Wol
