Date: prev next · Thread: first prev next last
2014 Archives by date, by thread · List index

At 09:26 20/01/2014 -0800, Hugo Borrell wrote:
I download big raw data containing that is supposed to show minute per minute quotation of an index (SP500). The problem is that some quotations have been included twice in the same minute (at some seconds of distance), for example here

2013-10-11 13:47:51;1701.0601;1701.21;1701;1701;696500
2013-10-11 13:47:57;1701.02;1701.21;1700.99;1700.99;644100
2013-10-11 13:48:51;1700.97;1700.97;1700.7;1700.7;496500
2013-10-11 13:49:51;1700.67;1700.67;1700.53;1700.53;369700
2013-10-11 13:49:57;1700.6801;1700.6801;1700.52;1700.52;374800
2013-10-11 13:50:57;1700.51;1700.51;1700.35;1700.37;441100

I tried to clean that manually but it appears to be a huge work. I'd like to have a smarter way to ask calc to keep only one data of each minute time, like this :

2013-10-11 13:47:51;1701.0601;1701.21;1701;1701;696500
2013-10-11 13:48:51;1700.97;1700.97;1700.7;1700.7;496500
2013-10-11 13:49:51;1700.67;1700.67;1700.53;1700.53;369700
2013-10-11 13:50:51;1700.47;1700.47;1700.35;1700.35;444900

The hard part is that not all minutes have twins (erasing one out of two wouldn't do) : here only 13:49 was twinned.

How could that be solved ?

Let's imagine your values are in columns A to F.

o I'm guessing that, if you need only one value per minute, you don't need to know the seconds value of the timestamp. In a spare column, enter
and fill or copy that down the column. This will round your timestamps to complete minutes. Now copy this column and paste it back over column A (replacing your original values), but using Paste Special... instead of ordinary Paste and ensuring "Paste all" and Formulas are *not* ticked in the Paste Special dialogue.

o At a convenient place, perhaps elsewhere on the sheet or on another sheet, create a list of times containing only one value per minute. (I'm going to assume that these values are also in column A starting at row n.) To do this, enter the first two values manually (here 2013-10-11 13:47:00 and 2013-10-11 13:48:00), select both, and then fill them down the column as necessary. Now all you need to do is to harvest the appropriate values for subsequent columns.

o In Bn, enter
(with the real row number in place of "n" and the real final row number in place of "999", of course). First fill this formula down column B. Then select all the relevant cells in column B (containing this formula, as modified) and fill these across columns C to F.

How does it work? Then VLOOKUP() function searches the first column of the range of original data for the first occurrence of the appropriate timestamp. When it finds a match, it copies the value from the appropriate later column. Using COLUMN() in place of an explicit column number allows the formula to adjust automatically for later columns. (If your cleaned data are not in columns A to F, you will need to modify this parameter to COLUMN()-1 or whatever.) The final parameter needs to be zero (or FALSE) to indicate that if there is no value for that minute, you will (presumably) want to see an error condition instead of a value unhelpfully harvested from a nearby minute.

I trust this helps.

Brian Barker

To unsubscribe e-mail to:
Posting guidelines + more:
List archive:
All messages sent to this list will be publicly archived and cannot be deleted


Privacy Policy | Impressum (Legal Info) | Copyright information: Unless otherwise specified, all text and images on this website are licensed under the Creative Commons Attribution-Share Alike 3.0 License. This does not include the source code of LibreOffice, which is licensed under the Mozilla Public License (MPLv2). "LibreOffice" and "The Document Foundation" are registered trademarks of their corresponding registered owners or are in actual use as trademarks in one or more countries. Their respective logos and icons are also subject to international copyright laws. Use thereof is explained in our trademark policy.