At 09:26 20/01/2014 -0800, Hugo Borrell wrote:
I download big raw data containing that is supposed to show minute
per minute quotation of an index (SP500). The problem is that some
quotations have been included twice in the same minute (at some
seconds of distance), for example here
I tried to clean that manually but it appears to be a huge work. I'd
like to have a smarter way to ask calc to keep only one data of each
minute time, like this :
The hard part is that not all minutes have twins (erasing one out of
two wouldn't do) : here only 13:49 was twinned.
How could that be solved ?
Let's imagine your values are in columns A to F.
o I'm guessing that, if you need only one value per minute, you don't
need to know the seconds value of the timestamp. In a spare column, enter
and fill or copy that down the column. This will round your
timestamps to complete minutes. Now copy this column and paste it
back over column A (replacing your original values), but using Paste
Special... instead of ordinary Paste and ensuring "Paste all" and
Formulas are *not* ticked in the Paste Special dialogue.
o At a convenient place, perhaps elsewhere on the sheet or on another
sheet, create a list of times containing only one value per
minute. (I'm going to assume that these values are also in column A
starting at row n.) To do this, enter the first two values manually
(here 2013-10-11 13:47:00 and 2013-10-11 13:48:00), select both, and
then fill them down the column as necessary. Now all you need to do
is to harvest the appropriate values for subsequent columns.
o In Bn, enter
(with the real row number in place of "n" and the real final row
number in place of "999", of course). First fill this formula down
column B. Then select all the relevant cells in column B (containing
this formula, as modified) and fill these across columns C to F.
How does it work? Then VLOOKUP() function searches the first column
of the range of original data for the first occurrence of the
appropriate timestamp. When it finds a match, it copies the value
from the appropriate later column. Using COLUMN() in place of an
explicit column number allows the formula to adjust automatically for
later columns. (If your cleaned data are not in columns A to F, you
will need to modify this parameter to COLUMN()-1 or whatever.) The
final parameter needs to be zero (or FALSE) to indicate that if there
is no value for that minute, you will (presumably) want to see an
error condition instead of a value unhelpfully harvested from a nearby minute.
I trust this helps.
To unsubscribe e-mail to: email@example.com
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted
Impressum (Legal Info)
: Unless otherwise specified, all text and images
on this website are licensed under the
Creative Commons Attribution-Share Alike 3.0 License
This does not include the source code of LibreOffice, which is
licensed under the Mozilla Public License (MPLv2
"LibreOffice" and "The Document Foundation" are
registered trademarks of their corresponding registered owners or are
in actual use as trademarks in one or more countries. Their respective
logos and icons are also subject to international copyright laws. Use
thereof is explained in our trademark policy