Hylton,
I can offer a few general considerations and thoughts. Your problem
statement appears to contain two objectives:
a) to make consistent the case of all the elements; all Upper Case; all
Lower Case; whatever. It seems the case makes no difference to the
meaning of the record. The case is just an artifact of the typing style of
the data input person(s). In this situation, a mass application of =UPPER
or =LOWER and placing the result in a new sheet would suffice.
I think no amount of special formatting will catch your eye reliably
enough to eliminate every instance of differing case by hand.
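
If a second sheet full of =LOWER formulas feels clumsy, the same case
folding could be scripted. This is only a rough Python sketch, assuming
the sheet has first been saved as CSV; the function name normalize_case
and the three sample rows are made up for illustration.

```python
import csv
import io


def normalize_case(rows, to_upper=False):
    """Return a new list of rows with every cell folded to one case."""
    fold = str.upper if to_upper else str.lower
    return [[fold(cell) for cell in row] for row in rows]


# Tiny in-memory stand-in for the exported sheet (hypothetical data).
sample = io.StringIO("a,hx\nA,hx\na,hy\n")
rows = list(csv.reader(sample))
normalized = normalize_case(rows)
```

For the real file you would open the exported CSV instead of the
in-memory sample and write the result to a new file, leaving the
original untouched.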
b) the deduplication. Once the case issue is resolved you will have rows
with all elements exactly equal. This deduplication should be done by
exporting the 68000 x 10 sheet to a database and running a deduplication
query. Again any of the visual tricks to identify duplicate rows will be
unreliable and you will be guaranteed to miss at least a few.
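
Where no database is at hand, the effect of a deduplication query can be
approximated in a few lines of Python. This is a sketch under the
assumption that the rows are already case-normalized and loaded as lists
of strings; dedupe_rows is a hypothetical helper, not part of any
product mentioned here.

```python
def dedupe_rows(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample: the first, second and fourth rows are identical.
rows = [["a", "hx"], ["a", "hx"], ["a", "hy"], ["a", "hx"]]
unique = dedupe_rows(rows)
```

Only rows that match in every column are treated as duplicates here,
which is the "all elements exactly equal" case described above.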
At this point my advice bogs down. I have past experience with the M$
Access product; recent versions of Access have a pre-written deduplication
query. It is not available to me right now because I have left my
workplace and don't have an installation of that recent version of M$
Suite.
Use caution in the deduplication process; make lots of backups. It is a
delete-type query and data will be lost! Hopefully only the duplicates but
you never know.
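
One way to be cautious is to do a dry run first: count what would be
removed before anything is deleted. The Python sketch below assumes the
rows are already loaded as lists of strings; duplicate_report is a
made-up name for illustration.

```python
from collections import Counter


def duplicate_report(rows):
    """Return {row: extra_copies} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n - 1 for row, n in counts.items() if n > 1}


# Hypothetical sample: one row appears three times, so two copies are extra.
rows = [["a", "hx"], ["a", "hx"], ["a", "hy"], ["a", "hx"]]
report = duplicate_report(rows)
to_delete = sum(report.values())
```

If the number of rows the real delete removes differs from to_delete,
that is a sign to go back to the backup.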
I hope this vague hand waving is of some help to you,
--
David S. Crampton
On Mon, 12 Dec 2011 06:39:31 -0800, Hylton Conacher (ZR1HPC)
<hylton@conacher.co.za> wrote:
Hi,
Using LibreOffice 3.3.1 and am in the process of editing a 68000 row, 10
column file.
There are two main columns that contain the data to be cleaned up with
multiple instances of duplication i.e. the same text but only the text
case differs between two rows or the text is totally different in column
A row 1 and row 2 but the text in column B rows 1 and 2 is identical i.e.
Col A   Col B    OR    Col A   Col B
a       hx             a       hx
A       hx             a       hx
OR
a       hx             a       hx
a       hy             A       hy
etc for the other combinations
I am doing the alphabetical sort via Col A.
I can use find to search for the duplicate record row once I know what I
am looking for; however, determining what text is different when the
values in Col A are the same, and vice versa, is the hard part.
On 136000 cells this is a FAIR mission!
I would like to know if there is a conditional formula I could use that
could highlight the differences in one column when cells in the other
column are the same. I am thinking of a formula that says if the cell
contents are the same as any other cell in a range, apply the
conditional format. Of course this conditional would need to be added
onto all 136000 cells. :(
That way I can highlight the 'error' cells and find them easily and
correct them or add a new row of data.
Any pointers would be appreciated for doing this in Calc as an external
database is not available. What elements of the formula can I
investigate?
Many thanks
Hylton