I asked a question on this list earlier about soffice, and this topic is the reason I was asking.

Let's say we can use the command line 1) to convert an .odt file to an .fodt file, and 2) to convert it back again.

If so, I can write a script that uses soffice to do
1) as above
perl -e .... # perform the text substitutions
2) as above to re-establish the modified .odt file.
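
A rough sketch of what such a script might look like follows. It assumes soffice is on the PATH and that its --headless, --convert-to and --outdir options can round-trip between .odt and .fodt on your installation; I have not verified that nothing is lost on the way. The names clean_odt and strip_index_marks are made up for illustration, and sed stands in for the perl step, using the same pattern as the how-to quoted below.

```shell
#!/bin/sh
# Sketch only. Assumes `soffice` is on PATH and that --convert-to can
# go .odt -> .fodt -> .odt on this installation (an assumption).

# Remove simple alphabetical index marks from flat XML read on stdin.
# Same pattern as the sed command in the quoted how-to.
strip_index_marks() {
    sed 's/<text:alphabetical-index-mark text:string-value="\([A-Za-z]*\)"\/>//g'
}

# clean_odt is a hypothetical helper name, not part of LibreOffice.
clean_odt() {
    src=$1                               # e.g. /path/to/book.odt
    base=$(basename "$src" .odt)
    work=$(mktemp -d) || return 1

    # 1) .odt -> .fodt in a scratch directory
    soffice --headless --convert-to fodt --outdir "$work" "$src" || return 1

    # perform the text substitutions
    strip_index_marks < "$work/$base.fodt" > "$work/$base-clean.fodt" || return 1

    # 2) .fodt -> .odt
    soffice --headless --convert-to odt --outdir "$work" "$work/$base-clean.fodt" || return 1

    # Write the result next to the original, never over it.
    cp "$work/$base-clean.odt" "$(dirname "$src")/$base-clean.odt"
}

# Run only when a file name is given, e.g.:  sh clean-index.sh book.odt
if [ "$#" -ge 1 ]; then
    clean_odt "$1"
fi
```

The original .odt is left untouched; the result lands beside it as NAME-clean.odt so the two can be compared before anything is replaced.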

Peter West
"Other seed fell among thorns, and the thorns grew up and choked it..."

On 29/01/2014 1:45 am, CVAlkan wrote:
In previous posts, I described how Writer adds extra index markers
when updating an Alphabetical Index. One side effect of this behavior
is that, even if an item is later removed from the concordance file,
the marker remains in the text, and therefore in the index.

So, here's how to remove all of the index markers from a Writer
document so you can start with a clean slate. To do this, you will
need to be running LibreOffice on some flavor of Linux/Unix, or at
least on a system that has a command line or some text editor with
"sed" or "grep" capabilities.

1: Make a backup of your Writer document. You know the consequences
   if something goes amiss.

2: Open the document in Writer, and choose Save As "OpenDocument Text
   (Flat XML) (fodt)". This creates an uncompressed XML version of the
   document. On my system (Ubuntu), I was unable to decompress the odt
   version, as the OS complained it was malformed, but using the
   native capability is always a better idea.

3: Close the document and exit Writer.

4: Open a command line shell, preferably in the directory containing
   the fodt file.

5: Run the following command (all on one line):

   sed 's/<text:alphabetical-index-mark text:string-value="\([A-Za-z]*\)"\/>//g' < Old_File_Name_and_Path.fodt > New_File_Name_and_Path.fodt

   Depending on the file size and processor speed, this may take a
   bit. If this gives errors, you're on your own.

6: Close the command line shell.

7: Open the new "cleansed" fodt file with Writer.

8: The file should look the same but without any alphabetical index
   markers. (Your index formatting is still there, though.)

9: Go to where your alphabetical index is located, right-click on it
   and select "Update Index/Table".

A: All of the index entries should disappear; if any remain, go find
   them on the referenced pages and manually delete them. Apparently,
   some of the indexes are embedded in others and aren't found by the
   sed command above. I didn't bother to try figuring out how or why
   that happened. I had several hundred markers, of which only five
   weren't removed.

B: Now, go back to the index and select Edit Index/Table, then
   File | Open.

C: Select the original concordance file (assuming you have it set up
   how you want it), and let Writer go do its thing.

D: You now have a "clean" document with no duplicate index entries.

E: LOOK AT IT CAREFULLY, of course, before replacing your original.
   The document I tried this on was over four hundred pages with lots
   of tables, graphics and so forth, and I found no problems, but
   it's up to you to determine if everything is ok.
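
As a hedged aside on the markers that survived in step A: the pattern in step 5 only matches entries made up entirely of the letters A-Z and a-z, so a marker whose text contains a space, digit or punctuation, or one that carries extra attributes, would be skipped. Whether that explains the five leftovers is a guess. The sketch below counts the markers before and after, and uses a broader pattern. The function names count_marks and strip_all_marks are invented for illustration, and the pattern assumes no literal ">" appears inside an attribute value.

```shell
#!/bin/sh
# Sketch only: a broader variant of the step 5 command, plus a counter.

# count_marks FILE: number of point index marks found in FILE.
# The trailing space in the pattern skips the ranged -start/-end marks.
count_marks() {
    grep -o '<text:alphabetical-index-mark ' "$1" | wc -l | tr -d ' '
}

# strip_all_marks: remove every point index mark read on stdin,
# whatever its attributes contain (assumes no raw ">" inside them).
strip_all_marks() {
    sed 's/<text:alphabetical-index-mark [^>]*\/>//g'
}

# Usage:  sh strip-marks.sh Old_File.fodt New_File.fodt
if [ "$#" -eq 2 ]; then
    echo "markers before: $(count_marks "$1")"
    strip_all_marks < "$1" > "$2"
    echo "markers after:  $(count_marks "$2")"
fi
```

If the "after" count is not zero, the remaining markers are probably of a form neither pattern covers, and deleting them by hand as in step A is still the fallback.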

I hope this helps any others who might be using alphabetic indexes.




-- View this message in context:
http://nabble.documentfoundation.org/Removing-Index-Markers-from-Writer-a-How-To-tp4094327.html


Sent from the Users mailing list archive at Nabble.com.



