

At 15:06 27/06/2011 +0100, Séamas Ó Brógáin wrote:
I wonder if some regex wizard knows of a way to delete HTML (or XML or whatever) tags from a text file: in other words, selecting everything from each occurrence of < to the next occurrence of >.

I'm not sure I qualify as a wizard, but I'll rise to the bait! Try searching for
<[^>]*>
and replacing with nothing. Inside square brackets, a leading circumflex means "not", so [^>] matches any single character that is not ">". The asterisk extends this to match zero or more such characters. Placed between "<" and ">", the whole expression matches everything from each "<" up to the next ">", which is what you need.
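
For anyone who would rather try the same pattern outside the Find & Replace dialog, here is a minimal sketch in Python using the standard re module. The function name strip_tags and the sample string are made up for illustration and are not part of the original suggestion. Note that the pattern knows nothing about HTML itself: a ">" inside an attribute value or a comment would end the match early, so treat this as a quick clean-up rather than a real parser.

```python
import re

# Same pattern as above: "<", then zero or more characters that are not ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(text: str) -> str:
    """Return text with every <...> span replaced by nothing (hypothetical helper)."""
    return TAG_PATTERN.sub("", text)


if __name__ == "__main__":
    sample = "<p>Hello, <b>world</b>!</p>"
    print(strip_tags(sample))  # Hello, world!
```

A "<" that is never followed by ">" is left alone, because the pattern requires the closing ">" in order to match.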

I trust this helps.

Brian Barker


