At 15:06 27/06/2011 +0100, Séamas Ó Brógáin wrote:
> I wonder if some regex wizard knows of a way to
> delete HTML (or XML or whatever) tags from a
> text file: in other words, selecting everything
> from each occurrence of < to the next occurrence of >.

I'm not sure I qualify as a wizard, but I'll rise
to the bait! Try searching for
<[^>]*>
and replacing with nothing. As the first character
inside (square) brackets, the circumflex means "not",
so [^>] matches any single character which is not
">". The asterisk extends this to match zero or
more such characters. With "<" in front and ">"
behind, the whole expression matches each tag, from
its "<" to the next ">".
I trust this helps.
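
For anyone who would like to try the expression outside the Find & Replace dialogue, here is a minimal sketch in Python. It is only an illustration of the same pattern, not part of the original advice: the function name strip_tags and the sample text are made up for the example, and Python's re module is assumed to treat this particular pattern the same way.

```python
import re

# The pattern from the message: "<", then zero or more characters
# that are not ">", then the closing ">".
TAG = re.compile(r"<[^>]*>")


def strip_tags(text: str) -> str:
    """Delete every <...> run from text (strip_tags is an illustrative name)."""
    return TAG.sub("", text)


sample = "<p>Hello, <b>world</b>!</p>"
print(strip_tags(sample))  # Hello, world!
```

A lone "<" with no ">" after it is left untouched, since the pattern needs both ends in order to match.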
Brian Barker