Hello everybody.
Probably the title of this post is not very clear, sorry for that ;).
I have a bunch of text (HTML code) and need to find <p> tags together with
their classes, ids, styles (if any), etc. I'm doing this using one of the
following regexes:
<p(.*?)> or (<p([^>]+))>
My text follows this pattern:
<p class="navi_buttons">Lorem ipsum dolor sit amet, consectetur
adipiscing elit.</p>
<p class="reg">Aliquam mi sapien, rutrum eget sem vel, semper
efficitur.<a href="xyz.html" class="topiclink">vitae velit</a></p>
<p class="THIS_SHOULD_BE_AVOIDED">Donec fringilla sapien vitae interdum
volutpat.</p>
<p class="nav">Cras nec orci non dolor ultrices luctus sit amet vitae
velit.</p>
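
To make the starting point concrete, here is a minimal Python sketch that runs both patterns from above against the sample. Python's re module only stands in for Writer's regex engine here (that substitution is an assumption), and the names SAMPLE, LAZY and NEGATED are made up for illustration. Both patterns capture the attribute string of every <p> tag, including the one that should be skipped.

```python
import re

# The sample paragraphs from this post.
SAMPLE = '''<p class="navi_buttons">Lorem ipsum dolor sit amet, consectetur
adipiscing elit.</p>
<p class="reg">Aliquam mi sapien, rutrum eget sem vel, semper
efficitur.<a href="xyz.html" class="topiclink">vitae velit</a></p>
<p class="THIS_SHOULD_BE_AVOIDED">Donec fringilla sapien vitae interdum
volutpat.</p>
<p class="nav">Cras nec orci non dolor ultrices luctus sit amet vitae
velit.</p>'''

# Pattern 1: lazy match of everything up to the first '>'.
LAZY = re.compile(r'<p(.*?)>')
# Pattern 2: negated character class; group 2 holds the attributes.
NEGATED = re.compile(r'(<p([^>]+))>')

attrs_lazy = [m.group(1) for m in LAZY.finditer(SAMPLE)]
attrs_negated = [m.group(2) for m in NEGATED.finditer(SAMPLE)]

# Both lists contain four entries: neither pattern skips the unwanted class.
print(attrs_lazy)
print(attrs_negated)
```

On this sample the two patterns return the same four attribute strings, so the exclusion has to come from somewhere else in the expression.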
The problem is that I need to find every occurrence of the <p> tag except
those with one certain class (i.e. I want to skip paragraph tags of this
class). I don't know how to write a regex exclusion that is treated as a
whole string rather than as a set of individual characters. I tried to use
back-references, with no success. I want to use a regex because the tag
classes to be avoided are different on each page (but they follow a certain
pattern), and the job should be as automatic as possible (the code should
be as versatile as possible).
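
One technique that treats the excluded text as a whole string is a negative lookahead, written (?!...). The sketch below is in Python and is only an illustration under assumptions: the helper name p_tags_except and the example class pattern [A-Z_]+ are hypothetical, and Python's re module stands in for Writer's engine. As far as I know, Writer's Find & Replace is based on ICU regular expressions, which accept the same (?!...) syntax, but that should be verified in Writer itself.

```python
import re

# The sample paragraphs from this post.
SAMPLE = '''<p class="navi_buttons">Lorem ipsum dolor sit amet, consectetur
adipiscing elit.</p>
<p class="reg">Aliquam mi sapien, rutrum eget sem vel, semper
efficitur.<a href="xyz.html" class="topiclink">vitae velit</a></p>
<p class="THIS_SHOULD_BE_AVOIDED">Donec fringilla sapien vitae interdum
volutpat.</p>
<p class="nav">Cras nec orci non dolor ultrices luctus sit amet vitae
velit.</p>'''


def p_tags_except(html, avoided_class_regex):
    """Return the attribute strings of <p> tags whose class does NOT
    match avoided_class_regex (hypothetical helper for illustration)."""
    pattern = re.compile(
        r'<p\b'                                    # opening tag name, not <pre> etc.
        r'(?![^>]*\bclass="(?:' + avoided_class_regex + r')")'
        #  ^ negative lookahead: fail if the avoided class appears
        #    anywhere before the closing '>' of this tag
        r'([^>]*)>'                                # capture the attributes
    )
    return [m.group(1) for m in pattern.finditer(html)]


# Exclude one literal class name.
kept_literal = p_tags_except(SAMPLE, r'THIS_SHOULD_BE_AVOIDED')

# Exclude every class that follows a pattern (assumed here: only
# upper-case letters and underscores).
kept_by_pattern = p_tags_except(SAMPLE, r'[A-Z_]+')

print(kept_literal)
print(kept_by_pattern)
```

Both calls return the attributes of the navi_buttons, reg and nav paragraphs only. The same expression written out for a search field would be <p\b(?![^>]*\bclass="THIS_SHOULD_BE_AVOIDED")([^>]*)>, with the class name replaced by whatever pattern the avoided classes follow.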
I would appreciate any help. Kind regards,
gordom
Context
- [libreoffice-users] LO Writer and regex - finding "everything" but one thing · gordom