Re: [libreoffice-users] Multiple newlines
- From: Brian Barker <firstname.lastname@example.org>
- Date: Mon, 20 Jul 2020 23:49:30 +0100
- To: <email@example.com>
At 16:38 20/07/2020 -0400, John Kaufmann wrote:
Documents archived in Project Gutenberg are typically simple text, with each line ending in <CR><LF> (Hex:0D0A), so that paragraphs are separated by an empty line <CR><LF><CR><LF>. I thought it would be simple to convert one such (5657.txt) to format in Writer, ...
... but stumbled on elementary problems in Find-&-Replace [Ctrl-H] using regular expressions:
(1) "\n" is not found. Should not "\n" match one of the codes in <CR><LF>? [If not, what code(s) should "\n" match?]
First, once you have your text in a word processor, you do not have <CR> or <LF> or <CR><LF> or anything else like that in your text; instead you have *paragraph breaks*. There is no character there, despite the pilcrow that you can get Writer to display. And what you are calling "empty lines" are actually empty paragraphs. "\n" in the "Search for" field matches line breaks, not paragraph breaks. (And line breaks are just line breaks - also no "codes".)
(2) Although "$" is found (matches to <CR><LF>), ...
No, "$" does not match anything; instead, it anchors the expression before it to the end of a paragraph. So an expression ending with "$" will match text only if it comes at the end of its paragraph.
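The same idea can be seen outside Writer. The following is only an analogy using Python's re module, which is not the regular-expression engine Writer uses, so details may differ; the helper names are made up for illustration. It shows that a bare "$" matches a position (zero characters), and that a pattern ending in "$" matches only text at the very end:

```python
import re

def dollar_matches(paragraph: str):
    """Return (start, end, matched_text) for every match of a bare "$".

    Illustrative helper: "paragraph" is assumed to be a plain string
    with no newline characters in it.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"$", paragraph)]

def ends_with(word: str, paragraph: str) -> bool:
    """True if "word" sits at the very end of the paragraph text."""
    return re.search(re.escape(word) + r"$", paragraph) is not None

# The single match is empty: it starts and ends at the same position.
print(dollar_matches("some text"))
print(ends_with("text", "some text"))
print(ends_with("some", "some text"))
```

The match for "$" alone has identical start and end offsets and an empty matched string: there is nothing there to "find", only a place where the preceding pattern must finish.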
... "$$" (for successive occurrences of <CR><LF>) is not found. Why?
"$$" makes no sense. If anything it means "this pattern needs to match something that is *really, really* at the end of a paragraph"!
(3) Doing Find "$" & Replace with " " (single space), <CR><LF> is replaced by " " (single space). However, doing Find "$" & Replace with "@" (single @char), <CR><LF> is replaced by "@@" (double @char). Why?
I don't think that's true. In any case, there are no <CR><LF>s present.
To achieve what you want:
First combine single-line paragraphs:
o Apply Default paragraph style to all the text.
o Select all the text.
o Apply AutoCorrect.
(You may need to adjust the minimum length of such paragraphs in AutoCorrect Options - possibly to 0%.)
Then remove empty paragraphs:
o Search for "^$" (no quotes) and replace with nothing.
("^" anchors your pattern to the start of a paragraph and "$" to the end. So "^$" matches a paragraph with nothing in it.)
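If you would rather prepare the file before it ever reaches Writer, the two steps above (combine the wrapped lines, then drop the empty ones) can also be sketched in Python. This is a different route from the AutoCorrect method, not a description of what Writer does, and the function name is hypothetical. It assumes paragraphs are separated by one or more empty lines, as described for the Gutenberg text, and it will also merge lines that were meant to stay separate (verse, tables, title pages), so check the result:

```python
import re

def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines into one line per paragraph.

    Assumes (as in the Gutenberg file described above) that a paragraph
    is a run of non-empty lines and that paragraphs are separated by
    one or more empty lines.
    """
    # Treat <CR><LF> line endings as plain newlines.
    text = text.replace("\r\n", "\n")
    # Split wherever one or more blank lines occur.
    blocks = re.split(r"\n\s*\n", text.strip())
    # Within each block, join the wrapped lines with single spaces.
    paragraphs = [
        " ".join(line.strip() for line in block.split("\n"))
        for block in blocks
    ]
    # One paragraph per line, with no empty lines left over.
    return "\n".join(paragraphs) + "\n"

sample = "First line\r\nsecond line.\r\n\r\nNext paragraph\r\ncontinues.\r\n"
print(unwrap_paragraphs(sample))
```

Text saved this way should open in Writer with one paragraph per line of the file, leaving no single-line paragraphs to combine and no empty paragraphs to remove.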
I trust this helps.