Tagged: regular expressions
October 14, 2024 at 4:15 pm #30068 David (Participant)
According to the help, \r\n, \r, or \n should be used in Find in Files/Replace in Files dialog box:
To Specify Newline Characters
I use Replace in Files to change the contents of dozens of HTML files. See the following example.
On line 7, the content is: img {
With Find/Replace, img {\n works correctly, but with Find/Replace in Files, img {\r, img {\n and img {\r\n don’t work. Strangely, img \{\n works. What’s the reason for this behavior?
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<style>
img {
  max-width: 100%;
  align-content: center;
  height: auto;
}
.abstract {background-color: #f3ecde;}
</style>
October 15, 2024 at 7:50 am #30069 Patrick C (Participant)
With
\r = carriage return = CR
\n = line feed = LF
Newline characters across different operating systems:
All Linux / Unix: Newline = line feed = LF = \n
Old Mac OS: Newline = carriage return = CR = \r
Modern Mac OS: Newline = line feed = LF = \n
Windows & MS-DOS: Newline = carriage return followed by line feed = CR+LF = \r\n
Modern Windows: Newline = carriage return followed by line feed = CR+LF = \r\n
↳ However, as of 2024 Windows applications are increasingly capable of dealing with LF-only line terminators (even Notepad).
Many tools, EmEditor included, can be configured to let \n detect all newline characters, regardless of type: LF, CR+LF or CR.
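As an illustration of what "detecting all newline characters regardless of type" amounts to, here is a minimal sketch using Python's re module rather than EmEditor. The names samples, ANY_NEWLINE and count_newlines are made up for this example.

```python
import re

# The same three-line text written with each line-ending convention.
samples = {
    "LF": "Line 1\nLine 2\nLine 3",
    "CR+LF": "Line 1\r\nLine 2\r\nLine 3",
    "CR": "Line 1\rLine 2\rLine 3",
}

# CR+LF is tried first so a Windows line ending counts as one newline, not two.
ANY_NEWLINE = re.compile(r"\r\n|\r|\n")


def count_newlines(text: str) -> int:
    """Count line terminators of any type in text."""
    return len(ANY_NEWLINE.findall(text))


for name, text in samples.items():
    print(name, count_newlines(text))
```

Each sample contains two line terminators, so the count is the same for all three conventions.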
In other words, you can stick to \n without having to bother about what the current file’s line terminator is.
Should you want to use \r\n:
In the Find dialog: click Advanced... → Treat CR and LF Separately
Advanced dialog box → Treat CR and LF Separately check box
October 15, 2024 at 12:28 pm #30071 David (Participant)
Hello Patrick C, thanks for your reply.
I understand the difference between CR+LF and CR, LF.
Now I’m talking about the Find/Replace in Files feature. Would you please try the following expressions:
- With the Treat CR and LF Separately check box activated, img {\r\n works. That’s good.
- With the Treat CR and LF Separately check box deactivated, img {\r, img {\n and img {\r\n don’t work at all. But img \{\n (a backslash \ before the brace {) works. So does img {\r?\n?
So I’m confused:
1. Does the Treat CR and LF Separately check box have to be activated for Find/Replace in Files?
2. When the Treat CR and LF Separately check box is deactivated, why do img \{\n (a backslash \ before the brace {) and img {\r?\n? work?
October 15, 2024 at 4:56 pm #30072 Patrick C (Participant)
Hello David
Sorry, I should have read your initial question more carefully, as you did explain that "To be strange, img \{\n works".
I just tested this with EmEditor Version 24.4.0.
When selecting Escape Sequence instead of Regular Expressions, both Find and Find in Files work with img {\n.
When selecting Regular Expressions (with the Boost.Regex engine) …
… and with Treat CR and LF Separately deactivated …
… and searching for img {\n:
● The regular find (not in files) works, regardless of CR+LF or LF line endings.
● Find in Files will not find CR+LF, but it will find LF line endings.
After experimenting a bit …
When using Find in Files with Treat CR and LF Separately deactivated:
As soon as the Find term contains a { or a }, \n seems to find LF exclusively, but not CR+LF. This holds even when { is followed by text before the \n.
Example:
Line 1 { abcd
Line 2 def { abc
Line 3
● Find in Files searching for abc\n works with both LF and CR+LF line endings.
● Find in Files searching for def { abc\n works only with LF line endings.
Adding a closing bracket as in
Line 1 { abcd
Line 2 def {a} abc
Line 3
won’t help either, i.e.
● Find in Files searching for def {a} abc\n works only with LF line endings.
What also won’t help is a complete regex, e.g. find ab{1}c\n.
……
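For search terms like the ones above, spelling out the optional CR gives a pattern that matches both line-ending styles even when \n is taken strictly as LF. The following is a minimal sketch in Python's re module, where \n always means LF only. It is not EmEditor's Boost.Regex engine, and the example text is reconstructed on the assumption that it spans three lines. The names strict, portable and found are made up for this illustration.

```python
import re

# The example text with LF line endings, and the same text with CR+LF.
lf_text = "Line 1 { abcd\nLine 2 def { abc\nLine 3"
crlf_text = lf_text.replace("\n", "\r\n")

# In Python's re module, \n matches LF only.
strict = re.compile(r"def \{ abc\n")
# An optional CR before the LF covers both LF and CR+LF files.
portable = re.compile(r"def \{ abc\r?\n")


def found(pattern: re.Pattern, text: str) -> bool:
    """Return True if pattern occurs anywhere in text."""
    return pattern.search(text) is not None


print("strict, LF:     ", found(strict, lf_text))
print("strict, CR+LF:  ", found(strict, crlf_text))
print("portable, LF:   ", found(portable, lf_text))
print("portable, CR+LF:", found(portable, crlf_text))
```

The strict pattern misses the CR+LF text because a CR sits between abc and the LF, while the \r?\n form matches both.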
It’s as if { or } behaves as a token that lets \n match LF but no longer match CR+LF, even when Treat CR and LF Separately is disabled.
As to why this is: Good question.
I guess that this is for Yutaka to answer.
Cheers,
Patrick
October 16, 2024 at 4:49 am #30073 Patrick C (Participant)
I think I found the root cause:
With Treat CR and LF Separately disabled and the Regular Expressions radio button enabled, EmEditor’s Find in Files will interpret \n strictly as LF as soon as the search term contains a regular expression token or an actual regular expression.
Without a regular expression token, EmEditor behaves as if performing an Escape Sequence type search, presumably because this will speed up Find in Files’ performance.
{ and } are regular expression tokens, but most engines will treat them as regular characters until they are written as a complete regular expression, e.g. a{3} for aaa.
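The point about braces can be checked in an engine of this kind. Below is a small sketch using Python's re module as a stand-in. That Python is representative of "most engines" is an assumption, and nothing here says how Boost.Regex or EmEditor classify a lone brace. The name checks is made up for this illustration.

```python
import re

checks = {
    # "{" with no valid quantifier after it is matched as a literal brace.
    "lone brace is literal": re.search(r"img {\n", "img {\n") is not None,
    # "{3}" is a complete quantifier: exactly three repetitions of "a".
    "a{3} matches aaa": re.fullmatch(r"a{3}", "aaa") is not None,
    # As a quantifier, "a{3}" no longer matches the literal text "a{3}".
    "a{3} rejects the text a{3}": re.fullmatch(r"a{3}", "a{3}") is None,
    # Escaping the braces forces the literal reading.
    "escaped braces are literal": re.fullmatch(r"a\{3\}", "a{3}") is not None,
}

for label, ok in checks.items():
    print(f"{label}: {ok}")
```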
Most of this is in line with EmEditor’s Help, which states:
● Find dialog box → Regex on → \n or \r\n (same meaning)
I.e. \n will match both CR+LF and LF.
● Find in Files dialog box → Regex on → \r\n, \r, or \n (depends on actual newline character)
Elaborated further in EmEditor Help → Tips
which explains that \r strictly matches CR and \n strictly matches LF, …
… what is not explained in the help is that
1) the Find in Files regex search will fall back to an escape sequence search when the search term does not contain at least one regular expression token
2) escaped regular expression tokens, as in \{, are no longer identified as regular expression tokens, with Find in Files then performing an escape sequence type search.
October 18, 2024 at 8:39 am #30082 Yutaka Emura (Keymaster)
1) the Find in Files regex search will fall back to an escape sequence search when the search term does not contain at least one regular expression token
2) escaped regular expression tokens as in \{ , are no longer identified as regular expression tokens, with Find in Files then performing an escape sequence type search.
These are indeed correct. Thank you for sharing your insightful observations.
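The two confirmed observations can be summarized as a toy model. This is a hypothetical sketch of the behavior described in this thread, not EmEditor's implementation. The token set REGEX_TOKENS and the function names are assumptions made for illustration, and the model ignores regex escapes such as \d.

```python
# Hypothetical model of the observed behavior; not EmEditor code.
# Assumed set of characters that count as regular expression tokens.
REGEX_TOKENS = set(".^$*+?()[]{}|")


def has_unescaped_regex_token(term: str) -> bool:
    """Return True if term contains a token that is not backslash-escaped."""
    i = 0
    while i < len(term):
        if term[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if term[i] in REGEX_TOKENS:
            return True
        i += 1
    return False


def find_in_files_mode(term: str) -> str:
    """Model which kind of search Find in Files appears to perform."""
    return "regex" if has_unescaped_regex_token(term) else "escape sequence"


for term in [r"abc\n", r"img {\n", r"img \{\n", r"img {\r?\n?"]:
    print(f"{term!r}: {find_in_files_mode(term)}")
```

Under this model, abc\n and img \{\n fall back to an escape sequence search, where \n matches either line ending, while img {\n and img {\r?\n? are treated as regular expressions, where \n matches LF only.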
October 19, 2024 at 4:52 pm #30084 David (Participant)
Hello Patrick C
Your explanation is so detailed; now it’s clear.