- AuthorPosts
August 24, 2011 at 5:10 pm #9594 Deipotent Participant
Further to my post on a regular expression to delete duplicate lines, and your reply that \r is ignored, it would be useful if EmEditor stripped \r out of regexes. Obviously this is a bit more complicated than just removing “\r”, since the \r might be in parentheses or referred to afterwards (as is the case in the linked regex).
http://www.emeditor.com/modules/newbb/viewtopic.php?topic_id=3&forum=3&post_id=5812#forumpost5812
Also, how come a search with a regex of “\r” matches every character?
August 24, 2011 at 8:02 pm #9599 Stefan Participant
> “it would be useful if EmEditor stripped out \r from regexes”
I think that would be too smart and unexpected for many.
The user just has to learn how it works.
If it is not there already, there should be an explanation including an example in the help; that should be enough.
August 25, 2011 at 1:08 am #9602 Deipotent Participant
I don’t mean strip it out of the edit box, I mean just strip it out of the regex that EmEditor will use internally, so the user wouldn’t notice any difference, but these valid regexes would work.
At the moment, some valid regexes are reported as incorrect by EmEditor (see the link to my other post for an example of finding duplicates).
August 25, 2011 at 12:54 pm #9603 Stefan Participant
Deipotent wrote:
> I don’t mean strip it out of the edit box, I mean just strip it out of the regex
Yes, basically the very same.
I (the user) try to do something, and the app does something else behind the scenes.
And I (the user) get the impression that I did it right, which is not the truth.
I know that this is common (see IE, which automatically adds http:// in front of a URL, or how Win7 lies with UAC virtualization),
but I do not want to see that more often.
August 25, 2011 at 2:20 pm #9604 Deipotent Participant
> And I (the user) get the impression that I did it right, which is not the truth.
But you would have done it right, with a valid regex. Currently, EmEditor errors out when some valid regexes are used.
I think it’s preferable to make EmEditor adhere to standards, rather than making people who are familiar with regexes modify them just to get them to work with EmEditor.
To use your analogy, EmEditor currently behaves like the IE of old, where people have to put in workarounds to get a regex to work. My proposal is to make EmEditor more like the latest versions of IE, which are more standards compliant, and so work with valid regexes.
August 25, 2011 at 5:46 pm #9605 Deipotent Participant
After thinking about this again, I re-examined the regex in my linked post for finding duplicate lines, and realised that there appears to be a bug in EmEditor’s regex parser.
My suggestion for ignoring \r already appears to be implemented. Let me give some examples to explain:
1) Even if EmEditor didn’t ignore \r, the following regex should work, as the part “\r?” says to match zero or one occurrences of \r. So it should match whether there is a \r or not. But it doesn’t.
^(.*)(\r?\n\1)+$
2) However, if you remove the ‘?’, so that the regex becomes as follows, it works, and matches duplicate lines:
^(.*)(\r\n\1)+$
3) It also works if you keep the original regex, but replace “\r” with “(\r)”:
^(.*)((\r)?\n\1)+$
So, the bug appears to be that EmEditor has a problem applying ‘?’ when it follows a \r. It already seems to just remove the “\r”, so the original regex ends up being turned into
^(.*)(?\n\1)+$
which doesn’t work.
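For comparison, here is a minimal sketch of how the duplicate-line pattern from example 1 (written with its backslashes, `^(.*)(\r?\n\1)+$`) behaves in an engine that does honour \r. It uses Python's standard `re` module, not EmEditor's engine, and the helper names `collapse_duplicates` and `compiles` are made up for illustration. The last check only shows that the pattern with “\r” stripped out is malformed in Python; that EmEditor fails for the same reason is an assumption.

```python
import re

# Pattern from example 1, with multi-line mode so ^ and $ anchor at each line.
DUPLICATES = re.compile(r"^(.*)(\r?\n\1)+$", re.MULTILINE)


def collapse_duplicates(text):
    """Replace each run of identical consecutive lines with a single copy."""
    return DUPLICATES.sub(r"\1", text)


def compiles(pattern):
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(repr(collapse_duplicates("a\na\nb\n")))    # LF line endings
print(repr(collapse_duplicates("a\r\na\r\nb")))  # CRLF line endings
print(compiles(r"^(.*)(\r?\n\1)+$"))             # pattern as written
print(compiles(r"^(.*)(?\n\1)+$"))               # pattern with \r removed
```

If an engine really does drop “\r” before parsing, the leftover “(?” starts an extension group, which would explain why the `?` seems to be the problem.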
September 22, 2011 at 11:37 pm #9663 CrashNBurn Member
The (? syntax is something entirely different.
For example, given the following text:
This is a test sentence
This is a test sentence
of a regex string
of a regex string
Try this regex:
Search: ^(.*)(?|(a test )|(a regex ))(.*)$
Replace: \3
With the (?| syntax, there will only be 3 pieces, since the (?|…) wrapper itself isn’t counted as a bracket set and the alternatives inside it share one group number.
For the first line:
\1: This is_
\2: a test_
\3: sentence
The lines will be replaced with “sentence” and “string”.
Now without the (?| syntax, e.g.:
Search: ^(.*)((a test )|(a regex ))(.*)$
that regex would contain 5 pieces that would be matched with \1 thru \5, and some of them would be empty, depending on the line.
For the first line:
\1: This is_
\2: a test_
\3: a test_
\4:
\5: sentence
The (? syntax becomes important if you use more complex regexes, because a regex matches everything it possibly can when things like “.*” are present.
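The five-piece numbering listed above for the plain-alternation pattern can be checked with a short sketch. This uses Python's standard `re` module, which rejects the (?| syntax, so only the version without it is shown; `numbered_groups` is a made-up helper name. Python reports a bracket set that took no part in the match as `None` rather than as an empty string.

```python
import re

# Plain alternation without (?|...): every bracket pair gets its own number.
FIVE_GROUPS = re.compile(r"^(.*)((a test )|(a regex ))(.*)$")


def numbered_groups(line):
    """Return the five captures as a tuple, or None if the line does not match."""
    match = FIVE_GROUPS.match(line)
    return match.groups() if match else None


for line in ("This is a test sentence", "of a regex string"):
    print(numbered_groups(line))
    # Without branch reset, the trailing text is group 5 rather than group 3.
    print(FIVE_GROUPS.sub(r"\5", line))
```

In this engine the replacement has to refer to group 5 to get “sentence” and “string”, which is the renumbering that (?| avoids.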
Example 2A:
Search: ^(.*)( is a | a regex )?
Replace: \1_
That regex will match the whole line, and the replace won’t remove anything from it: the .* consumes the whole line, and since the part in (brackets)? is optional, the greedy regex won’t see it.
Example 2B:
Search: ^(.*)(?|( is a )|( a regex ))
Replace: \1_
This regex will actually remove “ is a ” and “ a regex ” from the lines in question.
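The greedy behaviour in Example 2A can be reproduced with a small sketch in Python's standard `re` module. Since `re` does not accept (?|, the second pattern here is not Example 2B itself but the same alternation with the trailing `?` removed; the names `OPTIONAL` and `REQUIRED` are made up, and the underscore in the replacement is taken literally.

```python
import re

OPTIONAL = re.compile(r"^(.*)( is a | a regex )?")  # Example 2A
REQUIRED = re.compile(r"^(.*)( is a | a regex )")   # same alternation, group no longer optional

for line in ("This is a test sentence", "of a regex string"):
    match = OPTIONAL.match(line)
    # With the optional group, .* keeps the whole line and group 2 is unused.
    print(repr(match.group(1)), match.group(2))
    print(repr(OPTIONAL.sub(r"\1_", line)))
    # With the required group, .* must backtrack to leave room for the alternation.
    print(repr(REQUIRED.sub(r"\1_", line)))
```

In this sketch the change in behaviour comes from the group being mandatory: once it can no longer match nothing, `.*` has to give characters back.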
A lot of editors don’t even implement the full regex spec, e.g. Notepad2 does a small subset which doesn’t even handle the optional “?” correctly, and certainly doesn’t support the (?| syntax.