- AuthorPosts
- November 7, 2007 at 8:11 pm #4957 tungwaiyip Member
I’m searching for a string among 2500 files spread over a directory tree. The performance numbers:
Another program: 2 seconds
EmEditor: 9 minutes
This translates to 4.6 files per second for EmEditor vs. 1250 files per second for the other program.
Another interesting observation: there are two types of files, one with the .rst extension and one with the .html extension. The search seems to go much faster with .html files. It takes 65 seconds to search through 1000 .html files, which translates to 15 files per second. Not great, but much better than average. The total size of the .html files actually makes up 75% of the original search.
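The rates quoted in this post follow from simple division. A minimal sketch in Python that reproduces them from the figures given (the function name is made up for illustration):

```python
def files_per_second(file_count: int, seconds: float) -> float:
    """Throughput of a search run, in files per second."""
    return file_count / seconds

# Figures taken from the post above.
emeditor_all = files_per_second(2500, 9 * 60)   # 2500 files in 9 minutes
other_all = files_per_second(2500, 2)           # 2500 files in 2 seconds
emeditor_html = files_per_second(1000, 65)      # 1000 .html files in 65 seconds

print(f"EmEditor, all files:   {emeditor_all:.1f} files/s")   # 4.6
print(f"Other program:         {other_all:.0f} files/s")      # 1250
print(f"EmEditor, .html only:  {emeditor_html:.1f} files/s")  # 15.4
```

The slowdown relative to the other program is therefore roughly 270x (540 seconds vs. 2 seconds).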
November 7, 2007 at 8:27 pm #4959 Yutaka Emura Keymaster
Thanks for the description, but it would be more helpful if you could say which encodings those files are written in, and whether you use regular expressions or escape sequences including newlines. A possible reason for the delay is that EmEditor works internally in Unicode and supports Unicode-aware regular expressions. Please write exactly what you search for and what the files are like; you can also email me sample files and more details at [email protected]
Thanks!
November 7, 2007 at 8:33 pm #4960 Yutaka Emura Keymaster
Also, make sure you write which options (such as Regular Expressions, Match Case, etc.) you checked in the Find in Files dialog box. Using different options can make a significant difference in search speed. Thanks!
November 7, 2007 at 9:16 pm #4961 tungwaiyip Member
I’ve figured it out. The problem is in the “Encoding” listbox. The default, “Configured Encoding”, is the slowest. Choosing anything other than “Configured Encoding”, even something random like Korean, gives good performance.
The other options I have set are “Look in subfolder” and “Use Escape Sequence”. Unfortunately I cannot post my data files, but that is irrelevant: I repeated the search over 1000 files from an open source project and the result is the same.
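For anyone who wants a neutral baseline on their own file set, the following is a minimal sketch in Python of a substring search over a directory tree with one fixed encoding. It is not how EmEditor or the other program works; the function names are made up for illustration, and it compares raw bytes, which is reliable for ASCII and UTF-8 needles but can give false matches for encodings such as UTF-16.

```python
import os
import time
from typing import List, Tuple

def search_tree(root: str, needle: str, encoding: str = "utf-8") -> Tuple[List[str], int]:
    """Return (paths containing needle, number of files scanned)."""
    needle_bytes = needle.encode(encoding)  # encode once, compare bytes
    matches: List[str] = []
    scanned = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable file: skip it
            scanned += 1
            if needle_bytes in data:
                matches.append(path)
    return matches, scanned

def timed_search(root: str, needle: str, encoding: str = "utf-8") -> Tuple[int, int, float]:
    """Return (match count, files scanned, elapsed seconds) for one run."""
    start = time.perf_counter()
    matches, scanned = search_tree(root, needle, encoding)
    elapsed = time.perf_counter() - start
    return len(matches), scanned, elapsed
```

Calling `timed_search` on the same folder used in the Find in Files dialog gives a files-per-second figure to compare against. Results depend heavily on disk caching, so a second run is usually much faster than the first.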
November 8, 2007 at 5:27 pm #4965 Yutaka Emura Keymaster
When “Configured Encoding” is selected in the “Find in Files” dialog box, EmEditor uses the encoding configured in the File tab of the configuration properties associated with the file extension you are searching. By default, the Text configuration uses the “System Default” encoding with UTF-8 and Unicode signature detection. The UTF-8 detection can make the search slower. If “Detect All” is also checked in the File tab of the configuration properties, it may become even slower. Please let me know which options are checked in the File tab of the associated configuration properties.
Also, in which encoding are the files you search encoded? UTF-8, Western European (CP:1252), or something else?
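To illustrate why detection adds work, here is a minimal sketch in Python of signature (BOM) detection followed by a UTF-8 validity check. This is an assumption about how such detection generally works, not EmEditor's actual implementation; the function names and the `cp1252` fallback are chosen for illustration, and only UTF-8 and UTF-16 signatures are handled.

```python
import codecs
from typing import Optional

def detect_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess an encoding: signature first, then a UTF-8 validity check."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"
    try:
        data.decode("utf-8")  # extra full pass over the file, only to validate
        return "utf-8"
    except UnicodeDecodeError:
        return fallback

def contains(data: bytes, needle: str, encoding: Optional[str] = None) -> bool:
    """Search decoded text; skip detection when an encoding is given."""
    chosen = encoding or detect_encoding(data)
    return needle in data.decode(chosen, errors="replace")
```

With an explicit encoding, `contains` decodes each file once; with detection, a file without a signature is scanned once to validate it and again to decode it. That extra pass alone would not account for a gap of 2 seconds versus 9 minutes, so it should be read only as an illustration of the kind of overhead described above.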
November 8, 2007 at 6:58 pm #4966 tungwaiyip Member
In the File tab I have these options selected:
* Prompt if null char found
* Prompt if invalid char
* show file name w/full path
* Detect BOM
* Detect UTF-8
* Prompt at inconsistent returns
* Opening Encoding: UTF-8
The files I have searched should all be ASCII files.
Even if EmEditor is doing more detection, it just doesn’t sound right to me that one program can search all the files in 2 seconds while EmEditor takes 9 minutes. Also, if I select any encoding, say UTF-8, the performance improves dramatically, to a few seconds.
Of course now I have a workaround as stated in my previous sentence :-)
November 8, 2007 at 7:27 pm #4967 Yutaka Emura Keymaster
You should change the Opening Encoding in the File tab of the configuration properties from “UTF-8” to “System Default Encoding”. That should solve this issue.
November 9, 2007 at 1:39 am #4968 shaohao Member
You’d better use registry mode (all settings saved in the Registry) rather than INI file mode (all settings saved in .ini files).
INI file mode will slow down searching in files.
November 9, 2007 at 5:53 pm #4977 tungwaiyip Member
This surely is a bug, isn’t it? There is no explanation for why it would take so long with certain combinations of encoding settings.
Besides, I tried your recommendation. It doesn’t help. Picking “UTF-8” or any other encoding in the Find in Files dialog, however, does solve the problem.
Is it true that choosing .ini for configuration slows things down? That option is one of the things that makes me love EmEditor 7.
November 9, 2007 at 6:10 pm #4978 Yutaka Emura Keymaster
Please try beta 32, and you should find better performance. Thanks!
November 15, 2007 at 7:33 pm #4998 tungwaiyip Member
Upgraded to b32 and found the same issue.
November 15, 2007 at 10:01 pm #4999 Yutaka Emura Keymaster
You should have seen at least some improvement. Can you please give more details that might help me reproduce your issue? Without them, I won’t be able to reproduce it. Thanks!
November 15, 2007 at 10:18 pm #5000 tungwaiyip Member
The procedure is the same as what has been posted in this thread. I see no difference between b29 and b32.
One more data point: I’m using the “exported to USB drive” version. If I just run the version installed in “C:\Program Files”, it does not seem to have this problem.
November 15, 2007 at 11:35 pm #5001 Yutaka Emura Keymaster
EmEditor is optimized for Registry usage, and the USB install will be slower. If speed is a top priority, please consider using the Registry (without INI files). I will optimize a little more for INI files, but it won’t become as fast as the Registry version.