Quote:
Originally Posted by astrogeek
There are a full spectrum of fuzzy-search tools ...
>But a core genius of Unix and Unix-like OSs is a tool box of small light-weight programs that each do one thing well, text in, text out as the universal interface, trivial filters and pipes mechanism to tie it all together. The fundamental paradigm of DOS/Window$/GUI is different and not desirable for many uses on *nix platforms.
No argument here, except that the search tools I know are not fast when gigabytes of data have to be traversed. Databases are another story; my point is about heavy full-text searches.
>Not just enough, but preferable and superior for many uses!
Agreed, but the less common uses are in no way less important.
>And you say you don't use REGEX, so it is you who is not utilizing the available power.
This time the case is reversed: in my work, the situations needing REGEX fall into those 'not many uses'. Wildcard and fuzzy searches outnumber the exact invocations, and REGEX is rarely needed on natural-language text.
>And it is not at all in evidence that any or all of the Levenshtein implementations on *nix platforms do not use the available processing power... can you back that up?
Not really, not yet. Still, to my knowledge there is no faster fuzzy searcher than Gallowwalker, since it uses Kazahana as its external searcher (a console tool that compiles on both *nix and Windows) running 16 threads. If you know of a fuzzy tool that stands a chance, please give me links.
>But frankly, you need to go to rehab and get over your GUI-for-all-purposes crippling addiction.
I hear you, but I see no substance in your words. You fail to see that I am not pitting grep against Kazahana; both are console searchers. I am pointing out where grep can be strengthened in performance and in features. And don't think that I am against the current grep; I am for its successor under another name.
>And having found your result set, in a GUI, can you trivially pipe it into one or more filters for post processing?
No, I cannot, but this is not needed in the GUI variant.
>A superior speed metric for a single limited context (also not actually substantiated) used as a broad brush criticism of other OSs with fundamentally different contexts is... well, indeed kind of sad.
My criticism is all about tools that are missing in one OS but present in another. It is bidirectional, and it is aimed not at the OS itself but at the support and development lagging behind.
I really want to see something as close as possible to the idea of a one-stop search tool, ready to go in some *nix distro, i.e. preinstalled, built-in. I especially like the live distros, where even a novice could boot from a USB stick and use a powerful searcher with a minimum of time-wasting formalities.
>So, by that description, the GUI shell adds overhead not really needed to complete the task. Go to rehab immediately!
I guess your context is the general usage of grep and the like, yes? But see, the amount of text being traversed is not comparable to that of most grep invocations, where the quickness comes mostly from low latency. I am talking about 2x speedups here, on data 50GB in size (such as https://dumps.wikimedia.org/enwiki/2...ticles.xml.bz2).
Somewhere I gave examples of how a simple exhaustive fuzzy search allowed me to find several instances of a misspelled "Sylvester Stallone" outside the redirection tags. This feature is important and I haven't seen it in any other searcher; it makes it possible to check all usages of given phrases for consistency. And 'exhaustive fuzzy' is nothing like 'exact' mode: the amount of computation is nearly 1000x greater.
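For anyone unsure what 'fuzzy search at each position' means, here is a minimal sketch. It is not Kazahana's code and makes no claim about how Gallowwalker works internally. It assumes plain Levenshtein distance and uses the textbook approximate-matching recurrence (often credited to Sellers), written in Python for brevity rather than speed. The name fuzzy_find and its parameters are made up for the illustration.

```python
def fuzzy_find(pattern, text, max_errors):
    """Return (end_index, distance) for every position in text where some
    substring ending there is within max_errors edits of pattern.
    end_index is 1-based: the match ends just before text[end_index]."""
    m = len(pattern)
    # column[i] holds the edit distance between pattern[:i] and the
    # best-matching substring of text that ends at the current position.
    column = list(range(m + 1))
    hits = []
    for end, ch in enumerate(text, start=1):
        prev_diag = column[0]   # column[i-1] from the previous text position
        column[0] = 0           # a match is allowed to start anywhere
        for i in range(1, m + 1):
            prev_here = column[i]
            cost = 0 if pattern[i - 1] == ch else 1
            column[i] = min(prev_diag + cost,   # match or substitution
                            prev_here + 1,      # extra character in text
                            column[i - 1] + 1)  # missing character in text
            prev_diag = prev_here
        if column[m] <= max_errors:
            hits.append((end, column[m]))
    return hits


if __name__ == "__main__":
    # One missing 'l' is still caught when a single error is allowed.
    print(fuzzy_find("Stallone", "Sylvester Stalone", 1))
```

Every text position costs one pass over the whole pattern, so the work grows as pattern length times text length. That is why this mode is so much heavier than an exact search.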
>To be fair, you have written code that you are obviously proud of, and apparently offered it freely for others to use ...
The right word is not pride but passion. As you can see, I am addicted to speedy textual operations; there is no place for stupid self-importance, only for constant performance gains achieved in plain C.
>But I think that you need to reconsider the claims of usefulness of that tool within the *nix context as compared to other available tools in that context, and the fundamental usage paradigm mis-match that results from the DOS->*nix port.
I see, my passion is being misread as an attempt to replace grep and the like. No! My wish is that the *nix guys who see the results in large-file, many-core scenarios will make a new tool with more features (such as fuzzy search at each position), not necessarily as a GUI but as a supergrep.
I like your poetic signature, but with your rehab urgings you seem to treat a coder who is all about freedom and speed as a sick man. I don't get how you miss the niches where grep can be developed further.
>It was very likely the GUI slowing it down...
Heh-heh, no. The test was on an old 42GB XML file, and the application (with very well and tightly designed controls) ran for 30 minutes with no output, whereas my hitter finished in under 3 minutes. I guess its slow, unoptimized C++ code was the real reason.
>Grep needs a GUI about like...
Again, I am talking about a one-stop searcher that performs faster and has more features, a supergrep in one word. As for the GUI, I see it only as a shell using that same console grep, so there is no tension of any sort here.
To me, the tools that accompany an OS, mostly the GUI ones, are the single most important thing that attracts or repels many users. A fellow member above said he would migrate from Windows to *nix if some applications were available there too. For a long time MS Office was such an attractor, and after Open Office appeared I saw many users start using Linux and abandon Windows altogether. It shows how crucial some applications are for people, and that an alternative, once given, will change their choice. In my case, having a powerful GUI supergrep would make me use *nix as my desktop/laptop OS of choice.