LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 08-17-2020, 02:24 AM   #1
blason
Member
 
Registered: Feb 2016
Posts: 114

Rep: Reputation: Disabled
How do I achieve the results below in a single iteration?


Hi Folks,

I have hash files that consist of MD5/SHA1/SHA256 hashes. I would like to separate them by type in a single iteration, and I am clueless about how to achieve that.

Running 3 iterations (one for MD5, then one for SHA1, and so on) would probably work, but then I would have to read the file 3 times, which I want to avoid.

I was thinking of a while read loop, but I am a bit clueless here.

Here is my file, for example:
Quote:
6cc9625971accf65e2cf2c03ebf47260939a39f50a7e12d0ed31b4b2f59aaba6
6ccd25b1b194da5e637f35db48bf2d4c44adad1af31162e02b02ffff087b27dd
7B55EDB28DEE19EB0CDAFBF7A306F9153DE00900
81B523E430B9968631D5453A4625FDE9
A3729FB27CAE5EF165BD04E879979DEB478B53EEA29D343D5165FA14CBF16ADF
My regexes are:

Quote:
cat t | grep -E '\b([a-z0-9A-Z]{32})\b'
cat t | grep -E '\b([a-z0-9A-Z]{64})\b'
cat t | grep -E '\b([a-z0-9A-Z]{40})\b'
 
Old 08-17-2020, 02:52 AM   #2
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,352

Rep: Reputation: 366
Hi

Wouldn't simply sorting the file by line length work?

https://stackoverflow.com/questions/...cluding-spaces
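A minimal sketch of the sort-by-length idea, assuming the sample hashes from the first post are saved in a file called hashes.txt (that name and sorted_by_length.txt are made up for illustration). It reads the input once, and it relies on the -s (stable) option of sort, which GNU and BSD sort provide:

```shell
#!/bin/sh
# Sample input copied from the first post; the file name is an assumption.
cat > hashes.txt <<'EOF'
6cc9625971accf65e2cf2c03ebf47260939a39f50a7e12d0ed31b4b2f59aaba6
6ccd25b1b194da5e637f35db48bf2d4c44adad1af31162e02b02ffff087b27dd
7B55EDB28DEE19EB0CDAFBF7A306F9153DE00900
81B523E430B9968631D5453A4625FDE9
A3729FB27CAE5EF165BD04E879979DEB478B53EEA29D343D5165FA14CBF16ADF
EOF

# 1. awk prefixes every line with its length.
# 2. sort orders numerically on that prefix only; -s keeps the input
#    order among lines of equal length.
# 3. cut strips the length prefix again.
awk '{ print length($0), $0 }' hashes.txt \
    | sort -n -s -k1,1 \
    | cut -d' ' -f2- > sorted_by_length.txt

cat sorted_by_length.txt
```

This groups the hashes by length (32, then 40, then 64 characters) in one output file rather than splitting them into separate files.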
 
Old 08-17-2020, 03:08 AM   #3
blason
Member
 
Registered: Feb 2016
Posts: 114

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Guttorm
Hi

Wouldn't simply sorting the file by line length work?

https://stackoverflow.com/questions/...cluding-spaces
Yes, awk would help, but my query is about reading the file in a single iteration and sorting the hashes in that pass, which is where I am clueless.
 
Old 08-17-2020, 03:09 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 19,097

Rep: Reputation: 3325
Quote:
Originally Posted by blason
Running 3 iterations (one for MD5, then one for SHA1, and so on) would probably work, but then I would have to read the file 3 times, which I want to avoid.
Why do you care?
Given the amount of memory in machines these days, the file (unless really large) will likely remain resident in page cache. There will be no I/O cost in re-reading the file.
 
Old 08-17-2020, 03:25 AM   #5
blason
Member
 
Registered: Feb 2016
Posts: 114

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00
Why do you care?
Given the amount of memory in machines these days, the file (unless really large) will likely remain resident in page cache. There will be no I/O cost in re-reading the file.
There are around 46k+ entries that I want to sort. That means 46k x 3 lines read, so I am wondering whether there is any way to complete it in a single iteration.
 
Old 08-17-2020, 03:25 AM   #6
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 5,059
Blog Entries: 3

Rep: Reputation: 2521
As mentioned, it is probably cached after the first pass but you could try with AWK or Perl. Here is one possible way with AWK:

Code:
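#!/usr/bin/awk -f

# \> (end of word) is a GNU awk extension; it keeps a longer hash
# from matching one of the shorter patterns.
# 32 characters: MD5
/^([a-zA-Z0-9]{32})\>/ { print >> "./b32.txt"; next; }

# 40 characters: SHA1
/^([a-zA-Z0-9]{40})\>/ { print >> "./b40.txt"; next; }

# 64 characters: SHA256
/^([a-zA-Z0-9]{64})\>/ { print >> "./b64.txt"; next; }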
Or if the hash does not always occur at the start of a line, use \< in place of ^ there. There might be some different behaviors for the different variants of AWK though. I haven't checked.
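For AWK variants without \> or interval expressions, a rough alternative is to test the line length instead of a regex. This is only a sketch: it assumes one hash per line, and the file names hashes.txt, md5.txt, sha1.txt and sha256.txt are made up for illustration. The last input line is an invented non-hash line to show that it is skipped:

```shell
#!/bin/sh
# Sample input from the first post, plus one invented non-hash line.
cat > hashes.txt <<'EOF'
6cc9625971accf65e2cf2c03ebf47260939a39f50a7e12d0ed31b4b2f59aaba6
6ccd25b1b194da5e637f35db48bf2d4c44adad1af31162e02b02ffff087b27dd
7B55EDB28DEE19EB0CDAFBF7A306F9153DE00900
81B523E430B9968631D5453A4625FDE9
A3729FB27CAE5EF165BD04E879979DEB478B53EEA29D343D5165FA14CBF16ADF
this line is not a hash
EOF

# Single pass over the file: each line is read once and routed by length.
awk '
    /[^0-9a-fA-F]/   { next }                          # skip lines with any non-hex character
    length($0) == 32 { print > "md5.txt";    next }    # MD5 is 32 hex digits
    length($0) == 40 { print > "sha1.txt";   next }    # SHA1 is 40 hex digits
    length($0) == 64 { print > "sha256.txt"; next }    # SHA256 is 64 hex digits
' hashes.txt
```

Unlike the patterns above, this only accepts hex digits, so an arbitrary 32-character alphanumeric word would not be filed as an MD5 hash.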
 
Old 08-20-2020, 12:00 AM   #7
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,454

Rep: Reputation: 664
Let awk print each record to a file named after the record's length:
Code:
awk '{ print > (length($0) ".txt") }' inputfile
The > overwrites an existing file. (A >> like in the previous post would append to it.)
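As an illustration, here is that one-liner run against the sample from the first post, with one assumed tweak: a length filter so that stray lines of other lengths do not each create their own file. The input name hashes.txt is made up; the output names 32.txt, 40.txt and 64.txt follow from the command itself:

```shell
#!/bin/sh
# Sample input copied from the first post; the file name is an assumption.
cat > hashes.txt <<'EOF'
6cc9625971accf65e2cf2c03ebf47260939a39f50a7e12d0ed31b4b2f59aaba6
6ccd25b1b194da5e637f35db48bf2d4c44adad1af31162e02b02ffff087b27dd
7B55EDB28DEE19EB0CDAFBF7A306F9153DE00900
81B523E430B9968631D5453A4625FDE9
A3729FB27CAE5EF165BD04E879979DEB478B53EEA29D343D5165FA14CBF16ADF
EOF

# Only lines of length 32, 40 or 64 are written, each to <length>.txt.
# Inside awk, > truncates the file when it is first opened and then
# keeps writing to it for the rest of the run.
awk 'length($0) == 32 || length($0) == 40 || length($0) == 64 {
    print > (length($0) ".txt")
}' hashes.txt

# Show how many hashes ended up in each file.
wc -l 32.txt 40.txt 64.txt
```

With this sample, 32.txt and 40.txt each receive one hash and 64.txt receives three.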
 
2 members found this post helpful.
Old 08-25-2020, 11:33 PM   #8
blason
Member
 
Registered: Feb 2016
Posts: 114

Original Poster
Rep: Reputation: Disabled
Thanks a lot, guys, for your valuable input. I have finally decided to go with 2 iterations: drop MD5 and keep only SHA1 and SHA256.
 
  



Tags
awk

