Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.


Old 03-10-2009, 01:57 AM   #1
Registered: Apr 2007
Location: India
Distribution: Ubuntu 10.04, RHEL/Centos 5.x, Knoppix
Posts: 41

Rep: Reputation: 3
Using grep to filter phrase before a space

Ok, I downloaded the latest KNOPPIX CD, and thought it would be cool to run md5sum on it, even though BitTorrent already verifies the checksums.

So after running md5sum and saving the output to "sum1", I do a
diff KNO*.md5 sum1
and the two files aren't the same, simply because the md5 checksum file that comes with Knoppix looks like this:
d642d524dd2187834a418710001bbf82 *KNOPPIX_V6.0.1CD-2009-02-08-EN.iso
and "sum1" looks like this:
d642d524dd2187834a418710001bbf82 KNOPPIX_V6.0.1CD-2009-02-08-EN.iso
Notice the missing asterisk.

So how do I use grep to take only the data before the first space, so that in both cases it picks up just the checksum value, and not the *KNOP.. or KNOP..?

It obviously isn't a critically important question, I'm just learning my way around the commands.. So just playing around to see what's possible.. Can anyone help?
Old 03-10-2009, 02:24 AM   #2
LQ Veteran
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,940

Rep: Reputation: 1200
Have you considered adding the asterisk to sum1 (given that you created that file)? It saves creating new files to diff, and the regex for that using sed will be much simpler.
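For anyone following along, here is a minimal sketch of the sed approach suggested above. The two sample lines are taken from the first post; the scratch directory and the name sum1.fixed are made up for illustration, and the pattern assumes the hash is followed by one or more spaces.

```shell
# Scratch directory so nothing real is touched (hypothetical setup).
workdir=$(mktemp -d)

# Recreate the two files from the first post.
printf '%s\n' 'd642d524dd2187834a418710001bbf82 *KNOPPIX_V6.0.1CD-2009-02-08-EN.iso' > "$workdir/KNOPPIX.md5"
printf '%s\n' 'd642d524dd2187834a418710001bbf82  KNOPPIX_V6.0.1CD-2009-02-08-EN.iso' > "$workdir/sum1"

# Replace the run of spaces after the hash with " *" (the binary-mode marker).
sed 's/^\([0-9a-f]*\)  */\1 */' "$workdir/sum1" > "$workdir/sum1.fixed"

# diff prints nothing and exits 0 when the files are identical.
diff "$workdir/KNOPPIX.md5" "$workdir/sum1.fixed" && echo "checksums match"
```

Writing to a new file rather than editing sum1 in place keeps the original output around in case the pattern needs adjusting.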
Old 03-10-2009, 02:27 AM   #3
Registered: Feb 2006
Distribution: Fedora
Posts: 341
Blog Entries: 3

Rep: Reputation: 38
The asterisk is there because md5sum was run with the '-b' (binary mode) argument. Since you already have the md5 checksum file, you can verify the ISO file directly by running
md5sum -c KNO*md5
And it should give you something like KNOPPIX_V6.0.1CD-2009-02-08-EN.iso: OK
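A small sketch of that round trip, assuming GNU md5sum. The file sample.iso is just a stand-in text file created for the demo, not a real image, and LC_ALL=C is set only so the status message stays in English.

```shell
# Scratch directory and stand-in file; a real ISO would be used in practice.
workdir=$(mktemp -d)
printf 'not a real ISO image\n' > "$workdir/sample.iso"

# Create a checksum file in binary mode (-b), which writes the "*" marker.
(cd "$workdir" && md5sum -b sample.iso > sample.md5)

# Verify: md5sum -c reads each "hash *name" line and re-hashes the named file.
check_output=$(cd "$workdir" && LC_ALL=C md5sum -c sample.md5)
echo "$check_output"
```

The check has to run from the directory holding the file, because the checksum file stores the name relative to where md5sum was originally run.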
Old 03-10-2009, 02:53 AM   #4
Registered: Sep 2005
Location: Sri Lanka
Distribution: Fedora (workstations), CentOS (servers), Arch, Mint, Ubuntu, and a few more.
Posts: 441

Rep: Reputation: 40
You can use the cut command to get only the text you want from the files. However, that alone will not solve your problem, as diff only takes files as parameters (or STDIN, via the "-" parameter).

$ cut -d" " -f1 KNO*.md5

The above command will give you the output of d642d524dd2187834a418710001bbf82

What it does is take the first column (-f1) of the file (KNO*.md5), where the column delimiter (-d) is a space (" "). The command could alternatively be written $ cut -d\  -f1 KNO*.md5 (notice the extra space after the "\": the first space is the escaped delimiter, the second separates the arguments).
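One way the cut output could still be fed to diff is through the "-" (STDIN) parameter. This is only a sketch: the sample lines come from the first post, while the scratch directory and the name expected.hash are invented for the example.

```shell
# Scratch directory with the two files from the first post (hypothetical setup).
workdir=$(mktemp -d)
printf '%s\n' 'd642d524dd2187834a418710001bbf82 *KNOPPIX_V6.0.1CD-2009-02-08-EN.iso' > "$workdir/KNOPPIX.md5"
printf '%s\n' 'd642d524dd2187834a418710001bbf82  KNOPPIX_V6.0.1CD-2009-02-08-EN.iso' > "$workdir/sum1"

# Keep only the first space-delimited field (the hash) of the shipped file.
cut -d' ' -f1 "$workdir/KNOPPIX.md5" > "$workdir/expected.hash"

# Pipe the hash from sum1 into diff; "-" tells diff to read STDIN.
if cut -d' ' -f1 "$workdir/sum1" | diff "$workdir/expected.hash" -; then
    result="hashes match"
else
    result="hashes differ"
fi
echo "$result"
```

Only one side can come from STDIN this way, so the other hash still has to be written to a file first.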

If you want to do more stuff with this, better to look into awk too.
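For comparison, a rough sketch of the same extraction with awk, and with grep as the original question asked. The sample line is from the first post; note that -o is a GNU/BSD grep extension rather than POSIX, so treat that part as an assumption about your grep.

```shell
line='d642d524dd2187834a418710001bbf82 *KNOPPIX_V6.0.1CD-2009-02-08-EN.iso'

# awk splits on runs of whitespace by default; $1 is the first field.
hash_awk=$(printf '%s\n' "$line" | awk '{ print $1 }')

# grep -o prints only the matched text: from line start up to the first space.
hash_grep=$(printf '%s\n' "$line" | grep -o '^[^ ]*')

echo "$hash_awk"
echo "$hash_grep"
```

Both forms behave the same whether the hash is followed by " *" or by plain spaces, which is what makes them handy for comparing the two files.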
Old 03-12-2009, 12:48 PM   #5
Registered: Apr 2007
Location: India
Distribution: Ubuntu 10.04, RHEL/Centos 5.x, Knoppix
Posts: 41

Original Poster
Rep: Reputation: 3
syg00! Yes, I know I can manually add the asterisk! hehehe.. I'm just learning to manipulate the commands in linux, so wanted to know if it's possible to get a perfect diff without changing it manually..

I didn't know md5sum would check it automatically.. Thanks! I thought I would always have to at least visibly compare checksums.. Makes life much easier.. Who says linux isn't user-friendly??

Yes! That's exactly what I wanted! Thank you! And yes, you're right.. diff needs two file names as parameters.. thanks for pointing that out. I shall look into the 'cut' command, though. Looks very interesting.. I'm guessing in "cut -d\  -f1 KNO*.md5", the backslash is the escape character. I think 'grep' and 'cut' can work quite well with each other..
I started learning python mainly for scripting purposes, but I realised it's much more powerful than I thought.. I'll finish that first, but I'll definitely look into awk and sed...
Old 03-13-2009, 01:20 AM   #6
Registered: Sep 2005
Location: Sri Lanka
Distribution: Fedora (workstations), CentOS (servers), Arch, Mint, Ubuntu, and a few more.
Posts: 441

Rep: Reputation: 40
@ShanxT, yes, the backslash is the escape character. As for sed and awk, it'd be great to learn them. I never got to learn sed and awk much, since I switched to Ruby when my scripting needs got complicated. I'm a happy camper, as there are an awful lot of SysAdmin tools in Ruby (probably the best collection). Glad to know you are learning Python too.


cut, diff
