LinuxQuestions.org
Old 05-04-2009, 05:36 PM   #1
adelaide98
LQ Newbie
 
Registered: May 2009
Posts: 13

Rep: Reputation: 0
script with increasing variable number


Hello,

I am trying to write a script that greps a number sequentially. I got it to work in a script using the following:

Code:
#!/bin/sh
for i in 37 38 39 179 180 181
do
     echo -n $i" "
     grep -c "THE NUMBER $i " myfilename.tbl
done
The output is:
37 3
38 4
39 8
180 10
181 8

Or, the number of times that the number 37 followed the text "THE NUMBER" in my big table, etc. for each of the 5 numbers I defined in the for statement.

However, what I'd really like to do is make a table with all numbers from 37-181. I attempted to do this with a while statement, but it didn't work.

Code:
#!/bin/sh
set i = 37
while [$i -le 182]
     echo -n $i" "
     grep -c "THE NUMBER $i " myfilename.tbl
     @_i++                                       # this is what I was
                                                 # told would advance
                                                 # count of "i"...is 
                                                 # it right?
end
So, does anyone have any suggestions on how to make my "while" command work, or another way to do this? Otherwise, I guess I will just type out all the numbers from 37 to 181 manually...

Thanks in advance for your help. This is my first post.

-Andrea
 
Old 05-04-2009, 06:28 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
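Nope. The increment you were given is C-shell syntax and should be written @ i++ (with a space after the @, not an underscore); it does not work in /bin/sh. In the Bourne shell /bin/sh you can increment the value of the variable using expr, taking care to leave no spaces around the = sign:
Code:
i=`expr $i + 1`
A Bourne shell while loop also needs spaces inside the test brackets and is delimited by do ... done rather than end. Alternatively, you can use the seq command to generate a sequence of numbers: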
Code:
for i in `seq 37 181`
do
  echo -n $i" "
  grep -c "THE NUMBER $i " myfilename.tbl
done
If you use the Bourne Again shell /bin/bash, you can also try the extended brace expansion:
Code:
for i in {37..181}
do
  echo -n $i" "
  grep -c "THE NUMBER $i " myfilename.tbl
done
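For readers who only have a plain /bin/sh and no seq, here is a minimal sketch of the while loop from the first post rewritten in Bourne shell syntax, using the expr increment described above. The function name count_range and its argument order are made up for this illustration; the search text is the one from the thread.

```shell
#!/bin/sh
# count_range FILE FIRST LAST   (hypothetical helper, not a standard command)
# Prints "N COUNT" for every N from FIRST to LAST, where COUNT is the number
# of lines in FILE that contain "THE NUMBER N ".
count_range() {
    i=$2
    while [ "$i" -le "$3" ]          # spaces inside [ ] are required
    do
        # grep -c prints the number of matching lines (0 if there are none)
        printf '%s %s\n' "$i" "$(grep -c "THE NUMBER $i " "$1")"
        i=`expr "$i" + 1`            # no spaces around "=" in sh
    done
}
```

Calling count_range myfilename.tbl 37 181 should print one "number count" line per value, including lines with a count of 0.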
 
Old 05-05-2009, 09:19 AM   #3
adelaide98
LQ Newbie
 
Registered: May 2009
Posts: 13

Original Poster
Rep: Reputation: 0
sh and bash solutions worked!!

Hi Colucix,

Thanks very much for your prompt reply!

The while statement I had still didn't work.
In csh w/
Code:
@_i++
the error was "while: Expression Syntax"
In sh w/
Code:
i = `expr $i + 1`
the error was "syntax error: unexpected end of file"

The sh solution gave me the error: "count.com: line 16: seq: command not found" (I thought it worked, then tried to do it again, and could not make it work. So I guess it doesn't work ?!)

The bash solution worked!

All of this underscored for me that the shells are very different! I had received the while script from someone else, then had googled and found the for...do...done solution, but had not paid attention to which shell it was in. So thanks very very much for all of the different solutions!

I am embarking on migrating away from Excel to manage a large data table that must have many manipulations made to it before putting the revised and cleaned data table into another program. Each time I manipulate the table, it takes multiple steps that must be repeated every time I add to my data (takes 1-3 days, and I'm actually good at Excel). So for a programming language to learn, I'd like to be able to manipulate columns of data easily. One problem will be:

Column A Column B
L M
M L
P Q
Q R

I'll need to be able to sort through and delete ML because it is really the reciprocal of LM. Or perhaps have a script to notify myself that PQ does not in fact have a reciprocal QP present.

I have previously been told that either perl (b/c it is good at manipulating data) or python (b/c some of the other programs I'm using are written in python) might be good to learn. But now I'm thinking that shell scripting might be the way to go because I do know some of the commands already (sort of) and with your help just wrote a working script!

So, if you're still reading this, my question to you is which programming language do you think would be most helpful for my needs, and if you think that shell scripting is adequate, then which? (sh, csh, bash, zsh, ksh, did I get them all??) By the way, I'm on a MacBookPro running OSX-Leopard.

Thanks again,
Andrea

Last edited by adelaide98; 05-05-2009 at 09:36 AM. Reason: the sh solution actually didn't work (or at least it doesn't anymore).
 
Old 05-05-2009, 10:03 AM   #4
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Good effort on that bash script, but it is a bit inefficient. Looping over the range 37 to 181 calls grep 181-37+1=145 times, and each call rereads the whole file. You can avoid the for loop, because grep already loops over the lines of a file by itself.
Code:
grep -E "THE NUMBER (.....)" mytable
where the (....) part looks something like (37|38|39|....|181) (regular expression). I leave it to you to figure out if you are interested.
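One way to get a two-column table without a shell loop is to let grep pull out every match in a single pass and then count the matches. This is only a sketch under a few assumptions: count_in_one_pass is a made-up name, it relies on the -o option (available in GNU and BSD grep, but not required by POSIX), and it counts occurrences rather than matching lines, so the numbers can differ from grep -c if a number appears twice on one line. Numbers with no matches are left out of the table.

```shell
#!/bin/sh
# count_in_one_pass FILE FIRST LAST   (hypothetical helper)
# Reads FILE once and prints "N COUNT" for each N in FIRST..LAST that occurs.
count_in_one_pass() {
    grep -oE 'THE NUMBER [0-9]+ ' "$1" |   # one matched string per output line
        awk '{ print $3 }' |               # keep only the number
        sort -n | uniq -c |                # "COUNT N" for each distinct number
        awk -v lo="$2" -v hi="$3" '$2 >= lo && $2 <= hi { print $2, $1 }'
}
```

For example, count_in_one_pass mytable 37 181 reads the file a single time instead of 145 times.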

Alternatively, since you know Python, here's a Python script
Code:
#!/usr/bin/env python
import re
f={}
r = '|'.join( map(str,range(37,182) ))
print "THE NUMBER (%s)" % r
pat = re.compile("THE NUMBER (%s)" % r,re.M|re.DOTALL)
data=open("mytable").read()
for i in pat.findall(data):
    f.setdefault(i,0)
    f[i]+=1
print f
 
Old 05-05-2009, 10:18 AM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Your script in C-shell should be something like
Code:
#!/bin/csh
set i = 37
while ( $i <= 182 )
     echo -n "$i "
     grep -c "THE NUMBER $i " myfilename.tbl
     @ i++
end
Regarding the problem of finding the reciprocal of each string, you can try the rev command. For example, using /bin/bash you can try something like this (just a template):
Code:
#!/bin/bash
while read -r line
do
  # -x matches whole lines only, -F treats the reversed text as a fixed string
  if grep -qxF -- "$(echo "$line" | rev)" testfile
  then
    echo "$line has reverse"
  else
    echo "$line has no reverse"
  fi
done < testfile
If your system has bash installed, you can stick with it, unless you want to port the script to other systems that do not have bash. In that case, keep to strict /bin/sh syntax.

Last edited by colucix; 05-05-2009 at 10:19 AM.
 
Old 05-05-2009, 06:48 PM   #6
adelaide98
LQ Newbie
 
Registered: May 2009
Posts: 13

Original Poster
Rep: Reputation: 0
Thanks for all your help, ghostdog74 and colucix!

ghostdog74, I don't quite understand your grep -E command because without some kind of loop, how would I end up with a 2 column output (first column = the number I'm looking for, second column is a count of the number of occurrences). Since I have to have a loop in order to create the first column, why not use that same loop to input the number that is going to be counted?

Thanks though for the python script. I tried it and have not quite gotten it working yet (because my actual data table contains things a little more complicated than "MY NUMBER", and I'm still trying to get all the expressions and syntax correct). I do observe that to make python do the same thing as a csh or bash script, it takes a lot more code. I actually don't know python very well, it's just the language that several other programs that I'm using are written in.

colucix, thanks for the csh while script. It was satisfying to finally get that solution to work (very small syntax problems stop everything!). Thanks also for your rev script. I look forward to trying it.

This has been a good learning experience. I am now firmly committed to ditching Excel for this phase of my project and to learning bash. Seems like it will be an easier learning curve than python or perl and may serve all my needs.

I really appreciate the expertise here!

Andrea
 
Old 05-05-2009, 07:39 PM   #7
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
for bash these are good tutorials:
http://tldp.org/LDP/Bash-Beginners-G...tml/index.html
http://www.tldp.org/LDP/abs/html/ (advanced guide)

In re Perl or Python, they are definitely more powerful/sophisticated langs and you may well find yourself needing them eventually.
Personally I prefer Perl (http://perldoc.perl.org/), but since you are using some Python stuff already, might make more sense to pick that one when you get to that point.
 
Old 05-05-2009, 08:58 PM   #8
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by adelaide98 View Post

ghostdog74, I don't quite understand your grep -E command because without some kind of loop, how would I end up with a 2 column output (first column = the number I'm looking for, second column is a count of the number of occurrences). Since I have to have a loop in order to create the first column, why not use that same loop to input the number that is going to be counted?
please check the man page of grep and look for -E option. I don't know what kind of output you want, so if it doesn't suit your needs, then don't use the solution i posted.

Quote:
Thanks though for the python script. I tried it and have not quite gotten it working yet (because my actual data table contains things a little more complicated than "MY NUMBER", and I'm still trying to get all the expressions and syntax correct).
then why didn't you post a correct sample of the file? As for the script i posted, this is the output
Code:
# more file
this is THE NUMBER 37 the end
this is THE NUMBER 29 the end
this is THE NUMBER 1 the end
this is THE NUMBER 100 the end
this is THE NUMBER 181 the end
this is THE NUMBER 8 the end
this is THE NUMBER 37 the end
this is THE NUMBER 29 the end
this is THE NUMBER 1 the end
this is THE NUMBER 34 the end
this is THE NUMBER 181 the end
this is THE NUMBER 8 the end
this is THE NUMBER 37 the end
this is THE NUMBER 29 the end
this is THE NUMBER 1 the end
this is THE NUMBER 100 the end
this is THE NUMBER 181 the end
this is THE NUMBER 8 the end
output:
Code:
# ./test.py
{'100': 2, '181': 3, '37': 3}
There are 2 occurrences of 100, 3 of 181 and 3 of 37.

Quote:
I do observe that to make python do the same thing as a csh or bash script, it takes a lot more code.
Does it matter? You are not entering an obfuscation contest or a "who has the fewest lines of code" contest, right? What's more important is that you understand the code you are maintaining.

Last edited by ghostdog74; 05-05-2009 at 09:05 PM.
 
Old 05-05-2009, 11:55 PM   #9
mpiekarski
LQ Newbie
 
Registered: May 2009
Location: Newark, DE
Distribution: Gentoo,ubuntu,rhel
Posts: 25

Rep: Reputation: 16
In order to treat variables as numbers rather than strings in bash, you have to wrap the expression in arithmetic expansion, $(( )).

For example, the following script:

#!/bin/bash
X=5
Y=10
Z=7
VALUE=$((X*Y+Z))
echo $VALUE

This will print 57. There is nothing wrong with trying to learn new things or doing something in a different language even if you know another... that's actually the only way to learn :P
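Arithmetic expansion also offers a way around the "seq: command not found" error reported earlier in the thread. The sketch below is a hypothetical stand-in for seq (the name int_range is not a standard command) that should work in any POSIX-style sh as well as in bash.

```shell
#!/bin/sh
# int_range FIRST LAST   (hypothetical replacement for "seq FIRST LAST")
# Prints the integers FIRST..LAST, one per line.
int_range() {
    n=$1
    while [ "$n" -le "$2" ]
    do
        echo "$n"
        n=$((n + 1))     # arithmetic expansion treats n as a number
    done
}
```

It can then be used in the same way as seq in the earlier loop, for example: for i in $(int_range 37 181); do ...; done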

------------------------------------
Michael Piekarski
Network Engineer
mpiekarski@hostmysite.com
www.hostmysite.com
 
Old 05-11-2009, 02:25 PM   #10
adelaide98
LQ Newbie
 
Registered: May 2009
Posts: 13

Original Poster
Rep: Reputation: 0
more on identifying reciprocals within 2 columns

Thanks to everyone for your help.
My original description of my reciprocal search was incomplete. My columns actually contain whole strings rather than single letters:

Code:
%: cat animals
column A  column B
dog       cat
hamster   dog
cat       dog
wolf      pig
pig       hamster
dog       hamster
in this example, i would want to mark that "dog cat", "cat dog", "hamster dog" and "dog hamster" have reciprocals, and ideally be able to form a new list that deletes one of each pair. (e.g. deletes the second occurrence of each pair, or "cat dog" and "dog hamster")

The rev command suggested by colucix would not work in my case because the whole line is reversed (i.e. dog cat → tac god). I have managed to make a list of the reciprocals that are in my file using the following two commands:

Code:
#!/bin/bash
awk '{print $2,$1 > "reverseanimals"}' animals
grep -xf reverseanimals animals > recips_only
I have some questions:

1. Any thoughts on how to use a rev command or something similar to accomplish the labeling of each line as colucix did for the letters example?
2. I was thinking that something like the rev command could work if the whole animal name (called a “string”, right?) were considered a single character…any way to do that?
3. Any thoughts on how I might delete the second occurrence of each reciprocal?
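For questions 1 and 2, one possible approach is to swap the two fields instead of reversing the characters, which awk can do directly. The following is a sketch rather than a solution tested on the real data: label_reciprocals is a made-up name, and it assumes two whitespace-separated fields per line. A header line such as "column A  column B" would be treated as data and its first two words labeled like any other pair.

```shell
#!/bin/sh
# label_reciprocals FILE   (hypothetical helper)
# Prints every pair followed by a note saying whether the swapped pair
# also appears somewhere in FILE.
label_reciprocals() {
    # FILE is named twice so awk reads it twice:
    # pass 1 (NR == FNR) records every pair, pass 2 labels each line.
    awk '
        NF < 2    { next }                       # skip blank or short lines
        NR == FNR { seen[$1, $2] = 1; next }     # pass 1: remember the pair
        {
            if (($2, $1) in seen) {
                label = "has reverse"
            } else {
                label = "has no reverse"
            }
            print $1, $2 ": " label
        }
    ' "$1" "$1"
}
```

With the animals example above (minus the header), "wolf pig" and "pig hamster" should be reported as having no reverse and the other four lines as having one.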

Thanks very much. I’m learning a lot!
Andrea
 
Old 05-11-2009, 10:44 PM   #11
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
Perl:
Code:
#!/usr/bin/perl -w
use strict;             # Enforce declarations

my (
    %t_hash, $key, $var1, $var2,
    $file, $rec,
    %txt_pairs
   );

$file="test.txt";
open( TXT_FILE, "<$file" ) or
            die "Can't open txt file: $file: $!\n";
while ( defined ( $rec = <TXT_FILE> ) )
{
    # Remove unwanted chars
    chomp $rec;                 # newline
    $rec =~ s/^\s+//;           # leading whitespace
    $rec =~ s/\s+$//;           # trailing whitespace

    next unless length($rec);   # anything left?

    # Split 'key value' string
    ($var1, $var2) = split( /\s+/, $rec, 2);

    # Store the pair in the hash under a 'first:second' key
    $txt_pairs{$var1.':'.$var2} = 1;
}
close(TXT_FILE) or
            die "Can't close txt file: $file: $!\n";
%t_hash = %txt_pairs;
for $key (keys %txt_pairs )
{
    ($var1, $var2) = split( /:/, $key, 2);
    if( exists($t_hash{$var2.':'.$var1}) )
    {
        print "matched $key\n";
        delete($t_hash{$key});
    }
}
print "\n";

for $key ( keys %t_hash)
{
    ($var1, $var2) = split( /:/, $key, 2);
    print "$var1 $var2\n";
}
Output
Code:
matched hamster:dog
matched cat:dog

wolf pig
pig hamster
dog cat
dog hamster
 
Old 05-11-2009, 11:23 PM   #12
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by adelaide98 View Post
Thanks to everyone for your help.
My original description of my reciprocal search was incomplete. My columns actually contain whole strings rather than single letters:

Code:
%: cat animals
column A  column B
dog       cat
hamster   dog
cat       dog
wolf      pig
pig       hamster
dog       hamster
in this example, i would want to mark that "dog cat", "cat dog", "hamster dog" and "dog hamster" have reciprocals, and ideally be able to form a new list that deletes one of each pair. (e.g. deletes the second occurrence of each pair, or "cat dog" and "dog hamster")
alternative in Python
Code:
#!/usr/bin/env python
d={}
for line in open("file"):
    line=line.strip()    
    a,b=line.split()
    d.setdefault(a+b,"")   
    if b+a in d.keys():
        print "reciprocals: %s %s, %s %s"%(a,b, b,a)
    else:
        print line
output:
Code:
# ./test.py
dog       cat
hamster   dog
reciprocals: cat dog, dog cat
wolf      pig
pig       hamster
reciprocals: dog hamster, hamster dog
if you want to do it in awk, same concept. put the lines in associative arrays
Code:
awk '{
  if( $2$1 in animals ){ print $2" "$1 }
  else{ animals[$1$2]="" }
}
END{
  for(i in animals){ print "i: ",i }
}' file
 
Old 05-12-2009, 03:02 PM   #13
adelaide98
LQ Newbie
 
Registered: May 2009
Posts: 13

Original Poster
Rep: Reputation: 0
Thank you for the perl, python, and awk answers!!
I have been playing with the awk solution because that's what I've been playing with already for the past week. I moved some {} around, and here's what I got:

Code:
#!/bin/bash
awk '{
        if ($2$1 in animals)
                {print $2,$1}
        else {animals [$1$2]=""}}
END {
        for(i in animals) {print "i: ",i }
}' animals
Here's the output:
Code:
%animal.com
dog cat
hamster dog
i:  dogcat
i:  pighamster
i:  wolfpig
i:  columnAcolumnB
i:  hamsterdog
I've been reading all about associative arrays on the internet, but I find myself getting confused, and I do have some questions:

1. The current output prints out the whole index (did I get that right?) without a space in between the 2 strings. I would love to have the output with a space between. Any way to do this?

2. Just to verify: I previously had a file called "animals" with the pairs of animals in it. Is it correct that the array does not become defined until the else statement? (Seems weird to me because the "else statement" does not appear until after the "if statement", but this is how others seem to have done it, and I wanted to make sure I understand all the pieces.)

3. I don't really understand how the extra "reciprocals" get deleted. (which is exactly what I was hoping for, but I don't understand).

4. If I substitute anything between the quotes in the else statement, it does not seem to affect the script at all. Therefore what is this for? e.g.

Code:
else {animals [$1$2]=""}
else {animals [$1$2]="goose"}
else {animals [$1$2]="55"}
all have no effect on the output of this script.

5. Maybe I'm really close, or maybe I'm missing the points entirely. Here is my interpretation of the script:
The "else statement" defines the array as having $1$2 be the index, I guess there are no elements.

The "if statement" looks to see if any of the index values are actually $2$1, in which case they are printed out (with a space in between).

The "for statement" defines i as a variable in animals, and then prints them line by line. ---> so how does the for statement tell that some of the lines will not get printed?
Many many thanks in advance!
 
Old 05-12-2009, 07:31 PM   #14
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,240

Rep: Reputation: 2324
Well, I'm no awk expert, but here's a good tutorial/guide: http://www.grymoire.com/Unix/Awk.html
 
Old 05-12-2009, 08:53 PM   #15
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by adelaide98 View Post

1. The current output prints out the whole index (did I get that right?) without a space in between the 2 strings. I would love to have the output with a space between. Any way to do this?
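add a space string between $1 and $2 when you build the index, i.e. animals[$1" "$2], and test $2" "$1 in the if so that the lookup still matches.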

Quote:
2. Just to verify: I previously had a file called "animals" with the pairs of animals in it. Is it correct that the array does not become defined until the else statement?
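in a way, yes. awk has no declarations: the array comes into existence the first time an element is assigned, which here happens in the else branch. The "in" test in the if only looks up an index and does not create anything.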

Quote:
3. I don't really understand how the extra "reciprocals" get deleted. (which is exactly what I was hoping for, but I don't understand).
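because the script uses an associative array with $1$2 as the index. When a line's swapped index ($2$1) is already in the array, the if branch prints the line and the else branch never stores it, so only the first member of each pair ends up in the array. Lines with exactly the same index are stored again and simply "overwritten".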

Quote:
4. If I substitute anything between the quotes in the else statement, it does not seem to affect the script at all. Therefore what is this for? e.g.

Code:
else {animals [$1$2]=""}
else {animals [$1$2]="goose"}
else {animals [$1$2]="55"}
the values on the right-hand side only matter if you want to use them later. The script I posted only uses the index, so any value will do.


I'm not much of an "explainer", so for an awk guide, see the GNU gawk user manual.
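Putting the answers above together, here is a minimal sketch that keeps the first line of each reciprocal pair, drops the second, and prints the two strings with a space between them. drop_reciprocals is a made-up name. The comma inside the array subscript is standard awk and joins the two fields with the built-in SUBSEP character, so pairs such as "dog cat" and "dogc at" cannot collide the way plain $1$2 concatenation could.

```shell
#!/bin/sh
# drop_reciprocals FILE   (hypothetical helper)
# Prints each pair unless its swapped form was already printed earlier.
drop_reciprocals() {
    awk '
        NF < 2 { next }                   # skip blank or short lines
        {
            if (($2, $1) in seen) next    # swapped pair already kept: drop this line
            seen[$1, $2] = 1              # remember this pair (the value is unused)
            print $1, $2
        }
    ' "$1"
}
```

Exact duplicate lines are not removed by this sketch; only the second member of a swapped pair is. For the animals example it should leave "dog cat", "hamster dog", "wolf pig" and "pig hamster".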
 
  


All times are GMT -5.
