LinuxQuestions.org
Old 10-06-2004, 08:16 AM   #1
cmfarley19
Member
 
Registered: Nov 2002
Location: Central VA
Distribution: Ubuntu/Debian
Posts: 228

Rep: Reputation: 32
Need help with perl/bash script to parse PicBasic file


I have recently been assigned to a project that uses a PIC programmed with PicBasic. The code is nearly 11,000 lines long and is poorly written and structured. I am trying to dissect the file to get a better understanding of the code.
I performed the following operation to get all line labels and subroutine labels with their corresponding line numbers and put them in their own file:
Code:
[cfarley@wombat hercules]$ cat herc.bas |grep -n :|grep -v -i lcd > subs.txt
That file looks like this:
Code:
245:START:
266:PU_LOOP:
287:WAIT_STEP_HOME:
292:LOAD_INFOPAC:
303:GET_REMAINING_COATS:
349:GET_REMAINING_WASHES:
422:CNT_LOAD:
452:DISPLAY_ERROR_CODE:
471:GOOD_STATUS:
492:DISPLAY_M:
499:NXT_LP1:
...
...
...
10805:getkeyp:
10818:gotkey: ' Change row and column to key number 0 - 15
10833:LOOK_KEY:
10862:HOME_SW:
10865:NEXT_SW:
10868:ALTER_SW:
10871:MENU_SW:
10875:START_SW:
10878:STOP_SW:
10881:UP_SW:
10884:DOWN_SW:
10887:CLear_home:
What I would like to do next is sort that list in alphabetical order by label name (case insensitive).

Any tips?
 
Old 10-06-2004, 08:33 AM   #2
jmings
Member
 
Registered: Sep 2004
Location: Hemet,California, USA
Distribution: SimplyMEPIS
Posts: 31

Rep: Reputation: 16
See the man page on sort, but what you want is:
wizard@2[~]$ vi test.dat
wizard@2[~]$ sort -f -i -t: +1 test.dat
10868:ALTER_SW:
10887:CLear_home:
422:CNT_LOAD:
452:DISPLAY_ERROR_CODE:
492:DISPLAY_M:
10884:DOWN_SW:
10805:getkeyp:
303:GET_REMAINING_COATS:
349:GET_REMAINING_WASHES:
471:GOOD_STATUS:
10818:gotkey: ' Change row and column to key number 0 - 15
10862:HOME_SW:
292:LOAD_INFOPAC:
10833:LOOK_KEY:
10871:MENU_SW:
10865:NEXT_SW:
499:NXT_LP1:
266:PU_LOOP:
245:START:
10875:START_SW:
10878:STOP_SW:
10881:UP_SW:
287:WAIT_STEP_HOME:
wizard@2[~]$

If you want to replace the original file, add "-o test.dat" to the command.
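On newer versions of sort the "+1" field syntax is obsolete and may be rejected; "-k" is the POSIX way to name the sort key. A sketch of the equivalent command, assuming the same colon-separated layout (the temporary sample file is made up from a few lines of the list above):

```shell
# Build a small sample in the same "NNN:LABEL:" layout (hypothetical temp file).
sample=$(mktemp)
printf '%s\n' '245:START:' '10887:CLear_home:' '10805:getkeyp:' \
  '10868:ALTER_SW:' '303:GET_REMAINING_COATS:' >"$sample"

# -t: splits fields on colons, -k2,2 sorts on the label only, -f folds case.
# LC_ALL=C keeps the ordering independent of the local collation rules.
sorted=$(LC_ALL=C sort -f -t: -k2,2 "$sample")
echo "$sorted"
rm -f "$sample"
```

Adding "-o test.dat" works the same way with this form.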
 
Old 10-06-2004, 11:21 AM   #3
cmfarley19
Excellent. That did the job.
Now for my next feat...

I have a file (sub_names_alpha.txt) that contains the list of line numbers and sorted label names. It is in the form of
N LABEL:
Code:
3692 TEMP_TOO_HI:
3695 TEMP_TOO_LOW:
7781 TEST_CHECKSUM:
4924 TEST_CYCLES_CNT:
3704 test_m:
10200 TEST_VACS:  'DEACTIVATE ANY DETECTED VAC STA OR ENABLE ANY NON DETECTED
1826 TM_OUT1:
2406 TM_OUT2:
2985 TM_OUT3:
3564 TM_OUT4:
...
...
...
5288 vacuumb:
5296 VACUUM_LP:
10463 vacuum_on:
5039 versions:
8500 VIBRATE:
8492 VIBRATOR_OFF:
8484 VIBRATOR_ON:
9951 WAIT_CARO_DN:
Note the line with the trailing comment. There are several lines in the file that have trailing comments.
I want to remove all trailing comments.

then...
I have another file (subcalls_alpha.txt) that contains a similar list of all calls to subroutines (GoSub's)
Code:
3734 GoSub VAC4_ON
9023 GoSub VAC4_ON
4720 GoSub VACUUM1_ON
4723 GoSub VACUUM2_ON
4726 GoSub VACUUM3_ON
4729 GoSub VACUUM4_ON
1020 GoSub VIBRATE
3405 GoSub VIBRATE
3712 GoSub VIBRATE
1330 GoSub VIBRATOR_OFF
1537 GoSub VIBRATOR_OFF
1857 GoSub VIBRATOR_OFF
2060 GoSub VIBRATOR_OFF
2382 GoSub VIBRATOR_OFF
I want to look at each sub label in sub_names_alpha.txt and record in a new file the line numbers from which that subroutine is called.
For example, when I find the sub name VIBRATE in sub_names_alpha.txt, I look it up in subcalls_alpha.txt and see that it is called from lines 1020, 3405 and 3712. I then want to write a line in the following format to a new file:

Code:
8500 VIBRATE: 1020 3405 3712
Any hints are appreciated!!!

I can post the entire files if anyone would like. They are pretty big though.

Last edited by cmfarley19; 10-06-2004 at 12:34 PM.
 
Old 10-06-2004, 01:19 PM   #4
cmfarley19
New addition...
I have yet a new file that contains all of the goto calls (goto_calls.txt)
Code:
10597 IF TEMP1.1 = 0 Then GoTo S1_LOW
10599 GoTo DS0
10603 IF TEMP1.0 = 0 Then GoTo S0_LOW
10605 GoTo CK_FOR_ADDR0
10612 GoTo SPEED_LOOP
10769 GoTo  EESAVE
10838 IF PORTB.4 = 0 Then GoTo HOME_SW
10839 IF PORTB.5 = 0 Then GoTo NEXT_SW
10842 IF PORTB.6 = 0 Then GoTo ALTER_SW
10845 IF PORTB.7 = 0 Then GoTo MENU_SW
10850 IF PORTB.4 = 0 Then GoTo START_SW
I want to do the same referencing as above, but first I want to remove any text between the line number and the word GoTo.

Thoughts?
 
Old 10-06-2004, 03:23 PM   #5
jmings
Quote:
Originally posted by cmfarley19
Excellent. That did the job.
Now for my next feat...

I have a file (sub_names_alpha.txt) that contains the list of line numbers and sorted label names. It is in the form of
N LABEL:
Code:
3692 TEMP_TOO_HI:
3695 TEMP_TOO_LOW:
7781 TEST_CHECKSUM:
4924 TEST_CYCLES_CNT:
3704 test_m:
10200 TEST_VACS:  'DEACTIVATE ANY DETECTED VAC STA OR ENABLE ANY NON DETECTED
1826 TM_OUT1:
2406 TM_OUT2:
2985 TM_OUT3:
3564 TM_OUT4:
...
...
...
5288 vacuumb:
5296 VACUUM_LP:
10463 vacuum_on:
5039 versions:
8500 VIBRATE:
8492 VIBRATOR_OFF:
8484 VIBRATOR_ON:
9951 WAIT_CARO_DN:
Note the line with the trailing comment. There are several lines in the file that have trailing comments.
I want to remove all trailing comments.

sed "s/ *'.*//" <sub_names_alpha.txt >sub_names_alpha.out
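A quick way to check that expression on a couple of lines before running it on the whole file (a sketch; the two sample lines are taken from the list quoted above, one of them shortened):

```shell
# Two sample lines in the "N LABEL:" layout, one with a trailing comment.
input=$(printf '%s\n' "10200 TEST_VACS:  'DEACTIVATE ANY DETECTED VAC STA" "1826 TM_OUT1:")

# " *'.*" matches any spaces before the first apostrophe, the apostrophe
# itself, and everything after it; lines without an apostrophe pass through.
cleaned=$(printf '%s\n' "$input" | sed "s/ *'.*//")
echo "$cleaned"
```

This cuts at the first apostrophe on the line, which is fine for a list of labels but would also truncate any line that carries an apostrophe for another reason.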
Quote:
then...
I have another file (subcalls_alpha.txt) that contains a similar list of all calls to subroutines (GoSub's)
Code:
3734 GoSub VAC4_ON
9023 GoSub VAC4_ON
4720 GoSub VACUUM1_ON
4723 GoSub VACUUM2_ON
4726 GoSub VACUUM3_ON
4729 GoSub VACUUM4_ON
1020 GoSub VIBRATE
3405 GoSub VIBRATE
3712 GoSub VIBRATE
1330 GoSub VIBRATOR_OFF
1537 GoSub VIBRATOR_OFF
1857 GoSub VIBRATOR_OFF
2060 GoSub VIBRATOR_OFF
2382 GoSub VIBRATOR_OFF
I want to look at each sub label in sub_names_alpha.txt and record in a new file the line numbers from which that subroutine is called.
For example, when I find the sub name VIBRATE in sub_names_alpha.txt, I look it up in subcalls_alpha.txt and see that it is called from lines 1020, 3405 and 3712. I then want to write a line in the following format to a new file:

Code:
8500 VIBRATE: 1020 3405 3712
Code:
wizard@5[~]$ cat sub_names_alpha.txt
3734 GoSub VAC4_ON
9023 GoSub VAC4_ON
4720 GoSub VACUUM1_ON
4723 GoSub VACUUM2_ON
4726 GoSub VACUUM3_ON
4729 GoSub VACUUM4_ON
1020 GoSub VIBRATE
3405 GoSub VIBRATE
3712 GoSub VIBRATE
1330 GoSub VIBRATOR_OFF
1537 GoSub VIBRATOR_OFF
1857 GoSub VIBRATOR_OFF
2060 GoSub VIBRATOR_OFF
2382 GoSub VIBRATOR_OFF
wizard@5[~]$ cat sub_names_alpha.sh
#!/bin/bash
# clear out the results file
>sub_names_alpha.out
# drop the line number (field 1) and build a sorted, unique list of names
sed 's/\[.*\]//' sub_names_alpha.txt | cut -d' ' -f2- | sort -f | uniq >sub_names_alpha.tmp
while read -r SUB
do
  # collect the line numbers (field 1) of every line that mentions this name
  LINES=`grep " $SUB" sub_names_alpha.txt | cut -d' ' -f1`
  # $LINES is left unquoted on purpose so the numbers end up on one line
  echo $SUB: $LINES>>sub_names_alpha.out
done <sub_names_alpha.tmp
rm -fv sub_names_alpha.tmp
echo "Results in sub_names_alpha.out"



wizard@5[~]$ sh sub_names_alpha.sh
removed `sub_names_alpha.tmp'
Results in sub_names_alpha.out
wizard@5[~]$ cat sub_names_alpha.out
GoSub VAC4_ON: 3734 9023
GoSub VACUUM1_ON: 4720
GoSub VACUUM2_ON: 4723
GoSub VACUUM3_ON: 4726
GoSub VACUUM4_ON: 4729
GoSub VIBRATE: 1020 3405 3712
GoSub VIBRATOR_OFF: 1330 1537 1857 2060 2382
wizard@5[~]$
Quote:

Any hints are appreciated!!!
Read the code and look up anything you don't understand.
Quote:
I can post the entire files if anyone would like. They are pretty big though.
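The script above groups the call sites by "GoSub NAME" but does not attach the label's own line number, which the requested format (8500 VIBRATE: 1020 3405 3712) needs. One way to join the two files is a two-pass awk program. This is only a sketch: it assumes both files are laid out as in the samples, that trailing comments have already been stripped, and that names should match regardless of case (the mixed-case samples in this thread suggest so). The temporary files stand in for sub_names_alpha.txt and subcalls_alpha.txt.

```shell
# Hypothetical stand-ins for sub_names_alpha.txt and subcalls_alpha.txt.
names=$(mktemp); calls=$(mktemp)
printf '%s\n' '8500 VIBRATE:' '8492 VIBRATOR_OFF:' '5039 versions:' >"$names"
printf '%s\n' '1020 GoSub VIBRATE' '3405 GoSub VIBRATE' '3712 GoSub VIBRATE' \
  '1330 GoSub VIBRATOR_OFF' >"$calls"

# Pass 1 (calls file): remember the calling line numbers per upper-cased name.
# Pass 2 (names file): print each label line followed by its call sites.
xref=$(awk '
  NR == FNR { key = toupper($NF); sites[key] = sites[key] " " $1; next }
  { key = toupper($2); sub(/:$/, "", key); print $1 " " $2 sites[key] }
' "$calls" "$names")
echo "$xref"
rm -f "$names" "$calls"
```

Labels that are never called come out with nothing after the colon, which makes dead subroutines easy to spot. Because the name is taken from the last field, the same program should also work on a list of GoTo lines.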
 
Old 10-06-2004, 03:27 PM   #6
jmings
Pipe it through
sed 's/ .*GoTo/ GoTo/'
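A sketch of that substitution on a few lines from goto_calls.txt, written so that one space is kept between the line number and GoTo (the pattern consumes the first space, so the replacement has to put it back):

```shell
# Sample lines in the goto_calls.txt layout.
input=$(printf '%s\n' '10597 IF TEMP1.1 = 0 Then GoTo S1_LOW' '10599 GoTo DS0' '10769 GoTo  EESAVE')

# Replace everything from the first space up to and including "GoTo"
# with " GoTo", so the number and the keyword stay separated.
trimmed=$(printf '%s\n' "$input" | sed 's/ .*GoTo/ GoTo/')
echo "$trimmed"
```

The match is case-sensitive, so a line written with GOTO would pass through unchanged; GNU sed accepts an I flag on the s command if that matters.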
 
Old 10-06-2004, 03:29 PM   #7
jmings
Post #6 was in reference to post #4 above.
 
Old 10-07-2004, 08:54 AM   #8
cmfarley19
jmings:
Very grateful. Thank you.

I have been playing with this most of the morning and I have hit a snag.

In reference to post 5:
I have modified your script (thank you) a bit to accommodate some file discrepancies I had.
It now looks like this:
Code:
[cfarley@wombat tmp]$ cat subs.sh
#!/bin/bash
# clear out file
>subs.out
cat sub_names_master.txt | sed 's/\[.*\]'// | cut -d' ' -f2- | cut -d: -f1 | sort -f | uniq > subs.tmp
cat subs.tmp | \
while true
do
  read SUB
  if [ $? -ne 0 ]; then break; fi
#  It seems to be choking on this line
#  LINES=`grep " $SUB" subcalls_alpha.txt | cut -d' ' -f1 '`
  LINES=`cat subcalls_alpha.txt | grep " $SUB" | cut -d' ' -f1 '`
  echo $SUB: $LINES>>subs.out
done
rm -fv cat subs.tmp
echo "Results in subs.out"
I am getting this error for each iteration of the loop:
Code:
./subs.sh: command substitution: line 1: unexpected EOF while looking for matching `''
./subs.sh: command substitution: line 2: syntax error: unexpected end of file
I get the same error regardless of what version of the "grep" line above is used.

Any thoughts?
 
Old 10-07-2004, 12:08 PM   #9
cmfarley19
OK.

A little searching and I figured that one out.
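The thread does not say what the fix was. The error message complains about an unmatched single quote, and the LINES= line in the script does end with a stray ' after -f1, so removing it is the likely cure. A sketch of the loop under that assumption, with made-up sample data in place of subs.tmp and subcalls_alpha.txt; the variable name nums and the end-of-line anchor on the name are additions that are not in the original script:

```shell
# Hypothetical stand-ins for subs.tmp and subcalls_alpha.txt.
subs=$(mktemp); calls=$(mktemp)
printf '%s\n' 'VIBRATE' 'VIBRATOR_OFF' >"$subs"
printf '%s\n' '1020 GoSub VIBRATE' '3405 GoSub VIBRATE' '1330 GoSub VIBRATOR_OFF' >"$calls"

result=$(
  while read -r SUB
  do
    # No stray quote after -f1. The \$ anchors the name at the end of the
    # line so that one name cannot match a longer name that starts the same.
    nums=$(grep " $SUB\$" "$calls" | cut -d' ' -f1)
    # $nums is unquoted on purpose so the numbers collapse onto one line.
    echo "$SUB:" $nums
  done <"$subs"
)
echo "$result"
rm -f "$subs" "$calls"
```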
 
Old 11-18-2004, 07:44 AM   #10
cmfarley19
Back with another question...

I need help figuring out the regex for this. Should be simple but I'm not getting it. I need a sed expression that will filter out lines where the first non-whitespace character is a comment ( ' ) .
Ex:
Code:
  326  SET_VALUES:                                     ' IF NOT LOAD DEFUALT
327     'IF tone = 2 THEN  GOTO  DRAWER_M            'NEXT BUTTON 
328     'IF tone = 2 THEN  GOTO  DRAWER_M            'NEXT BUTTON 
329     'IF tone = 2 THEN  GOTO  DRAWER_M            'NEXT BUTTON 
330     IF tone = 6 Then  GoTo  DRAWER_M                'UP BUTTON
331     'IF tone = 2 THEN  GOTO  DRAWER_M        'NEXT BUTTON 
332     'IF tone = 2 THEN  GOTO  DRAWER_M             'NEXT BUTTON 
333     IF tone = 13 Then  GoTo DRAWER_M         'DWN BUTTON
334     'IF tone = 13 THEN  GOTO drawer_m        'DWN BUTTON
So filter out lines 327, 328, 329, 331, 332, 334 and output lines 326, 330, 333
Code:
cat -n file.bas | sed <????>
I just realised as I was typing this that it will need to ignore the line numbers. So restated the sed statement should filter out lines where the first non-whitespace character after the line number is a comment ( ' ) .
 
Old 11-18-2004, 09:01 AM   #11
jmings
I don't understand why the '-n' in the cat ... now you've got 2 sets of numbers. Also, grep is the usual tool for filtering out lines (sed can delete lines too, but grep -v is simpler)...
Code:
wizard@2[~]$ cat file.bas
327     'IF tone = 2 THEN  GOTO  DRAWER_M            'NEXT BUTTON
328     'IF tone = 2 THEN  GOTO  DRAWER_M            'NEXT BUTTON
329     'IF tone = 2 THEN  GOTO  DRAWER_M            'NEXT BUTTON
330     IF tone = 6 Then  GoTo  DRAWER_M                'UP BUTTON
331     'IF tone = 2 THEN  GOTO  DRAWER_M        'NEXT BUTTON
332     'IF tone = 2 THEN  GOTO  DRAWER_M             'NEXT BUTTON
333     IF tone = 13 Then  GoTo DRAWER_M         'DWN BUTTON
334     'IF tone = 13 THEN  GOTO drawer_m        'DWN BUTTON
wizard@2[~]$ cat file.sh
cat -n file.bas | grep -v "^[0-9[:space:]]*'"
wizard@2[~]$ sh file.sh
     4  330     IF tone = 6 Then  GoTo  DRAWER_M                'UP BUTTON
     7  333     IF tone = 13 Then  GoTo DRAWER_M         'DWN BUTTON
wizard@2[~]$
The bracket expression covers digits, spaces and tabs. cat -n puts a tab after the line number, so matching spaces alone is not enough.


YMMV
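Since the question asked for sed, the same filter can also be written with sed's d command. A sketch on three lines adapted from the question, using [[:space:]] for the whitespace:

```shell
# A label line with a trailing comment, a commented-out line, and a live line.
input=$(printf '%s\n' \
  "326  SET_VALUES:      ' IF NOT LOAD DEFUALT" \
  "327     'IF tone = 2 THEN  GOTO  DRAWER_M   'NEXT BUTTON" \
  "330     IF tone = 6 Then  GoTo  DRAWER_M    'UP BUTTON")

# Delete lines whose first character after the number and whitespace is an
# apostrophe; lines that only have a trailing comment are left alone.
kept=$(printf '%s\n' "$input" | sed "/^[0-9[:space:]]*'/d")
echo "$kept"
```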
 
Old 11-18-2004, 09:08 AM   #12
cmfarley19
Another Update...
I have so far come up with this:
Code:
cat -n hrc_60.bas | sed "s/\d*\s*'.*//" | grep -iP "(draw|load)" | cat -n
This is ignoring the line number.
This removes ANY line with a comment.
I am trying to remove only lines that start with comments (non-executed lines).
 
Old 11-18-2004, 09:26 AM   #13
cmfarley19
jmings,
The basic file itself does not have line numbers. I am piping several commands to get my desired output.
I first cat the file with the -n parameter to output the file with line numbers. That way I know what line number to go to in my editor.
Code:
cat -n hrc_60.bas
I next want to only see the lines that contain the strings "draw" or "load" so I pipe it to:
Code:
grep -iP "(draw|load)"
That produces 429 lines. A bunch of those lines are commented out so I want to eliminate commented out lines.
I do not, however, want to eliminate lines that contain trailing comments. (see post #10)

That's where my sed command comes in.

Lastly I pipe the whole mess to:
Code:
cat -n
to get a count of how many instances of "draw" or "load" occur.

So the whole thing looks something like:
Code:
cat -n hrc_60.bas | sed "s/\d*\s*'.*//" | grep -iP "(draw|load)" | cat -n
Does that make what I'm trying to do any clearer?
 
Old 11-18-2004, 05:06 PM   #14
jmings
Use the grep regex in my previous post but move the cat -n to the end of it.
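Put together, the pipeline might look like the sketch below. It assumes the source file has no line numbers of its own, as described in post #13, and keeps the first cat -n so the original line numbers survive. grep -E is used in place of -P because plain alternation does not need Perl syntax. The temporary file is a made-up stand-in for hrc_60.bas.

```shell
# Hypothetical stand-in for hrc_60.bas.
src=$(mktemp)
printf '%s\n' \
  "SET_VALUES:   ' IF NOT LOAD DEFUALT" \
  "    'IF tone = 2 THEN  GOTO  DRAWER_M   'NEXT BUTTON" \
  "    IF tone = 6 Then  GoTo  DRAWER_M    'UP BUTTON" \
  "    GoSub VIBRATE" >"$src"

# 1. number the source lines   2. drop commented-out lines
# 3. keep lines mentioning draw/load   4. number the survivors for a count
report=$(cat -n "$src" | grep -v "^[0-9[:space:]]*'" | grep -iE "draw|load" | cat -n)
echo "$report"
rm -f "$src"
```

Note that the first sample line is kept only because LOAD appears in its trailing comment, so the count can still include lines where the keyword sits in a comment rather than in code.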
 
  

