LinuxQuestions.org
Old 08-01-2013, 01:02 PM   #1
apottere
LQ Newbie
 
Registered: Jul 2013
Posts: 15

Rep: Reputation: Disabled
Bash/awk extract function from file.


I'm trying to write a one-liner that can extract a certain bash function from a file. This code will extract the function test() from test.sh:

Code:
cat test.sh | awk '/test()/ {start=1} /{/ {if (start == 1) { line=$0; while(sub("{", "", line)) brackets++}} /}/ {if (start == 1) { line2=$0; while(sub("}", "", line2)) brackets--}} {if (start == 1) if (brackets > 0) print $0; else { print; exit}}'
This uses awk to balance the braces after the function name is found, and quits when they're balanced. Is there a prettier way of doing this? It works perfectly fine; I just didn't know if there was something "better" that I overlooked.

Edit: "perfectly fine" is not necessarily correct, commented brackets will still get counted.
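
For readers who find the one-liner hard to follow, here is a minimal sketch of the same brace-counting idea written as a multi-line awk program. The helper name extract_fn and the sample script are assumptions made for illustration, not part of the original post, and the sketch shares the limitation noted above: braces in comments or strings are still counted.

```shell
# Hypothetical sample script held in a variable for the demo.
sample='#!/bin/bash
test2() { echo no; }
test() {
    { echo nested; } >/dev/null
}
echo bar'

extract_fn() {   # usage: extract_fn NAME < script  (hypothetical helper)
    awk -v name="$1" '
        BEGIN {
            # Matches "NAME()" or "function NAME" at the start of a line.
            pat = "^[ \t]*(function[ \t]+" name "([ \t({]|$)|" name "[ \t]*[(][)])"
        }
        !started && $0 ~ pat { started = 1 }
        started {
            print
            line = $0
            opens  = gsub(/[{]/, "", line)   # gsub returns how many matches it replaced
            closes = gsub(/[}]/, "", line)
            depth += opens - closes
            if (opens > 0) seen_open = 1
            # Stop once at least one brace was opened and all are closed again.
            if (seen_open && depth <= 0) exit
        }
    '
}

printf '%s\n' "$sample" | extract_fn test
```

Using gsub's return value avoids the while(sub(...)) loops, and anchoring the name pattern keeps test() from also matching test2().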

Last edited by apottere; 08-01-2013 at 01:03 PM.
 
Old 08-01-2013, 03:44 PM   #2
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,604

Rep: Reputation: 1241
Well... it won't handle the case where the brace ("{") may be inside quotes...
 
Old 08-01-2013, 03:55 PM   #3
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Rep: Reputation: 444
personally I use sed for things like this

Code:
sed '/test()/,/^}$/!d' Infile
 
Old 08-01-2013, 06:58 PM   #4
apottere
LQ Newbie
 
Registered: Jul 2013
Posts: 15

Original Poster
Rep: Reputation: Disabled
Quote:
Code:
sed '/test()/,/^}$/!d' Infile
This doesn't do anything, and what it looks like it's trying to do is just as buggy as the original. What if there's a bracket on its own line halfway through the function?

Quote:
Well... it won't handle the case where the brace ("{") may be inside quotes...
Yeah, this too. I just can't think of anything better.
 
Old 08-01-2013, 09:46 PM   #5
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Rep: Reputation: 444
Quote:
Originally Posted by apottere
This doesn't do anything, and what it looks like it's trying to do is just as buggy as the original. What if there's a bracket on its own line halfway through the function?


Yeah, this too. I just can't think of anything better.
???

Code:
mkdir Wtf
cd Wtf
cat > test.sh << "EOF"
#!/bin/bash
function test() {
echo foo
}

test2(){
echo muppet
}
echo bar
EOF
sed '/test()/,/^}$/!d' test.sh
works just fine
 
Old 08-02-2013, 02:24 AM   #6
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
@Firerat: That won't work with functions like
Code:
function x {
    {
        true
        false
    } >/path/file.xyz
}
And can't be consistent with one-line functions:
Code:
function x { true; }
I think a real parser that recognizes general shell-scripting syntax and reads the whole script consistently would be a good solution to this. Try other scripting languages perhaps. I favor Ruby.

Edit: Oh yes, your code could work with conventionally formatted functions, but some people do this:
Code:
function x {
{
    true
    false
}
}
And some do this as well:
Code:
if tests; then
    # create this version of a function
    function x {
        echo a
    }
else
    # do this instead
    function x {
        echo b
    }
fi
Even this:
Code:
function x {
    xyz << EOF
{
some texts
}
around
EOF
}
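
One "real parser" that is always at hand is bash itself. The sketch below is a swapped-in technique, not something proposed in this thread: it sources the script in a separate bash process and asks declare -f to print the parsed function, so quotes, nested groups, and heredocs are handled by the shell's own grammar. The helper name extract_with_bash and the sample script are assumptions. Big caveat: sourcing runs the script's top-level commands, so only do this with scripts you trust, and a script that calls exit at top level will end the child process before the function is printed.

```shell
# Hypothetical sample script with a brace inside a string and a heredoc.
script=$(mktemp)
cat > "$script" <<'OUTER'
x() {
    echo "a { brace in a string"
    cat <<EOF
}
EOF
}
echo "top-level code runs too"
OUTER

extract_with_bash() {   # usage: extract_with_bash NAME FILE  (hypothetical helper)
    # A child bash keeps the sourced definitions out of the current shell.
    # The script's own output is discarded; only declare -f prints.
    bash -c 'source "$1" >/dev/null 2>&1; declare -f "$2"' _ "$2" "$1"
}

result=$(extract_with_bash x "$script")
rm -f "$script"
printf '%s\n' "$result"
```

Note that declare -f prints the function as bash has parsed it, so whitespace and layout may differ from the source file and comments inside the function are lost.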

Last edited by konsolebox; 08-02-2013 at 02:31 AM.
 
Old 08-02-2013, 03:44 AM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
From a language-agnostic point of view, I would suggest counting the braces: once the left and right counts match (assuming the code is a working function without errors), you should be at the end of your function.

As konsolebox has pointed out, there are several different formats you may need to contend with to get the count correct.
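
To make the counting idea concrete, here is a small sketch that prints the running brace depth next to each line of a function. The helper name count_char and the sample function are assumptions for illustration. It shows why a closing brace partway through the function does not end it: the depth only returns to 0 on the last line.

```shell
count_char() {   # usage: count_char CHAR STRING  (hypothetical helper)
    # tr -cd deletes every character except CHAR; wc -c counts what is left.
    # The $(( )) wrapper strips the padding some wc implementations add.
    echo $(( $(printf '%s' "$2" | tr -cd "$1" | wc -c) ))
}

depth=0
while IFS= read -r line; do
    depth=$(( depth + $(count_char '{' "$line") - $(count_char '}' "$line") ))
    printf '%d  %s\n' "$depth" "$line"
done <<'EOF'
x() {
    { true; } >/dev/null
}
EOF
```

The same loop miscounts braces that sit in strings, comments, or heredocs, which is exactly the set of formats konsolebox listed.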
 
Old 08-02-2013, 04:35 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
So I had a quick think about it; this is the sort of thing I would look at:

Code:
ruby -ne 'BEGIN{l=r=0};if $_ =~ /test/; until l > 0 && l == r; l+=$_.scan(/{/).size;r+=$_.scan(/}/).size;print $_;gets;end;break;end' file
 
Old 08-02-2013, 05:41 AM   #9
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Rep: Reputation: 444
Quote:
Originally Posted by konsolebox
@Firerat: That won't work with functions like
..
ahh, I see now
 
Old 08-02-2013, 09:39 AM   #10
apottere
LQ Newbie
 
Registered: Jul 2013
Posts: 15

Original Poster
Rep: Reputation: Disabled
Fixed my original with awk:

Code:
cat test.sh | awk -v sq="'" '/test()/ {line=$0; gsub("\"[^\"]*\"", "", line); gsub(sq"[^"sq"]*"sq, "", line); gsub("#.*$", "", line); if( sub("test()", "", line) ) {start=1; brackets=0}} /function test/ {line=$0; gsub("\"[^\"]*\"", "", line); gsub(sq"[^"sq"]*"sq, "", line); gsub("#.*$", "", line); if( sub("test()", "", line) ) {start=1; brackets=0}} /{/ {if (start == 1) { line=$0; gsub("\"[^\"]*\"", "", line); gsub(sq"[^"sq"]*"sq, "", line); gsub("#.*$", "", line); while(sub("{", "", line)) { brackets++ }}} /}/ {if (start == 1) { line2=$0; gsub("\"[^\"]*\"", "", line2); gsub(sq"[^"sq"]*"sq, "", line2); gsub("#.*$", "", line2); while(sub("}", "", line2)) brackets--}} {if (start == 1) if (brackets > 0) print $0" - "brackets; else { print $0" - "brackets" - EXITING"; exit}}'

I think
Code:
gsub("\"[^\"]*\"", "", line); gsub(sq"[^"sq"]*"sq, "", line); gsub("#.*$", "", line);
could be factored into a function, but this will strip out all strings and then all comments before checking for the function name or brackets. I've tried it with all the tests here and it seems to work.
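
A minimal sketch of what that refactoring could look like, with the three gsub calls moved into one awk function. The names strip_noise and strip are assumptions for illustration; the regular expressions are the ones from the post above. It only prints the stripped lines so the effect is easy to see.

```shell
strip_noise() {   # reads a script on stdin  (hypothetical helper)
    awk -v sq="'" '
        # Remove double-quoted strings, single-quoted strings, then comments.
        # s is passed by value, so the caller keeps its original line.
        function strip(s) {
            gsub(/"[^"]*"/, "", s)
            gsub(sq "[^" sq "]*" sq, "", s)
            gsub(/#.*$/, "", s)
            return s
        }
        { print strip($0) }
    '
}

# Every brace on this line is inside a string or a comment.
printf '%s\n' "echo \"{\" '}' # {" | strip_noise
```

One caveat with the comment pattern: it treats any # as the start of a comment, so lines using $# or ${#var} lose everything after the # as well.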
 
Old 08-02-2013, 09:51 AM   #11
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,604

Rep: Reputation: 1241
That should work nearly all the time. The most unusual cases will be where the braces are escaped (happens with find commands, and xargs).
 
Old 08-02-2013, 09:55 AM   #12
apottere
LQ Newbie
 
Registered: Jul 2013
Posts: 15

Original Poster
Rep: Reputation: Disabled
I'll code this out later today, but would:

1. Removing everything inside backticks and $()
2. Removing all occurrences of \{ and \}

cover all cases? Is there any other way to escape a bracket?

Edit: I'll also have to remove escaped quotes before I remove strings, too.
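
The ordering described above can be sketched as a sed pipeline: drop backslash-escaped characters first (which covers \{, \}, and escaped quotes), then strings, then backticks, then comments. The helper name strip_escapes is an assumption for illustration. The sketch deliberately leaves $() alone, because command substitutions can nest and contain their own quotes, which a single regular expression cannot balance.

```shell
strip_escapes() {   # reads a script on stdin  (hypothetical helper)
    sed -e 's/\\.//g' \
        -e 's/"[^"]*"//g' \
        -e "s/'[^']*'//g" \
        -e 's/`[^`]*`//g' \
        -e 's/#.*$//'
}

# First line: escaped braces as used with find. Second line: an escaped
# quote inside a string, followed by one real opening brace.
printf '%s\n' 'find . -exec echo \{\} \; # }' 'echo "a \" { b" {' | strip_escapes
```

Only the real brace at the end of the second line should survive. A known weak spot: inside single quotes a backslash is literal in bash, so a string such as 'a\' loses its closing quote in the first step.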

Last edited by apottere; 08-02-2013 at 09:57 AM.
 
Old 08-02-2013, 10:18 AM   #13
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,604

Rep: Reputation: 1241
The only remaining problem would be typos in the file... These could cause problems with counting (missing quote, done keyword, close paren...). These don't HAVE to be detected by the shell defining the functions... until the function is actually invoked. But they could cause problems for the scanner.

BTW, as for other ways of escaping: yes, "here" documents are another...

Last edited by jpollard; 08-02-2013 at 10:20 AM.
 
Old 08-02-2013, 11:17 AM   #14
apottere
LQ Newbie
 
Registered: Jul 2013
Posts: 15

Original Poster
Rep: Reputation: Disabled
Dang, forgot about those. Also, I'm not going to worry about typos, if it's broken then it'll break hard. Heredocs on the other hand...

Edit: If there's no way to end a heredoc other than the specified string on a line by itself, it may not be that bad after all.
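
A rough sketch of that heredoc idea: remember the terminator word when a << appears, then drop every line up to and including the line that consists of that word alone. The helper name skip_heredocs is an assumption for illustration. It handles <<WORD, << WORD, <<-WORD (leading tabs allowed before the terminator), and quoted words, and it ignores <<< here-strings. It does not handle two heredocs started on one line, and a bit shift inside arithmetic such as $(( 1 << n )) would be mistaken for a heredoc.

```shell
skip_heredocs() {   # reads a script on stdin  (hypothetical helper)
    awk -v sq="'" '
        BEGIN {
            opener = "<<-?[ \t]*[\"" sq "]?"        # "<<", optional "-", optional quote
            pat = opener "[A-Za-z_][A-Za-z_0-9]*"   # followed by the terminator word
        }
        heredoc {
            line = $0
            if (strip_tabs) sub(/^\t+/, "", line)
            if (line == delim) heredoc = 0          # terminator alone on its line
            next                                    # drop body and terminator
        }
        {
            print
            line = $0
            gsub(/<<</, "", line)                   # ignore here-strings
            if (match(line, pat)) {
                delim = substr(line, RSTART, RLENGTH)
                strip_tabs = (delim ~ /^<<-/)
                sub("^" opener, "", delim)          # keep only the word itself
                heredoc = 1
            }
        }
    '
}

# Hypothetical sample, taken from the heredoc case posted earlier in the thread.
sample='function x {
    xyz << EOF
{
some texts
}
around
EOF
}'
printf '%s\n' "$sample" | skip_heredocs
```

The output keeps the line that starts the heredoc and the function's real closing brace, so a brace counter fed with this output sees one { and one }.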

Last edited by apottere; 08-02-2013 at 11:23 AM.
 
Old 08-02-2013, 11:37 AM   #15
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,248
Blog Entries: 8

Rep: Reputation: 235
You should also consider that awk generally processes text line by line and not character by character, unlike other languages. An example is my attempt to parse scripts like this. It works and is made complex, but is still limited.

compiler.gawk
All times are GMT -5. The time now is 07:47 PM.