detect shell script language
Hi
I need to detect the actual programming language of a script. One way is to examine the first line for the "sha-bang" (#!), e.g., #!/bin/bash or #!/usr/bin/perl. However, there are cases where this is not enough: the script starts with #!/bin/sh but is actually written (and interpreted) in another language, e.g., Tcl. So my question is: is there another way of detecting the actual language, i.e., another convention? Someone told me that one possibility is to put a pattern like -*- <interpreter> -*- on the second line, and I have found some Tcl scripts that look like this. Is there a specific standard convention? Many thanks in advance |
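A minimal sketch of the two checks described in the question, written in Python purely for illustration. The function names and the two-line limit are assumptions made here, not an established convention; the only marker forms handled are "-*- tcl -*-" and "-*- mode: tcl; ... -*-".

```python
import re
from typing import Optional


def language_from_shebang(line: str) -> Optional[str]:
    """Return the interpreter name from a '#!' line, or None."""
    if not line.startswith("#!"):
        return None
    parts = line[2:].split()
    if not parts:
        return None
    name = parts[0].rsplit("/", 1)[-1]
    # "#!/usr/bin/env perl" names the real interpreter in the first argument.
    if name == "env" and len(parts) > 1:
        name = parts[1].rsplit("/", 1)[-1]
    return name


def language_from_modeline(line: str) -> Optional[str]:
    """Return the language named in a '-*- ... -*-' marker, or None."""
    match = re.search(r"-\*-(.*?)-\*-", line)
    if not match:
        return None
    body = match.group(1).strip()
    if ":" not in body:
        # Short form: "-*- tcl -*-"
        return body.lower() or None
    # Long form: "-*- mode: tcl; coding: utf-8 -*-"
    for field in body.split(";"):
        key, _, value = field.partition(":")
        if key.strip().lower() == "mode":
            return value.strip().lower() or None
    return None


def guess_language(text: str, max_lines: int = 2) -> Optional[str]:
    """Prefer an explicit -*- marker; fall back to the shebang."""
    lines = text.splitlines()[:max_lines]
    for line in lines:
        lang = language_from_modeline(line)
        if lang:
            return lang
    if lines:
        return language_from_shebang(lines[0])
    return None
```

With this sketch, a file starting with "#!/bin/sh" followed by "# -*- tcl -*-" is reported as tcl, while a file with only "#!/bin/bash" is reported as bash. Neither check is guaranteed to be right; both only read what the author chose to declare.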
Hi,
Did you take a look at the file command? Example:
    cat a01 a02 a03
    #!/bin/bash
    #!/usr/bin/perl
    #!/bin/csh
    file a01 a02 a03
    a01: Bourne-Again shell script text executable
    a02: perl script text executable
    a03: C shell script text executable
I don't know if this solves your Tcl/interpreter problem, but you can give it a try. Hope this helps. |
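If the check has to run from inside another program, file can be called as a subprocess. The following is only a sketch in Python: the helper name describe_with_file is made up here, and it assumes a file binary on the PATH that accepts -b (brief output, without the leading file name). It returns None when file is missing or fails, so the caller can fall back to other heuristics.

```python
import shutil
import subprocess
from typing import Optional


def describe_with_file(path: str) -> Optional[str]:
    """Return the one-line description printed by file(1), or None."""
    # Not every system has file installed, so check before calling it.
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        ["file", "-b", "--", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

The exact wording of the description differs between versions of file, so any matching on the returned string should be loose (for example, looking for "perl" or "shell script" as a substring).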
Also, a script can be written in such a way that it is compatible with more than one language.
|
So I can only use some heuristics (like the -*- marker), which are not guaranteed to work every time... I need these heuristics to determine the source language so that the software I maintain, GNU Source-highlight (http://www.gnu.org/software/src-highlite/), can use the correct highlighting. I noticed that Emacs somehow detects the language pretty accurately (without an extension and with a "wrong" shebang), so I'll have to check what it's doing. Lorenzo |
Hi.
This is a question that I have considered from time to time. If there could be a single syntax for calling, say a standard script, and getting the result, one could look at processes, and try to get (at least one parent) that was the process responsible for the current (part of a) script. I have not had much luck. To depend on the sh-bang, internal comments, etc., is unreliable, but probably better than nothing ... cheers, makyo |
http://www.phys.ufl.edu/docs/emacs/emacs_201.html |
Code:
#!/bin/sh |
Hi.
My recollection is that statements like:
Code:
exec wish "$0" -- "$@"
You can try (adjusting the path for your system):
Code:
If you create a Tcl script in a file whose first line is
If that can be done, then file will be able to report something useful ... cheers, makyo |
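For scripts that keep #!/bin/sh and then hand themselves over to another interpreter with a line like the exec statement above, a highlighter could look for that hand-over line. A minimal sketch in Python; the function name, the regular expression and the ten-line limit are assumptions made for illustration, and it only catches the plain form exec <interpreter> ... $0.

```python
import re
from typing import Optional

# Matches lines such as:  exec wish "$0" -- "$@"   or   exec /usr/bin/tclsh "$0" "$@"
EXEC_SELF_RE = re.compile(r'^\s*exec\s+(\S+)\s+.*\$0')


def reexec_interpreter(text: str, max_lines: int = 10) -> Optional[str]:
    """Return the interpreter a shell-started script re-executes itself with."""
    lines = text.splitlines()
    # Only scripts that start with a shebang are considered.
    if not lines or not lines[0].startswith("#!"):
        return None
    for line in lines[1:max_lines]:
        match = EXEC_SELF_RE.match(line)
        if match:
            # Keep only the program name, e.g. "/usr/bin/tclsh" -> "tclsh".
            return match.group(1).rsplit("/", 1)[-1]
    return None


if __name__ == "__main__":
    sample = (
        "#!/bin/sh\n"
        "# the next line restarts using wish \\\n"
        'exec wish "$0" -- "$@"\n'
        "button .b -text hello\n"
    )
    print(reexec_interpreter(sample))
```

A result such as wish or tclsh suggests Tcl; mapping interpreter names to languages is left to the caller, since that table depends on which languages the tool supports.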
Actually, as I said above, I need this recognition for a program that highlights the syntax of other programs, which I don't write myself...
|
Hi, Lorenzo.
I looked at the src-highlight web page -- s.h. seems to know a lot about a lot of languages. The very flexibility of scripts being semi-structured is the cause of us not being able to easily tell the language, especially when you can have the shell call tclsh, awk, perl, etc. I don't see any way other than looking for key strings, pieces of syntax, etc., that are peculiar to one language or another. You could allow the caller to specify the language if you cannot guess it correctly. I like that kind of work, and I have had a bit of exposure to parsing -- long ago I wrote a SNOBOL program that converted PL/1 into Fortran. Lately, I used a parser that filtered language elements with the use of a truth table -- one looked at a token, checked the table, and the result would exclude a number of possibilities, then the table entry might link to another entry, etc. That was part of a restructurizer, a pretty-printer. I think the University of Colorado was known for work like that. Fordham has also produced some tools along that line. However, I don't know of any specific place these days. Interesting problem, and I'll keep an eye on src-highlight to track the progress. Best wishes ... cheers, makyo |
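The "key strings" idea, together with letting the caller name the language, could look roughly like the sketch below. It is Python, chosen only for illustration, and it uses simple pattern counting rather than the truth-table parser mentioned above. The SIGNATURES table is hypothetical and deliberately tiny; a real tool would need many more patterns per language and would still guess wrong on some inputs.

```python
import re
from typing import Dict, List, Optional

# Hypothetical signature table: a few line patterns typical of each language.
SIGNATURES: Dict[str, List[str]] = {
    "tcl": [r"^\s*proc\s+\w+\s*\{", r"^\s*puts\s", r"^\s*set\s+\w+\s"],
    "perl": [r"^\s*use\s+strict\b", r"^\s*my\s+[$@%]\w+", r"^\s*sub\s+\w+\s*\{"],
    "sh": [r"^\s*fi\s*$", r"^\s*esac\s*$", r"^\s*\w+=\S*\s*$", r"\$\{\w+\}"],
}


def score_languages(text: str) -> Dict[str, int]:
    """Count how many times each language's patterns occur in the text."""
    return {
        lang: sum(len(re.findall(p, text, flags=re.MULTILINE)) for p in patterns)
        for lang, patterns in SIGNATURES.items()
    }


def infer_language(text: str, override: Optional[str] = None) -> Optional[str]:
    """Return the caller's choice if given, else the best-scoring language."""
    if override:
        return override
    scores = score_languages(text)
    ranked = sorted(scores.values(), reverse=True)
    # Refuse to guess when there is no evidence or the top two are tied.
    if ranked[0] == 0 or (len(ranked) > 1 and ranked[0] == ranked[1]):
        return None
    return max(scores, key=lambda lang: scores[lang])
```

Returning None on a tie or on no evidence keeps the guess honest: the caller can then fall back to the shebang, or to whatever language the user specified.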
Although I keep adding more languages, there are still many others out there :D
I've started to add "language inference" to the current development version, and it's pretty useful when highlighting, e.g., an entire directory, or when used with less: http://www.gnu.org/software/src-high...ight-with-less
|