How to parse strings in bash script
Hi Scripting Masters,
This is my first ever post in this forum. Is there an easier way to parse a string? For example, given the string ant-1.3.5-1, I want to parse it into: artifactId=ant version=1.3.5-1 Note: the string may vary in length and value, because it comes from a directory that also contains a lot of other artifacts. So the first thing that comes to mind is a loop that places all the artifacts in an array; inside the loop, each one is parsed and the correct values are assigned to the correct fields. Thanks in advance! |
Per the LQ Rules, please do not post homework assignments verbatim. We're happy to assist if you have specific questions or have hit a stumbling point, however. Let us know what you've already tried and what references you have used (including class notes, books, and Google searches) and we'll do our best to help. Also, keep in mind that your instructor might also be an LQ member.
|
Sorry, posted a message as the mod posted his. Please ignore.
|
Sorry, this is not homework. I'm doing this to migrate my Maven 1 jars to a Maven 2 repository.
This is what I have done so far...
#!/bin/bash
dir=~/.maven/repository
i=1
ctr=1
for file in $dir/*/jars/*.jar
do
    jarfile[$i]=$file
    #echo ${jarfile[$i]}
    len=${#file}
    #replace '/' with a white space
    var=$(echo "${jarfile[$i]}" | tr '/' ' ')
    #echo $var
    #get the groupId
    gtemp=$(echo $var | awk '{print $5}')
    glength=$(echo -n $gtemp | wc -c)
    groupId=$(echo $gtemp | cut -c 1-$glength)
    echo $groupId
    #get the artifactId
    artemp=$(echo $var | awk '{print $7}')
    #echo $artemp
    arlength=$(echo -n $artemp | wc -c)
    artifactId=$(echo $artemp | cut -c 1-$((arlength-4)))
    #echo $artifactId
    #extract the version of the jar from the artifactId
    args=$(echo $artifactId | perl -lne '$c++ while /-/g; END {print $c; }')
    while [ $ctr -le $(expr $args + 1) ]
    do
        temp=$(echo $artifactId | cut -d'-' -f $ctr)
        numseries=$(echo $temp | sed -e 's/^[0-9]//')
        if [ -z $numseries ]
        then
            echo "null numseries"
        else
            if [ $temp != $numseries ]
            then
                echo "$temp is a number!"
            else
                tempartifact[$ctr]=$temp
                echo "artifact: ${tempartifact[$ctr]}"
            fi
        fi
        ctr=$(expr $ctr + 1)
    done
    echo "artifacts: ${tempartifact[*]}"
    #mvn install:install-file -Dfile=${jarfile[$i]} -DgroupId=$groupId -DartifactId=$artifactId -Dversion=$ver -Dpackaging=jar
    i=$(expr $i + 1)
done
|
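The groupId and file name steps in the script above (tr, awk, wc, cut) could probably be done with bash parameter expansion alone. A minimal sketch, assuming the path layout is <repository>/<groupId>/jars/<name>.jar as in the loop above; the function name describe_jar and the sample path are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): pull the groupId and the
# jar's base name out of a Maven 1 style path using only parameter expansion.
# Assumes the layout <repository>/<groupId>/jars/<name>.jar.
describe_jar() {
    local file=$1
    local name=${file##*/}        # strip leading directories
    name=${name%.jar}             # strip the .jar extension
    local group=${file%/jars/*}   # drop the trailing /jars/<file> part
    group=${group##*/}            # keep only the last directory component
    printf 'groupId=%s name=%s\n' "$group" "$name"
}

# Sample path is an assumption, used only to show the call.
describe_jar "/home/user/.maven/repository/ant/jars/ant-1.3.5-1.jar"
```

This avoids counting on a fixed number of path components (the $5 and $7 in the awk calls), so it should keep working if the repository lives at a different depth.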
if you have Python,
Code:
#!/usr/bin/env python |
Hmm, not homework, yet you quote some text explaining the sort of thought process you should adopt when tackling it?? It's a pretty simple one-liner in many forms, potentially just a single bash substitution - http://tldp.org/LDP/abs/html/string-manipulation.html Not sure if that will fit in with what the teacher wants though. :)
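For the ant-1.3.5-1 example from the first post, the "single bash substitution" idea might look like the sketch below. It assumes the artifactId itself contains no dash; the function name split_simple is made up for illustration.

```shell
#!/bin/bash
# Sketch only: splits at the FIRST dash, so it is wrong for names
# such as commons-logging-1.1.8 where the artifactId contains a dash.
split_simple() {
    local s=$1
    local artifactId=${s%%-*}   # remove the longest suffix that starts at a dash
    local version=${s#*-}       # remove the shortest prefix that ends at a dash
    printf 'artifactId=%s version=%s\n' "$artifactId" "$version"
}

split_simple "ant-1.3.5-1"
```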
|
Quote:
|
Quote:
and the versions would also appear like this: jslt-13.4.5-84.1 |
Well ... as long as the "word" part of your artifactid doesn't contain numbers, and the versions have no alpha components it's still trivial.
Code:
[tink:~]$ echo -e "commons-logging-1.1.8\njslt-13.4.5-84.1" | sed -r 's/^([-a-z\.]+)-.+/\1/'
If the conditions above DON'T apply I'd say you're screwed unless you have a dictionary of your IDs, since a parsing by lexicographic rules w/o language knowledge is impossible.
cheers, Tink
|
In other words, both atoms can contain dashes inside of them. Then you are going to have to filter by contents. For anything that's not basic string mangling you are going to have to use something more advanced, like awk or sed.
|
Depending on how each package is named, I am going to assume that the first number encountered and everything after it make up the version.
Code:
#!/usr/bin/env python
Code:
# ./test.py jslt-13.4.5-84.1 |
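The same assumption (the version starts at the first number) can be sketched in bash itself. This is a slight variant, not the Python script from the post: it splits at the first dash that is immediately followed by a digit, and the function name split_at_first_number is made up for illustration.

```shell
#!/bin/bash
# Sketch under an assumption: the version begins at the first
# "-<digit>" in the string; everything before that dash is the artifactId.
split_at_first_number() {
    local s=$1
    # Remove the longest suffix matching "-<digit><anything>".
    local artifactId=${s%%-[0-9]*}
    local version
    if [[ $artifactId == "$s" ]]; then
        version=""                      # no "-<digit>" found: no version part
    else
        version=${s#"$artifactId"-}     # drop "<artifactId>-" from the front
    fi
    printf 'artifactId=%s version=%s\n' "$artifactId" "$version"
}

split_at_first_number "jslt-13.4.5-84.1"
split_at_first_number "commons-logging-1.1.8"
```

As noted earlier in the thread, this breaks down if the name part itself has a component that starts with a digit.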
Thanks, sirs!!! All replies are appreciated. Thanks for all the ideas... I'm just making the code more dynamic...
|
Quote:
Sir, one last question. How about getting the version from the given artifact? What do I need to change in the regex? |
Code:
echo -e "commons-logging-1.1.8\njslt-13.4.5-84.1" | sed -r 's/^([-a-z\.]+)-(.+)/\1 \2/'|while read arti vers;do echo $arti;echo $vers ;echo "";done |
Quote:
Is it possible to use this without the while loop? The output should be like this: artifactId=commons-logging version=1.1.8 |
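One way this might be done without the while loop is to let sed write the labels in its replacement text. A minimal sketch under the same conditions as before (lowercase name part without digits, version starting with a digit); the function name to_fields is made up for illustration, and -E is used in place of -r since recent GNU sed and BSD sed both accept it.

```shell
#!/bin/bash
# Sketch only: reads one artifact string per line on stdin and rewrites
# each line as "artifactId=<name> version=<version>".
# Lines that do not match the pattern pass through unchanged.
to_fields() {
    sed -E 's/^([-a-z.]+)-([0-9].*)$/artifactId=\1 version=\2/'
}

printf '%s\n' commons-logging-1.1.8 jslt-13.4.5-84.1 | to_fields
```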