extract string shell script

sunlinux · 07-12-2020, 08:02 AM

wpeckham · 07-12-2020, 08:15 AM

Cool. And easy. What have you tried so far?

shruggy · 07-12-2020, 08:42 AM

I'd suggest using awk, but any scripting language that provides arrays (e.g. bash) will do it easily.

individual · 07-12-2020, 08:55 AM

There are a lot of ways to accomplish this. A couple have already been suggested, but I would add cut and sort. Check the man pages for each of those.

TB0ne · 07-12-2020, 10:53 AM

Quote:

Originally Posted by sunlinux

This thread has some great tips: https://www.linuxquestions.org/quest...le-4175598104/

...since you asked about extracting strings three years ago. And that thread contains links to several OTHER of your posts over the years, asking for scripts and similar things. Can't you apply what you've been told many times previously, and make it work for you?

You've been here FOURTEEN YEARS now, so you should be well familiar (especially since you've been told many times) with the "Question Guidelines" about doing your own research, and showing your own efforts.

teckk · 07-12-2020, 12:25 PM

Code:

list="
[1] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[2] http://live.bb.com/sys/orgs/org/?id=1342907248
[3] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
"
cut -d "=" -f2 <<< "$list"

1600810147
1342907248
1342908940

Look at:
man sed
man awk
man grep
man cut
man sort

shruggy · 07-12-2020, 12:35 PM

@teckk. That's not enough. As individual suggested above, the output of cut should be fed to sort -u. I'd also use the -s / --only-delimited option to cut here, just in case.
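A minimal sketch of that pipeline, reusing the sample list from teckk's post. The fourth line is a duplicate added here as an assumption, so that sort -u has something to remove, and extract_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Sample data from the post above; the repeated first URL ([4]) is an
# added assumption so that de-duplication has something to remove.
list="
[1] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[2] http://live.bb.com/sys/orgs/org/?id=1342907248
[3] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
[4] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
"

# extract_ids (hypothetical name): read lines on stdin, print unique ids.
# -s drops lines that contain no "=" (here: the blank lines),
# and sort -u removes the duplicate id.
extract_ids() {
    cut -s -d '=' -f2 | sort -u
}

printf '%s\n' "$list" | extract_ids
```

This prints the three distinct ids in sorted order. Note that it does not filter on jsp, so ids from all URLs are included.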

Besides,

Quote:

Originally Posted by sunlinux

I want to extract only values after jsp?id=

Please note that not all URLs from the top post include jsp, but this may be a misrepresentation on the part of OP.

So, rather than repeat your command here, the same idea using fex:

Code:

fex '//\.jsp\?id/=2' <<<"$list"|sort -u

MadeInGermany · 07-13-2020, 12:07 AM

With awk, split by search pattern and print the RHS if not yet seen

Code:

awk -F'[.]jsp[?]id=' 'NF>=2 && !($2 in s) { s[$2]; print $2 }'
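As a rough usage sketch of the same idea: the sample input below is hypothetical (one jsp URL repeated, one URL without .jsp), and first_seen_ids is a made-up wrapper name. The field separator is written with bracket expressions so that "." and "?" stay literal regardless of how a given awk handles backslashes in -F.

```shell
#!/bin/sh
# Hypothetical sample input: a duplicated jsp id and one URL without ".jsp".
sample='[1] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[2] http://live.bb.com/sys/orgs/org/?id=1342907248
[3] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147'

# first_seen_ids (hypothetical name): split each line on the literal text
# ".jsp?id=" and print the right-hand side the first time it appears.
first_seen_ids() {
    awk -F'[.]jsp[?]id=' 'NF >= 2 && !($2 in seen) { seen[$2]; print $2 }'
}

printf '%s\n' "$sample" | first_seen_ids
```

Only 1600810147 is printed, once. The second line never matches the separator, so it has a single field and is skipped. Unlike sort -u, this keeps the ids in first-seen order.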

shruggy · 07-13-2020, 01:49 AM

@MadeInGermany. Nice.

To sum it up,

Bash:

Code:

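#!/bin/bash
# Split each line at the first "="; keep ids whose URL part ends in ".jsp?id".
# a is an indexed (sparse) array keyed by id, which assumes numeric ids.
while IFS='=' read -r url id
do [[ $url == *.jsp\?id ]] && [[ -n $id ]] && a["$id"]=
done <"$file"
printf %s\\n "${!a[@]}"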

POSIX shell:

Code:

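#!/bin/sh
# Use the positional parameters as the list of ids seen so far.
set --
while IFS='=' read -r url id
do case $url in *.jsp\?id)
  [ -n "$id" ] && {
    new=true
    # -eq compares numerically, so this assumes the ids are integers.
    for i
    do [ "$i" -eq "$id" ] && { new=false; break;}
    done
    $new && set -- "$id" "$@"
  }
  esac
done <"$file"
# Join the collected ids with newlines.
IFS='
'; echo "$*"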

pan64 · 07-13-2020, 02:07 AM

Code:

sort -t= -u -k2n "$file" | grep -oP '(?<=jsp\?id=)\d*$'

but post #8 is probably better

shruggy · 07-13-2020, 02:21 AM

Why not the other way round? Seems a bit easier to me:

Code:

grep -Po '\.jsp\?id=\K\d+' "$file"|sort -u
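The -P option is a GNU grep extension and not part of POSIX grep, so for systems without it, here is a rough equivalent using sed instead (a swapped-in technique, not what was posted above). jsp_ids_portable is a hypothetical name, and the second and third sample URLs are only illustrative.

```shell
#!/bin/sh
# jsp_ids_portable (hypothetical name): print the unique digits that follow
# ".jsp?id=" using only POSIX sed and sort. In a basic regular expression
# "?" is an ordinary character, so only the dot needs escaping.
jsp_ids_portable() {
    sed -n 's/.*\.jsp?id=\([0-9][0-9]*\).*/\1/p' | sort -u
}

printf '%s\n' \
    'http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147' \
    'http://live.bb.com/sys/orgs/org/?id=1342907248' \
    'http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147' |
    jsp_ids_portable
```

One difference from grep -o: sed emits at most one id per line (the last match on the line), which is fine as long as each line holds a single URL.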

grail · 07-13-2020, 03:23 AM

If we can assume the data is as presented:

Code:

awk -F= '!_[$2]++{print $2}' file

shruggy · 07-13-2020, 03:37 AM

As I said in #7, not sure if the OP really meant what they said, but if yes then this should be

Code:

awk -F= '$1~/[.]jsp[?]id$/ && !_[$2]++ {print $2}' file

Skaperen · 07-14-2020, 07:18 PM

Since this is a programming forum, I assume you are coding this rather than looking for a command line to do it. Which language are you using? awk? bash? C? C++? Go? Java? Lua? Perl? Python? Rust? Something else? And what do you want to happen to lines without "jsp" (such as line 2)?

rnturn · 07-15-2020, 06:12 AM

Quote:

Originally Posted by sunlinux

Code:

cat file | grep '\.jsp' | cut -d= -f2 | sort | uniq

returns

Code: