LinuxQuestions.org
Old 07-12-2020, 08:02 AM   #1
sunlinux
Member
 
Registered: Feb 2006
Distribution: RHCL 5
Posts: 239

Rep: Reputation: 30
extract string shell script


A file contains the entries below:

[1] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[2] http://live.bb.com/sys/orgs/org/?id=1342907248
[3] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
[4] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[5] http://live.bb.com/sys/orgs/org/?id=1342907248
[6] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
[7] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810122
[8] http://live.bb.com/sys/orgs/org/?id=1342907248
[9] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
[10] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810122
[11] http://live.bb.com/sys/orgs/org/?id=1342907248
[12] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
[13] http://live.bb.com/sys/diag/viewrun.jsp?id=1600809868
[14] http://live.bb.com/sys/orgs/org/?id=1342907248
[15] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
[16] http://live.bb.com/sys/diag/viewrun.jsp?id=1600809955
[17] http://live.bb.com/sys/orgs/org/?id=1342907248

I want to extract only the values after jsp?id=. For example, from 'jsp?id=1600809955' I want to extract 1600809955 and print it. Any duplicate entries should be removed when printing.
 
Old 07-12-2020, 08:15 AM   #2
wpeckham
Senior Member
 
Registered: Apr 2010
Location: Continental USA
Distribution: Debian, Ubuntu, Fedora, RedHat, DSL, Puppy, CentOS, Knoppix, Mint-DE, Sparky, Vsido, tinycore, Q4OS
Posts: 3,432

Rep: Reputation: 1496
Cool. And easy. What have you tried so far?
 
2 members found this post helpful.
Old 07-12-2020, 08:42 AM   #3
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 1,161

Rep: Reputation: Disabled
I'd suggest using awk, but any scripting language that provides arrays (e.g. bash) will do it easily.
 
1 member found this post helpful.
Old 07-12-2020, 08:55 AM   #4
individual
Member
 
Registered: Jul 2018
Posts: 292
Blog Entries: 1

Rep: Reputation: 223
There are a lot of ways to accomplish this. A couple have already been suggested, but I would add cut and sort. Check the man pages for each of those.
 
Old 07-12-2020, 10:53 AM   #5
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 23,132

Rep: Reputation: 6464
Quote:
Originally Posted by sunlinux View Post
a file contains below entries


I want to extract only values after jsp?id= for example from 'jsp?id=1600809955' I want to extract 1600809955 and print it, if any duplicate entry remove while printing
This thread has some great tips: https://www.linuxquestions.org/quest...le-4175598104/

...since you asked about extracting strings three years ago. And that thread contains links to several of your OTHER posts over the years, asking for scripts and similar things. Can't you apply what you've been told many times previously, and make it work for you?

You've been here FOURTEEN YEARS now, so you should be well familiar (especially since you've been told many times) with the "Question Guidelines" about doing your own research, and showing your own efforts.
 
Old 07-12-2020, 12:25 PM   #6
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 2,996

Rep: Reputation: 823
Code:
list="
[1] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[2] http://live.bb.com/sys/orgs/org/?id=1342907248
[3] http://live.bb.com/sys/orgs/org/acc/?id=1342908940
"
cut -d "=" -f2 <<< "$list"

1600810147
1342907248
1342908940
Look at:
man sed
man awk
man grep
man cut
man sort
 
1 member found this post helpful.
Old 07-12-2020, 12:35 PM   #7
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 1,161

Rep: Reputation: Disabled
@teckk. That's not enough. As individual suggested above, the output of cut should be fed to sort -u. I'd also use the -s / --only-delimited option to cut here, just in case.
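A minimal sketch of that suggestion, assuming a small sample that includes one line without any `=` (the variable names `list` and `ids` are only for illustration). It shows what `-s` changes: the delimiter-less line is dropped instead of being passed through, and `sort -u` collapses the repeated id. Note that this still keeps ids from URLs without `.jsp`.

```shell
# Sample input: one line without '=' plus three URL lines (one id repeated).
list='a file contains below entries
[1] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[2] http://live.bb.com/sys/orgs/org/?id=1342907248
[4] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147'

# Without -s, cut would print the first line unchanged because it has no delimiter.
# With -s that line is suppressed; sort -u then removes the duplicate id.
ids=$(printf '%s\n' "$list" | cut -s -d= -f2 | sort -u)
printf '%s\n' "$ids"
```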

Besides,
Quote:
Originally Posted by sunlinux View Post
I want to extract only values after jsp?id=
Please note that not all URLs from the top post include jsp, but this may be a misrepresentation on the part of OP.

So, rather than repeat your version, here is the same using fex:
Code:
fex '//\.jsp\?id/=2' <<<"$list"|sort -u

Last edited by shruggy; 07-13-2020 at 01:36 AM.
 
Old 07-13-2020, 12:07 AM   #8
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 1,460

Rep: Reputation: 666
With awk, split each line on the search pattern and print the right-hand side if it has not been seen yet:
Code:
awk -F'[.]jsp[?]id=' 'NF>=2 && !($2 in s) { s[$2]; print $2 }'
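To see how the field split behaves, here is a small sketch run on four of the sample lines. The variable names `sample` and `unique_ids` are made up for the example, and the separator is written with bracket expressions, which is an assumption about portability across awk implementations rather than something the thread states.

```shell
sample='[1] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[2] http://live.bb.com/sys/orgs/org/?id=1342907248
[4] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810147
[7] http://live.bb.com/sys/diag/viewrun.jsp?id=1600810122'

# Lines containing ".jsp?id=" split into two fields; all others stay at NF == 1.
# s[] records the ids already printed, so each id appears once, in first-seen order.
unique_ids=$(printf '%s\n' "$sample" |
  awk -F'[.]jsp[?]id=' 'NF >= 2 && !($2 in s) { s[$2]; print $2 }')
printf '%s\n' "$unique_ids"
```

Unlike the `sort -u` variants, this keeps the ids in the order they first appear in the file.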

Last edited by MadeInGermany; 07-13-2020 at 12:10 AM.
 
1 member found this post helpful.
Old 07-13-2020, 01:49 AM   #9
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 1,161

Rep: Reputation: Disabled
@MadeInGermany. Nice.

To sum it up,

Bash:
Code:
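#!/bin/bash
# $file must hold the path of the input file.
# Split each line at the first '='; the id becomes an array index,
# so duplicates collapse onto the same element.
while IFS='=' read -r url id
do [[ $url == *.jsp\?id ]] && [[ -n $id ]] && a["$id"]=
done <"$file"
# Print the indices (the unique ids), one per line
printf %s\\n "${!a[@]}"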
POSIX shell:
Code:
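#!/bin/sh
# No arrays in POSIX sh, so the positional parameters hold the ids seen so far.
set --
while IFS='=' read -r url id
do case $url in *.jsp\?id)
  [ -n "$id" ] && {
    new=true
    # "for i" without a list iterates over "$@"
    for i
    do [ "$i" -eq "$id" ] && { new=false; break;}
    done
    $new && set -- "$id" "$@"
  }
  esac
done <"$file"
# Join the collected ids with newlines
IFS='
'; echo "$*"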
 
Old 07-13-2020, 02:07 AM   #10
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 15,158

Rep: Reputation: 4984
Code:
sort -t= -u -k2n "$file" | grep -oP '(?<=jsp\?id=)\d*$'
but post #8 is probably better
 
Old 07-13-2020, 02:21 AM   #11
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 1,161

Rep: Reputation: Disabled
Why not the other way round? Seems a bit easier to me:
Code:
grep -Po '\.jsp\?id=\K\d+' "$file"|sort -u
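The `-P` option relies on PCRE support in grep, which may not be available everywhere. A rough equivalent with sed, offered only as a sketch: it assumes the ids are purely numeric, as in the sample data, and `extract_ids` is a hypothetical helper name.

```shell
# Print the digits that follow ".jsp?id=" on each matching line, deduplicated.
# In a basic regular expression '?' is an ordinary character, so only the dot is escaped.
extract_ids() {
  sed -n 's/.*\.jsp?id=\([0-9][0-9]*\).*/\1/p' "$1" | sort -u
}
```

It would be called as `extract_ids "$file"`; lines without `.jsp?id=` produce no output because of `-n` together with the `p` flag.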

Last edited by shruggy; 07-13-2020 at 02:57 AM.
 
Old 07-13-2020, 03:23 AM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,796

Rep: Reputation: 3065
If we can assume the data is as presented:
Code:
awk -F= '!_[$2]++{print $2}' file
 
2 members found this post helpful.
Old 07-13-2020, 03:37 AM   #13
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 1,161

Rep: Reputation: Disabled
As I said in #7, not sure if the OP really meant what they said, but if yes then this should be
Code:
awk -F= '$1~/[.]jsp[?]id$/ && !_[$2]++, $0=$2' file
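The comma in that one-liner makes it a range pattern: the range opens on a line whose first field ends in `.jsp?id` and whose id is new, and the end expression `$0=$2` both replaces the record with the id and evaluates as true for a non-empty, non-zero id, so the range closes on the same line and the default action prints it. A more explicit form that should behave the same on this data is sketched below; `jsp_ids` and `seen` are names chosen for the example.

```shell
# Print each id that follows ".jsp?id=" once, in first-seen order.
jsp_ids() {
  awk -F= '$1 ~ /[.]jsp[?]id$/ && !seen[$2]++ { print $2 }' "$1"
}
```

Usage would be `jsp_ids file`. This version does not depend on the id being non-zero, which the range-pattern trick does.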
 
Old 07-14-2020, 07:18 PM   #14
Skaperen
Senior Member
 
Registered: May 2009
Location: WV, USA
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386
Posts: 2,126
Blog Entries: 20

Rep: Reputation: 150
Since this is a programming forum, I assume you are coding this rather than looking for a command line to do it. Which language are you using? awk? bash? C? C++? Go? Java? Lua? Perl? Python? Rust? Something else? And what do you want to happen to lines without "jsp" (such as line 2)?
 
Old 07-15-2020, 06:12 AM   #15
rnturn
Senior Member
 
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: Currently: openSUSE, Raspbian, Slackware. Formerly: CentOS, MacOS, Red Hat. Other: Solaris, Tru64
Posts: 2,073

Rep: Reputation: 341
Quote:
Originally Posted by sunlinux View Post
a file contains below entries


I want to extract only values after jsp?id= for example from 'jsp?id=1600809955' I want to extract 1600809955 and print it, if any duplicate entry remove while printing
Code:
grep '\.jsp' file | cut -d= -f2 | sort -u
returns
Code:
1600809868
1600809955
1600810122
1600810147
 
  

