[SOLVED] bash remove random text from command output
8:15am random text here 99629360 random text here
9:00am random text here 99799779 random text here
10:00am random text here 99102831 random text here
11:45am random text here 99629320 random text here
12:30pm random text here 96678497 random text here
2:30pm random text here 99762314 random text here
3:00pm random text here 99833711 random text here
5:15pm random text here 99305212 random text here
6:00pm random text here 96500528 random text here
7:00pm random text here 99711372 random text here
This is 2 columns. One shows a time and the other some random text and some random numbers.
I need to remove all random text from the second column. I need to be left with the first column showing times and the second column showing only numbers. No random text.
Using bash and the cut command? (man cut)
There is also some pattern matching ability within bash that might serve. (man bash)
Or, if you want to combine your bash with something other than cut or the internal pattern matching and string handling:
With a simple Perl script?
I would not use AWK/NAWK, but only because I like Perl better. It is certainly up to the task.
What exactly are your limits, and what are the characteristics of the text you are calling random?
I seriously doubt if the text is random, as generating truly random text would be seriously challenging.
Does any of the random text that you want stripped contain digits? If not, there is a clue as to how to craft your pattern matching to pick out the desired fields for display!
What exactly have you tried so far? How did that work for you?
OK, let's not focus on removing the random text then.
The similarities in the numbers are that they all start with 99, 96 or 95.
Would this help in keeping just the numbers, instead of focusing on removing the random text?
Also the numbers are always 8 digits.
As I say, concentrate on what you need, not what you don't need. How good are your regex skills? Do you understand back-references? For example, this will select only 8 consecutive digits; the same regex will work in sed.
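The command that post refers to is not preserved in the thread. As a rough sketch of the same idea (not the poster's actual command), the two hypothetical helpers below assume GNU grep and a sed that accepts `-E`: `only_numbers` selects standalone runs of exactly 8 digits, and `time_and_number` uses back-references to keep the first field plus that number.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; the names are not from the thread.

# Print every standalone 8-digit number found on stdin, one per line.
# -o prints only the match, -w requires it to be a whole word.
only_numbers() {
  grep -owE '[0-9]{8}'
}

# Keep the first field (the time) and the 8-digit number, drop the rest.
# \1 is the time, \3 is the number; lines without a match are not printed.
time_and_number() {
  sed -nE 's/^([^ ]+) (.* )?([0-9]{8})( .*)?$/\1 \3/p'
}

printf '%s\n' '8:15am random text here 99629360 random text here' | time_and_number
# prints: 8:15am 99629360
```

Both helpers assume the surrounding text never contains another 8-digit run of its own; if it can, the pattern would need the 99/96/95 prefix as an extra constraint.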
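while read -r f
do
  # show the original line
  echo "$f"
  # keep only the first six space-delimited fields
  echo "$f" | cut -d" " -f1-6
done < data
This works provided every line has the same layout as the sample data.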
7:00pm random text here 99711372 random text here
7:00pm random text here 99711372
I highly doubt that the random text referred to by the OP are the three actual words "random text here" which wouldn't really be random. The number of fields in actual random text is unknown.
Do you always take things out of context to try to prove a point, without paying attention to detail or even trying to?
You should not take what I did (or what anyone did, for that matter) out of context to try to prove your point; that is being dishonest by obscuring the truth of the matter. You removed my final statement and the beginning of the OP's statement. My final statement was "if that is what you are looking for." Which clearly means what?
And the OP clearly stated, "I need to remove all random text from the second column."
Yes, one then has to guess what the second column really is. Since the key words here are "random text" and "second column", that is where I'd start: at the second "random", removing everything from there on, because the OP also used the word "all" at the start of the sentence.
What does "remove all" mean in conjunction with the rest of the sentence, "random text from the second column"?
The guessing part is whether it is really random text.
If it means what you are trying to imply, then a less confusing sentence would be: "I need to just remove the second word random from the string." That is no doubt more explicit.
If the OP actually means just removing the second word "random" from the string, then he or she really needs to work on their English as well as on Linux. That part is conjecture, and the person in question is not here to comment on it.
I tend to be a reluctant user of "pure" bash, but further testing reveals that its regex matching and BASH_REMATCH[] actually work very well here, without the need for any external program like sed/awk/perl.
Off the top of my head, here is one solution I came up with.
Code:
while read line ; do if [[ $line =~ ^([^ ]+)[[:space:]].*[[:space:]]([0-9]+)[[:space:]].* ]] ; then echo ${BASH_REMATCH[1]}" "${BASH_REMATCH[2}} ; fi ; done < your.file
So you do not really need to remove the random text, but need to pull the eight-digit numbers out of the complete string?
Or do you now want to remove everything but the numbers?
Or do you just need to remove everything after the 8-digit numbers?
As I stated, if your data always has the same layout, then cut as I used it will always work.
What are the actual criteria that someone might have given you for completing this task?
Verbatim, please.
--- mod: I am only now seeing what others have (just?) posted while I was writing this. Let me go see what they think you are trying to say. ---
That BASH_REMATCH one-liner gives me this:
Code:
$ ./bashremove
./bashremove: line 1: ${BASH_REMATCH[1]}" "${BASH_REMATCH[2}}: bad substitution
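The "bad substitution" error comes from the mismatched `${BASH_REMATCH[2}}` in the one-liner; it should read `${BASH_REMATCH[2]}`. A corrected sketch follows, wrapped in a function (the name `extract_time_number` is made up for illustration) that reads standard input instead of `your.file`. The regex itself is unchanged, so it still assumes there is text on both sides of the number.

```shell
#!/bin/bash
# Hypothetical wrapper around the corrected one-liner; reads lines from stdin.
extract_time_number() {
  local line
  # Same pattern as the original post, kept in a variable for readability:
  # group 1 = first field (the time), group 2 = an all-digit field.
  local re='^([^ ]+)[[:space:]].*[[:space:]]([0-9]+)[[:space:]].*'
  while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
      # Note the closing "]}" on both expansions.
      echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
    fi
  done
}

printf '%s\n' '7:00pm random text here 99711372 random text here' | extract_time_number
# prints: 7:00pm 99711372
```

To run it against a file, redirect the file into the function, for example `extract_time_number < your.file`.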
This removes the second occurrence of the word "random" from each string.
Code:
$ sed 's@random@@2' data
8:15am random text here 99629360 text here
9:00am random text here 99799779 text here
10:00am random text here 99102831 text here
11:45am random text here 99629320 text here
12:30pm random text here 96678497 text here
2:30pm random text here 99762314 text here
3:00pm random text here 99833711 text here
5:15pm random text here 99305212 text here
6:00pm random text here 96500528 text here
7:00pm random text here 99711372 text here
Again, what exactly are the results you are looking for?
Just keeping the 8-digit numbers, or removing everything starting from the 8 digits? And where exactly does the second column start?
Code:
column
row 0 1 2 3 4 5 6 7 8
1 7:00pm random text here 99711372 random text here
2
Is that a correct assessment?
Try showing us the final product you are looking for, so we all know what we are trying to produce. A picture is worth a thousand words.
I really didn't mean to cause a whole discussion over this. I thought my question was clear, but it seems I was wrong. When I said "random text here" I didn't actually mean that the words "random text here" were in the command output. Random text could be anything. It could be a name, place or food. Sorry if I wasn't clear.
The point is, like I originally said, I need to be left with the first column showing the times and the second column showing only numbers. So no need to touch the first column. I just want to modify the second column so that it only shows the numbers. I have to remove all the text, only from the second column.
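Given that clarification, one way to approach it is to keep the first field and then look for the one field that looks like the number, whatever the surrounding words are. The sketch below is only an illustration: the function name `keep_time_and_number` and the sample words are made up, and it assumes the number is a separate whitespace-delimited field of 8 digits starting with 99, 96 or 95, as described earlier in the thread.

```shell
#!/bin/bash
# Hypothetical helper: print the time (field 1) and the first field that is
# exactly 8 digits long and starts with 99, 96 or 95.
keep_time_and_number() {
  awk '{
    for (i = 2; i <= NF; i++) {
      # length() check instead of {8} so older awk versions also work
      if (length($i) == 8 && $i ~ /^(99|96|95)[0-9]+$/) {
        print $1, $i
        break
      }
    }
  }'
}

# Made-up sample lines standing in for the real command output.
keep_time_and_number <<'EOF'
8:15am Alice Paris 99629360 pizza
12:30pm some other words 96678497 more words here
EOF
# prints:
# 8:15am 99629360
# 12:30pm 96678497
```

Because it tests each field rather than counting positions, this does not depend on how many words appear before or after the number.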
This is how you show an example of what you're looking for.
example #1
Code:
#before
7:00pm random text here 99711372 random text here
#after
7:00pm 99711372
That removes all random text within the string.
Is this what you want?
Or this?
example #2
Code:
#before
7:00pm random text here 99711372 random text here
#after
7:00pm 99711372 random text here
#!/bin/bash
#array of strings
data=(
"8:15am random text here 99629360 random text here"
"9:00am random text here 99799779 random text here"
"10:00am random text here 99102831 random text here"
"11:45am random text here 99629320 random text here"
"12:30pm random text here 96678497 random text here"
"2:30pm random text here 99762314 random text here"
"3:00pm random text here 99833711 random text here"
"5:15pm random text here 99305212 random text here"
"6:00pm random text here 96500528 random text here"
)
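for ((i = 0; i < ${#data[@]}; i++))
do
  # Attempt 1: delete every letter; note this also strips am/pm from the time.
  part1=$( echo "${data[$i]}" | sed 's/[A-Za-z]*//g' | fmt -u )
  # Attempt 2: same deletion, then print the two fields that remain.
  part2=$( echo "${data[$i]}" | sed 's/[A-Za-z]*//g' | awk '{print $1 " " $2}' )
  echo "p1 $part1"
  echo "p2 $part2"
  echo
  # Split the string to keep the am or pm on the leading part of the string.
  part3=${data[$i]%% *}
  # Drop the time field first so its digits do not end up in the number.
  part4=$( echo "${data[$i]#* }" | sed 's/[^0-9]*//g' )
  echo
  echo "p3 $part3"
  echo "p4 $part4"
  echo "
final product is:
$part3 $part4
"
done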