Extracting text at a dynamic location
Consider a file that is structured like this
Code:
Code:
1. Locate the last task listed in * tail section
I'm curious whether there's a simple sed/awk solution to this. If not, I will turn to Python, but simpler is better. |
This sounds familiar... https://www.linuxquestions.org/questions/programming-9/awk-sed-bash-script-to-display-last-changelog-entry-4175716978
That's a very similar problem, so yes, Awk can do this too, and you should be able to apply what you've learned from that thread. Here are some hints: use "\n\* " as a record separator, "\n\*\* " as a field separator, and "$NF" to refer to the last field in a record. Using that information, have a go yourself, and if you get stuck, show your efforts. |
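For anyone following along, here is a rough sketch of how those hints could fit together. The sample file, its "head"/"tail" headings and the task names are all made up, since the real file layout isn't shown here, and the `last_task` function name is only for illustration. A multi-character (regex) RS is an extension supported by gawk and mawk rather than something POSIX guarantees.

```shell
# Hypothetical sample file; the headings and task names are invented for this sketch.
sample=$(mktemp)
printf '%s\n' '* head' '** task one' '** task two' \
              '* tail' '** task three' '** task four' > "$sample"

# Print the last "** " entry of the section whose heading is "tail".
last_task() {
    awk -v RS='\n\\* ' -v FS='\n\\*\\* ' '
        $1 == "tail" {
            last = $NF
            sub(/\n+$/, "", last)   # the final record can keep the trailing newline of the file
            print last
        }' "$1"
}

last_task "$sample"
```

Selecting the record with `$1 == "tail"` only works when the heading is exactly that text, so it would need adjusting for a real file.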
Thanks boughtonp! A little update: I need to access the last line of a field that has multiple lines,
like so:
Quote:
Quote:
|
There are a few ways to access the last line of a variable; perhaps the simplest is to split on newlines and access the last element of the resulting array.
Awk's split works slightly differently from other languages: it fills the array you pass in and returns the number of elements.
Code:
my_array_len = split(input_string,my_array,"\n");
(If there's a trailing newline in the input, a -1 could be added to counter that.) |
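As a small illustration of the split approach, something along these lines should work. The string content and the `last_line_demo` name are made up for the example; the check for an empty final element is one way of handling the trailing-newline case mentioned above.

```shell
last_line_demo() {
    awk 'BEGIN {
        s = "first line\nsecond line\nlast line\n"   # hypothetical field content with a trailing newline
        n = split(s, lines, "\n")                     # n counts the pieces; the final piece is empty here
        if (n > 1 && lines[n] == "") n--              # skip the empty piece left by the trailing newline
        print lines[n]
    }'
}

last_line_demo
```

Because split returns the element count, there is no need for a separate call to length() on the array.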
Or pipe it to tail -1
Code:
awk ... | tail -1
|
Thank you boughtonp! That code will save me a gratuitous call to length(), as in my code:
Code:
18:25:20 ~ -1- $ awk -v RS='\n\\* ' -v FS='\n\\*\\* ' 'NR==3 {fieldno=NF-3; split($fieldno,A,"\n"); print A[length(A)] }' ~/NOTES/LOG/TASKS/nouvelle-vm-dns.flow
|
The code is now in its own file
Code:
The problem is that the task variable contains the special characters "[" and "]", so the condition if ($i ~ task) is never met. With the other test, if ($i ~ "learning to use getopts"), I get a match:
Code:
12:19:53 ~/CODE/TMP -2- $ ./awk ~/NOTES/LOG/TASKS/nouvelle-vm-dns.flow
Code:
12:33:22 ~/CODE/TMP -2- $ ./awk ~/NOTES/LOG/TASKS/nouvelle-vm-dns.flow
|
To do a non-regex find, use index(haystack,needle): it returns the position of the match (1 if the haystack starts with the needle) or 0 if there is no match.
But it can be useful to convert a string to a regex pattern, by adding backslashes where required:
Code:
gsub(/[$^*()+\[\]{}.?\\|]/,"\\\\&",text);
|
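To make the difference concrete, here is a small sketch comparing the three approaches on one string. The log line, the search text and the `match_demo` name are invented for the example; the gsub call is the one quoted above, applied to a copy of the search text.

```shell
match_demo() {
    awk 'BEGIN {
        line = "** [dns] learning to use getopts"   # hypothetical log line
        task = "[dns] learning"                      # hypothetical search text containing brackets

        # Used as a dynamic regex, "[dns]" is a bracket expression (one of d, n or s),
        # so the literal brackets in the line are not matched.
        print "regex:   " ((line ~ task) ? "match" : "no match")

        # index() compares plain text and returns 0 when the needle is absent.
        print "index:   " index(line, task)

        # Escaping the regex metacharacters once makes the regex match literally.
        pattern = task
        gsub(/[$^*()+\[\]{}.?\\|]/, "\\\\&", pattern)
        print "escaped: " ((line ~ pattern) ? "match" : "no match")
    }'
}

match_demo
```

If all you need is a yes/no answer for a fixed piece of text, index() is the simpler choice; the escaping route is mainly useful when the text has to be embedded in a larger pattern.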
gsub didn't work:
Code:
NR==4 {
but index did.
Code:
|
The "text" bit was intended to be generic - in your context you'd want something like:
Code:
... |
Oops, you were right ^^', I didn't check the name of the variable.
But there's something intriguing: if I use the original variable tasks I get a very big load of backslashes printed out (see https://i.imgur.com/mgoKUtJ.png).
Code:
NR==4 {
Code:
NR==4 {
|
Code:
gsub(/[$^*()+\[\]{}.?\\|]/,"\\\\&",task);
|
Quote:
Unless there's some documented reason for it, you should check which implementation and version of Awk you're using and probably raise it as a bug.
Edit: I was distracted earlier. It's because the gsub is occurring inside a loop, and each iteration adds more backslashes by escaping the ones already there. If used, it should be done prior to the loop.
Quote:
I was a little surprised it wasn't already a built-in function. |
Oh my dear :eek:
Go for the index() function! |
Quote:
Also, I wasn't paying attention earlier: the excess backslashes are due to the replace being performed inside a loop; if used, it needs to be done only once (which is why resetting the variable immediately prior hid the issue). |
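A quick way to see the effect is to run the same gsub both ways. This is only a sketch: the search text, the three comma-separated fields and the `escape_demo` name are made up, and the real script's loop isn't shown here.

```shell
escape_demo() {
    awk 'BEGIN {
        task = "[dns]"                                    # hypothetical search text
        n = split("a [dns] b,c [dns] d,e", fields, ",")   # hypothetical fields to scan

        # Escaping inside the loop re-escapes the backslashes added by earlier passes.
        bad = task
        for (i = 1; i <= n; i++)
            gsub(/[$^*()+\[\]{}.?\\|]/, "\\\\&", bad)
        print "inside loop: " bad

        # Escaping once, before the loop, leaves a usable pattern.
        good = task
        gsub(/[$^*()+\[\]{}.?\\|]/, "\\\\&", good)
        hits = 0
        for (i = 1; i <= n; i++)
            if (fields[i] ~ good) hits++
        print "before loop: " good " (" hits " of " n " fields match)"
    }'
}

escape_demo
```

With three passes the bracket ends up preceded by seven backslashes instead of one, which is the pile-up seen in the screenshot.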