Depending on the surrounding content you may need to make the expressions a bit more specific.
LQ does not operate like a help desk where you get complete solutions, so it is better if you try to solve a problem yourself and then ask for help when you get stuck, showing us what you have done first. This also tells others what your language preferences are and how you have framed the task in your own mind, which can help produce better answers for your case.
Please review the Site FAQ for guidance in posting your questions and general forum usage.
Good luck!
Last edited by astrogeek; 10-01-2018 at 06:43 PM.
Reason: better grammar, typos
I have tried getting the line numbers where I get a match for "Cookies Consent Notice",
then using awk to extract the two line numbers where the match occurs, and then removing
the lines between those two line numbers.
You need to be very specific (and very correct) when asking for help. astrogeek gave an answer to what you asked, but probably not what you meant.
As for your own solution, it appears you are confusing regex patterns with address specifiers. Why are you using a regex at all in that awk? Just print $1.
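For what it's worth, the line-number approach described above could be sketched roughly as follows. This is only an illustration, not the original poster's code: the sample file contents and the variable names are made up, and it assumes the phrase appears on exactly two lines (the start and end comments) with at least one line between them. Since grep -n prefixes each match with its line number and a colon, setting the awk field separator to ":" makes $1 the line number, with no regex needed inside awk.

```shell
#!/bin/sh
# Hypothetical sample file; the contents are made up for illustration.
file=$(mktemp)
cat > "$file" <<'EOF'
<HTML>
<!-- Cookies Consent Notice (Production CDN, www.test.com, en-US) start -->
<script type="text/javascript">
function OptanonWrapper() { }
</script>
<!-- Cookies Consent Notice (Production CDN, www.test.com, en-US) end -->
</HTML>
EOF

# grep -n prints "LINE:text"; with ':' as the field separator, $1 is the line number.
start=$(grep -n 'Cookies Consent Notice' "$file" | awk -F: 'NR == 1 { print $1 }')
end=$(grep -n 'Cookies Consent Notice' "$file" | awk -F: 'NR == 2 { print $1 }')

# Delete only the lines strictly between the two matches.
result=$(sed "$((start + 1)),$((end - 1))d" "$file")
printf '%s\n' "$result"
rm -f "$file"
```

As the replies below point out, matching on the comments directly avoids the intermediate line numbers altogether.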
Thanks for the response, but I am now confused about what your ultimate goal is.
Per your original question I think that you are trying to strip the 'Consent Notice' script lines from an HTML document.
If that is the case I would not use the line numbers at all as it is at least an unnecessary step and confusing as it may obscure what you are really trying to accomplish. That said, if the line numbers are important for some other reason then you should tell us what that is.
It is also unclear in your original question whether you want to remove the 'Consent Notice' lines themselves, or strictly the lines between them. My earlier example removes only the lines between, but can be easily modified to remove those lines as well.
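The earlier example is not reproduced in this text, so the following is only a sketch of how the two variants might look with sed range addresses, not necessarily the expressions originally posted. The function names and the two patterns are assumptions based on the sample comments in this thread.

```shell
#!/bin/sh
# Both patterns are assumptions based on the sample comments in this thread.
start='Cookies Consent Notice .* start -->'
end='Cookies Consent Notice .* end -->'

# Variant 1: keep the two marker comments, delete only the lines between them.
# Inside the range, every line that is not itself a marker comment is deleted.
strip_between() {
    sed "/$start/,/$end/{/Cookies Consent Notice/!d;}" "$@"
}

# Variant 2: delete the marker comments as well as everything between them.
strip_inclusive() {
    sed "/$start/,/$end/d" "$@"
}
```

Either function reads the files named on its command line, or standard input if none are given, for example `strip_between page.html > cleaned.html` (both file names are hypothetical).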
One effective way to ask text processing questions is to provide an example input file, which you have done, and show what you would expect the actual result to look like.
Again, using your original question as the guide, this would be a possible example...
Code:
Source file looks like this...
<HTML>
<Other stuff> Goes here...
<!-- Cookies Consent Notice (Production CDN, www.test.com, en-US) start -->
<script src="https://cdn.cookielaw.org/consent/bcd12cae-aa3c-4ffb-ac51-8d41462cdcb4.js" type="text/javascript" charset="UTF-8"></script>
<script type="text/javascript">
function OptanonWrapper() { }
</script>
<!-- Cookies Consent Notice (Production CDN, www.test.com, en-US) end -->
</Other stuff>
</HTML>
Code:
Result output should look like this...
<HTML>
<Other stuff> Goes here...
<!-- Cookies Consent Notice (Production CDN, www.test.com, en-US) start -->
<!-- Cookies Consent Notice (Production CDN, www.test.com, en-US) end -->
</Other stuff>
</HTML>
Is that what you expect?
Last edited by astrogeek; 10-01-2018 at 08:39 PM.
Reason: typo(s)
Use the correct tool for the job.
sed, awk and co. are not well suited to HTML and XML.
I would use something like xmlstarlet to correctly identify the script element (maybe through "src=*cdn.cookielaw.org*" or some such) and remove it.
But I don't know what the subsequent OptanonWrapper() is about. I know nothing about JavaScript.
Quote:
One effective way to ask text processing questions is to provide an example input file, which you have done, and show what you would expect the actual result to look like.
I was wondering about this at first, and then I did "View Page Source." That is the HTML source referenced in the OP.
If astrogeek accurately described what you're trying to do, this little script does it:
Code:
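#!/usr/bin/awk -f
# End marker: turn printing back on (checked first, so the marker itself is printed).
/<!-- Cookies Consent Notice \(Prod.+www\.test\.com.+\) end -->/ { del = 0 }
# Print every line that is not inside the consent block.
! del { print $0 }
# Start marker: suppress the lines that follow (checked last, so the marker itself is printed).
/<!-- Cookies Consent Notice \(Prod.+www\.test\.com.+\) start -->/ { del = 1 }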
You can either put the html file on the command line or pipe it in.
Notice that I used ".+" as a wildcard to match one or more characters. If you need to be more precise, you could replace them with the actual text. I'm using "del" as a flag variable. Uninitialized variables in awk evaluate to zero (or the empty string), so "del" starts out false.
If you want to delete the "Cookies Consent Notice" comments as well, you can swap the two long lines.
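To make that swap concrete, here is a sketch of the reordered rules wrapped in a shell function. The wrapper and its name, strip_consent, are assumptions added for illustration. The rules are those of the awk script above, with the start and end patterns exchanged and the dots in the host name escaped.

```shell
#!/bin/sh
# Hypothetical wrapper (the name strip_consent is made up) around the awk
# rules from the script above, with the start and end rules swapped.
strip_consent() {
    awk '
        # Start marker: begin suppressing before the print rule sees this line.
        /<!-- Cookies Consent Notice \(Prod.+www\.test\.com.+\) start -->/ { del = 1 }
        # Print only lines outside the consent block.
        ! del { print $0 }
        # End marker: resume printing from the next line onward.
        /<!-- Cookies Consent Notice \(Prod.+www\.test\.com.+\) end -->/ { del = 0 }
    ' "$@"
}
```

Because the start rule now runs before the print rule and the end rule after it, both comment lines are dropped along with everything between them.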
Before we begin to debug your bash code, please take the time to again read through the suggestions offered, and questions asked to this point.
If your questions are to be based around a certain input file, abaca.html for example, then please post the relevant contents of that file along with an example of what you expect the result to be.
Your bash code is incomplete and mostly irrelevant to the questions asked, so let's not go there until we know what it is you are trying to accomplish.
This is also the first indication that you want to write this into an interactive script which takes arguments to be used in the replacement operation. Please tell us clearly and precisely what you are trying to accomplish so that we can try to resolve one problem at a time. Help us help you!
Please review the Site FAQ for guidance in asking well formed questions.
Last edited by astrogeek; 10-02-2018 at 05:00 PM.
Reason: typo
It is also helpful if you can post your solution here so that others who are looking for a solution to similar problems can benefit from the experience.
When you are satisfied that the problem is solved please use the Thread Tools list at top of the first post to mark the thread as SOLVED.
Then it seems you've changed your requirement along the way, because now you don't want the "Cookies Consent Notice" lines anymore.
Anyway, a very good starting point has been given in #2 as you only need to change a few characters to make it work for your case.