LinuxQuestions.org

LastC 01-26-2019 08:10 PM

Help with vim regular expression
 
I've been trying to use this command
Code:

:%s/\(src|href)="\(assets\/.+?\)/\1="{% static '\2' %}"/g
I found in this Stack Overflow post, but whenever I try to use it I get "E54: Unmatched \(".

What I've been trying to do is to change all the
Code:

href=" (static file directory) "
lines in an HTML template to
Code:

(href/src)="{% static "(page source)" %}"
, so I can use it with Django.

Note: please forgive my lack of knowledge of vim; I've hardly ever used it before. I'm using it now since it seems to be the quickest way to solve my problem, but if you have any other suggestions, please feel free to let me know.

berndbausch 01-26-2019 08:29 PM

Quote:

Originally Posted by LastC (Post 5953924)
I've been trying to use this command
Code:

:%s/\(src|href)="\(assets\/.+?\)/\1="{% static '\2' %}"/g
I found in this Stack Overflow post, but whenever I try to use it I get "E54: Unmatched \(".

The expression has two opening "\(" and one closing "\)". You need to close each "\(". Thus the error.
By the way, I believe that you can't nest \(...\). An expression like \(....\(...\)...\) is illegal.
Quote:

Originally Posted by LastC (Post 5953924)
What I've been trying to do is to change all the
Code:

href=" (static file directory) "
lines in an HTML template to
Code:

(href/src)="{% static "(page source)" %}"
, so I can use it with Django.

Why don't you simply replace the first constant expression with the second, such as
Code:

1,$s/href=" (static file directory) "/(href\/src)="{% static "(page source)" %}/
The backslash removes the special meaning of the slash that immediately follows it. I am not sure if characters like {, }, % need a backslash as well.
Quote:

Originally Posted by LastC (Post 5953924)
Note: please forgive my lack of knowledge of vim; I've hardly ever used it before. I'm using it now since it seems to be the quickest way to solve my problem, but if you have any other suggestions, please feel free to let me know.

Apology accepted :)

LastC 01-26-2019 09:03 PM

Quote:

The expression has two opening "\(" and one closing "\)". You need to close each "\(". Thus the error.
By the way, I believe that you can't nest \(...\). An expression like \(....\(...\)...\) is illegal.
Okay, so I think I've worked out what the command is supposed to look like (assuming it isn't illegal, as you pointed out).
Code:

:%s/\(src|href\)="\(assets\/.+?\)/\1="{% static '\2' %}"/g
Do you think this one's correct? I mean, it still doesn't run, but now I get the following error message instead:
Code:

E486: Pattern not found: \(src|href\)="\(assets\/.+?\)


Quote:

Why don't you simply replace the first constant expression with the second, such as
Code:

1,$s/href=" (static file directory) "/(href\/src)="{% static "(page source)" %}/

When I try it out, I get the error message

Code:

E486: Pattern not found: href=" (static file directory) "
I guess the problem is that Vim is literally looking for lines containing 'href=" (static file directory) "'. What I meant by "(static file directory)" and "(page source)" was that there could be lines that look like this
Code:

<link rel="stylesheet" href="style/login.css">
and those lines need to be changed so they look like this
Code:

<link rel="stylesheet" href="{% static "style/login.css" %}">
Sorry for not explaining myself clearly there...

syg00 01-26-2019 09:24 PM

The problem is not your lack of knowledge of vim per se, but your ignorance of regex. For vim you can get a cheat-sheet and get up to speed easily and quickly; regex is akin to voodoo for the naif.
You have to specifically mould it for your data - if it doesn't exactly fit the input for the answer you found (we now know it doesn't), it won't work. But it can be easily adapted, as @berndbausch attempted above. For that we need to know the input data structure - examples in full; "there could be lines that look like this" won't suffice.
Ditto for expected output.

Personally I'd use sed - it's much more obvious what's happening IMHO.

LastC 01-26-2019 09:52 PM

Quote:

The problem is not your lack of knowledge of vim per se, but your ignorance of regex. For vim you can get a cheat-sheet and get up to speed easily and quickly; regex is akin to voodoo for the naif.
You have to specifically mould it for your data - if it doesn't exactly fit the input for the answer you found (we now know it doesn't), it won't work. But it can be easily adapted, as @berndbausch attempted above. For that we need to know the input data structure - examples in full; "there could be lines that look like this" won't suffice.
Ditto for expected output.
Fair enough! Okay, so to put it more clearly, let me explain the whole process behind what I want to do.



The idea is to find all the lines with a "src" or "href" attribute in the HTML templates, which look like this
Code:

<link rel="stylesheet" href="style/login.css">
Here 'href="style/login.css"' refers to the location of the file linked in the tag, where 'href=' is the attribute and '"style/login.css"' is the file path. However, when using Django, all the links need to be changed so they refer to static files instead. This is done by changing the file path from '"style/login.css"' to '"{% static "style/login.css" %}"'.

The '"{% static' and '%}"' need to enclose the '"style/login.css"' path so Django knows which file the 'href' attribute refers to. So, the final line should look like this
Code:

<link rel="stylesheet" href="{% static "style/login.css" %}">
More examples
Code:

From:
<link href="css/styles.css" rel="stylesheet">
To:
<link href="{% static "css/styles.css" %}" rel="stylesheet">

From:
<link href="css/bootstrap-override.css" rel="stylesheet">
To:
<link href="{% static "css/bootstrap-override.css" %}" rel="stylesheet">

I would like to learn how to solve this kind of problem with tools like sed if possible. I could also just write a Python script for it, but I feel this is a good opportunity to learn some new tools, so why not!
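For comparison, here is a minimal Python sketch of the transformation described in this post. It is only an illustration under stated assumptions: the function name add_static_tags is made up for this example, every src/href value is assumed to be double-quoted, and no attempt is made to skip absolute URLs or values that are already wrapped.

```python
import re

# Match src="..." or href="..." and capture the attribute name and its value.
# [^"]+ stops at the closing quote, so the captured value never includes it.
ATTR = re.compile(r'\b(src|href)="([^"]+)"')


def add_static_tags(html: str) -> str:
    """Wrap every double-quoted src/href value in a Django {% static %} tag.

    Hypothetical helper for illustration; not taken from the thread.
    """
    return ATTR.sub(r'\1="{% static "\2" %}"', html)


if __name__ == "__main__":
    sample = '<link rel="stylesheet" href="style/login.css">'
    # prints: <link rel="stylesheet" href="{% static "style/login.css" %}">
    print(add_static_tags(sample))
```

Matching the value with [^"]+ and then the closing quote explicitly keeps the quote out of the captured group, so both quotes around the path can be written out in the replacement.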

syg00 01-27-2019 02:54 AM

sed is a "stream editor" - you feed it a file(s) and tell it what to do with it. Regex like you found is just the ticket. It will spit out modified lines that match, or else the input record unmodified.
The issue with the solution you found is that it was structured for hrefs that contain the literal string "assets" - so even if the syntax error wasn't there, it was only indicative for you. The explanation in the link is a good synopsis.

However ... using regex you have to define the data precisely. In the original answer it was "assets" and everything that followed it - then add the trailing "%}".
Several problems with your data - not all records have the same structure (the first record doesn't have css/ as its leading dir). Some records have a trailing "rel=" keyword, the first one doesn't. This can be accommodated, but it makes the regex more difficult. Consistency counts for a lot.

This seems to handle the (limited) input provided.
Code:

sed -r 's/(src|href)="([^[:space:]>]+?)/\1="{% static \2 %}"/' your_file.here
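A note on the "+?" in that pattern: POSIX extended regular expressions have no lazy quantifiers, so GNU sed most likely treats it as an ordinary greedy match. The Python sketch below is a stand-in under that assumption, not the sed command itself: it uses Python's re module with a plain greedy "+" and \s in place of [[:space:]]. It suggests that the second group then swallows the closing quote of the attribute value, which would be consistent with the missing " reported later in the thread.

```python
import re

# Greedy stand-in for the sed pattern. Assumption: the "?" after "+" does
# not make the match lazy in sed, so a plain greedy "+" is used here.
PATTERN = re.compile(r'(src|href)="([^\s>]+)')

line = '<link rel="stylesheet" href="style/login.css">'

match = PATTERN.search(line)
captured = match.group(2)  # the value plus its closing quote

# Same replacement shape as the sed command: no quote is written before \2,
# and the quote after the path comes from the captured text.
rewritten = PATTERN.sub(r'\1="{% static \2 %}"', line)

print(captured)
print(rewritten)
```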

LastC 01-27-2019 11:49 AM

Quote:

sed is a "stream editor" - you feed it a file(s) and tell it what to do with it. Regex like you found is just the ticket. It will spit out modified lines that match, or else the input record unmodified.
The issue with the solution you found is that it was structured for hrefs that contain the literal string "assets" - so even if the syntax error wasn't there, it was only indicative for you. The explanation in the link is a good synopsis.

However ... using regex you have to define the data precisely. In the original answer it was "assets" and everything that followed it - then add the trailing "%}".
Several problems with your data - not all records have the same structure (the first record doesn't have css/ as its leading dir). Some records have a trailing "rel=" keyword, the first one doesn't. This can be accommodated, but it makes the regex more difficult. Consistency counts for a lot.
Understood. So would you recommend any good tutorials or resources for learning regex? I know a quick Google search will give me a whole lot of places to look at, but most of them are a bit hard to understand (I really don't get the syntax yet, like at all).


Quote:

Code:

sed -r 's/(src|href)="([^[:space:]>]+?)/\1="{% static \2 %}"/' your_file.here

Thanks a lot! I managed to get it working in a script, with one small modification. It was working fine, but the path inside the static tag was missing a ". I fixed it by changing the command to
Code:

sed -r -i 's/(src|href)="([^[:space:]>]+?)/\1="{% static "\2 %}"/'
I know it's nothing really, but I still feel kinda proud of it.

By the way, here is a script I wrote (again, pretty simple stuff, but it's good to have a copy-paste solution) to make HTML templates work with Django's static files directory, just in case someone needs it:
Code:

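#!/bin/bash

# Insert Django's load tag as the new first line of the template ($1).
sed -i '1s/^/{% load static %}\n/' "$1"
# Wrap the first src/href value on each line in a {% static %} tag,
# editing the file in place (GNU sed options -r and -i).
sed -r -i '
s/(src|href)="([^[:space:]>]+?)/\1="{% static "\2 %}"/
' "$1"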
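One thing to watch with the script above: the first sed command prepends the load line every time it runs, so running the script twice on the same template adds the line twice. A minimal Python sketch of a guard against that follows; the function name ensure_load_static is made up for this example, and the check is a plain substring test, which is an assumption about how the tag is written in the template.

```python
LOAD_TAG = "{% load static %}"


def ensure_load_static(template: str) -> str:
    """Prepend the load tag unless the template text already contains it.

    Hypothetical helper for illustration; not part of the original script.
    """
    if LOAD_TAG in template:
        return template
    return LOAD_TAG + "\n" + template


if __name__ == "__main__":
    page = "<html>\n</html>\n"
    once = ensure_load_static(page)
    twice = ensure_load_static(once)
    # The second call leaves the text unchanged.
    print(once == twice)
```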


dugan 01-27-2019 12:12 PM

Hint: use "\v" (very magic mode) so that you don't need as much escaping.

http://andrewradev.com/2011/05/08/vim-regexes/

There's a missing backslash very, very early in the expression at the top of your first post.

LastC 01-27-2019 12:20 PM

Quote:

There's a missing backslash very, very early in the expression at the top of your first post.
Uhmm, where exactly?
Quote:

Hint: use "\v" (very magic mode) so that you don't need as much escaping.

http://andrewradev.com/2011/05/08/vim-regexes/
And thanks! I'll give it a shot as soon as I finish with this site.

dugan 01-27-2019 12:27 PM

Code:

:%s/\(src|href)
The first parenthesis is escaped but the second one isn't.

I see you corrected that in later posts, but the point was that you were less likely to make that mistake with very-magic mode on.

FlinchX 02-02-2019 01:02 PM

At the risk of derailing the topic: the internet is full of warnings against parsing html/xml with regular expressions.
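As one hedged illustration of the alternative, Python's standard-library html.parser can locate src and href attributes without a regular expression. The sketch below only collects the values; it does not rewrite the file, and the class name LinkCollector is made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (tag, attribute, value) for every src/href attribute seen.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current start tag.
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.found.append((tag, name, value))


if __name__ == "__main__":
    collector = LinkCollector()
    collector.feed('<link rel="stylesheet" href="style/login.css">')
    # prints: [('link', 'href', 'style/login.css')]
    print(collector.found)
```

Because the parser splits tags into attribute name/value pairs itself, quoting style and attribute order stop mattering, which is where the regex versions in this thread are fragile.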

