[SOLVED] Perl regex not matching across multiple lines despite ms flags
Input:
Code:
/**
 * UIKit initialiser class
 */
var UIKit = {
    init: function init() {
        this.registry.each(function(entry, key) {
            $$(entry.selector).each(function(item, index){
                if(item.hasClass("no-replace") || $chk(item.retrieve("uikit"))) {
                    return;
                }
                if(eval("window."+entry["class"]) !== undefined) {
                    var ui = eval("new " + entry["class"] + "(item)");
                    item.store("uikit", ui);
                }
            });
        }, this);
    },
    registry: new Hash(),
    register: function register(className, selector) {
        this.registry.set(className, {
            "class": className,
            "selector": selector
        });
    },
    enhance: function enhance(element, uiclass) {
        if(!(element.hasClass("no-replace") || $chk(element.retrieve("uikit")))) {
            if(!$chk(uiclass)) {
                for(var name in this.registry) {
                    var item = this.registry[name];
                    if($$(item.selector).contains(element)) {
                        uiclass = eval(item["class"]);
                        break;
                    }
                }
                if(!$chk(uiclass)) { return false; }
            }
            var ui = new uiclass(element);
            element.store("uikit", ui);
            return true;
        }
        return false;
    }
}
/**
 * UIKit retrieval function.
 *
 * @param string|element el. The DOM element or element id of the element for which you want the UIKit to be retrieved.
 * @return UI|false The UI derived object or false if none is set.
 */
function $UI(el) {
    //First check if el is an instanceof UI
    if(el instanceof UI) {
        return el;
    } else {
        //Otherwise get it
        el = $(el);
        if(!el) {
            throw new UIKitException("Invalid argument for $UI.");
        }
        if(!$chk(el.retrieve("uikit"))) {
            throw new UIKitException("Passed object does not have an associated UI object.");
        } else {
            return el.retrieve("uikit");
        }
    }
}
Output:
Code:
/**
 * UIKit initialiser class
 */
var UIKit = {
    init: (function init() {
        this.registry.each(function(entry, key) {
            $$(entry.selector).each(function(item, index){
                if(item.hasClass("no-replace") || $chk(item.retrieve("uikit"))) {
                    return;
                }
                if(eval("window."+entry["class"]) !== undefined) {
                    var ui = eval("new " + entry["class"] + "(item)");
                    item.store("uikit", ui);
                }
            });
        }, this);
    }),
    registry: new Hash(),
    register: (function register(className, selector) {
        this.registry.set(className, {
            "class": className,
            "selector": selector
        });
    }),
    enhance: (function enhance(element, uiclass) {
        if(!(element.hasClass("no-replace") || $chk(element.retrieve("uikit")))) {
            if(!$chk(uiclass)) {
                for(var name in this.registry) {
                    var item = this.registry[name];
                    if($$(item.selector).contains(element)) {
                        uiclass = eval(item["class"]);
                        break;
                    }
                }
                if(!$chk(uiclass)) { return false; }
            }
            var ui = new uiclass(element);
            element.store("uikit", ui);
            return true;
        }
        return false;
    })
}
/**
 * UIKit retrieval function.
 *
 * @param string|element el. The DOM element or element id of the element for which you want the UIKit to be retrieved.
 * @return UI|false The UI derived object or false if none is set.
 */
function $UI(el) {
    //First check if el is an instanceof UI
    if(el instanceof UI) {
        return el;
    } else {
        //Otherwise get it
        el = $(el);
        if(!el) {
            throw new UIKitException("Invalid argument for $UI.");
        }
        if(!$chk(el.retrieve("uikit"))) {
            throw new UIKitException("Passed object does not have an associated UI object.");
        } else {
            return el.retrieve("uikit");
        }
    })
}
@gfarrell Ok, the code seems to work fine, except that it still doesn't work on lines that contain multiple }'s. Before, I really had an idea that it could also be done in awk, but this was really the limit that I was expecting.
@grail There's a problem with the RS="" method if a section contains blank lines within it.
Last edited by konsolebox; 08-17-2010 at 11:26 AM.
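To illustrate the RS="" problem mentioned above: in awk's paragraph mode, records are separated by one or more empty lines, so a section that contains an empty line of its own is cut into two records. The following is only a rough JavaScript emulation of that splitting rule, not awk itself; paragraphRecords and the sample text are made up for this sketch.

```javascript
// Rough emulation of awk's paragraph mode (RS=""): records are separated
// by one or more empty lines. Illustration only, not awk itself.
function paragraphRecords(text) {
  return text
    .replace(/^\n+|\n+$/g, "") // ignore leading and trailing newlines
    .split(/\n{2,}/);          // one or more empty lines end a record
}

// Hypothetical sample: function a contains an empty line in its body.
const source = [
  "function a() {",
  "  var x = 1;",
  "",
  "  return x;",
  "}",
  "",
  "function b() {",
  "  return 2;",
  "}",
].join("\n");

const records = paragraphRecords(source);
console.log(records.length);
console.log(records);
```

The two functions come out as three records, because the empty line inside function a is indistinguishable from the empty line that separates a from b.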
I think you just worked out my problem, multiple braces in a line! Thanks =]
(Not that I know how to fix it (or really need to now)).
The other problem I got was in comment doc-blocks but it was largely not a problem.
Ok, here are just some sad points. I'm sorry that I have to tell you these.
I already had the same problem for my own purposes, but I just gave up on it and started to think about solving it in other languages... like perl or parrot. Awk really has its limits. It's not only about counting the total braces and then making a deduction in every recursion. It's also about knowing whether the braces are just part of an ordinary string or not, or part of ... etc. Also, what if there are 4 braces in a line, 3 of them are part of the current function, but the 4th is part of the container block holding the function? How can you tell that it's part of the container block when you're only in the context of the function?
This can probably still be solved, but that would only mean imitating a real language parser, e.g. reading single chars and not phrases or lines, since you can't tell where compound statements or blocks end or separate.. etc. Doing that in awk is IMO really no longer practical.
P.S. Maybe using another script that's similar to HTML TIDY for awk scripts, then using your methods, will do the trick.
Last edited by konsolebox; 08-17-2010 at 11:50 AM.
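As a rough illustration of the "reading single chars" point above, here is a minimal JavaScript sketch of a brace matcher that walks the source one character at a time and ignores braces inside quoted strings and comments. The name findMatchingBrace and the sample line are hypothetical, and this is not anyone's script from the thread. It deliberately does not handle regex literals or template literals, which is exactly where it starts turning into a real parser.

```javascript
// Hypothetical helper: returns the index of the "}" that closes the "{" at
// openIdx, or -1 if none is found. Braces inside quoted strings and
// comments are skipped. Regex and template literals are NOT handled.
function findMatchingBrace(src, openIdx) {
  let depth = 0;
  let i = openIdx;
  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];
    if (ch === '"' || ch === "'") {
      // Skip a quoted string, honouring backslash escapes.
      i++;
      while (i < src.length && src[i] !== ch) {
        if (src[i] === "\\") i++;
        i++;
      }
    } else if (ch === "/" && next === "/") {
      // Skip a line comment up to the end of the line.
      while (i < src.length && src[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // Skip a block comment.
      const end = src.indexOf("*/", i + 2);
      if (end === -1) return -1;
      i = end + 1;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) return i;
    }
    i++;
  }
  return -1;
}

// One line with several braces: some belong to the function, one sits in a
// string, one in a comment, and the last one belongs to the container.
const sample = 'a: function a() { if (x) { s = "}"; } /* } */ }, b: 1 }';
const open = sample.indexOf("{");
const close = findMatchingBrace(sample, open);
console.log(sample.slice(open, close + 1));
```

A plain per-line brace count would stop too early here, because it would also count the brace inside the string.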
While I realise that the method I used was not perfect (the regex I used worked during testing, just not in a bash script with perl), it worked for my purpose and therefore I'm happy enough with it. In terms of the code files I was parsing, none of those problems were encountered except in one file, which I did manually; it still saved me time on the other 29 files. I really can't be bothered to write a proper code parser because then, as you say, I'd basically be writing an interpreter, and I have absolutely no interest in doing that.
I hope you manage to work out the problems you were encountering but for me it's done its job.
Ok, I am really going this time, but I did run yours several times and was unable to get it to not print the extra bracket.
This is a little untidy due to tiredness, but it seems to work. You can take any bits that help you.
Well, this conversation really moved beyond me while I was away. Time to bow out, I think.
Quote:
Originally Posted by konsolebox
I searched again. It appears that it can also be done in sed:
Yes, of course I know it's possible. However, as you just demonstrated, it takes a lot of mucking about with the hold buffer to build up the line before you can run the regex on it. But if sed had s & m switches similar to perl's (hmm, that doesn't sound right...), then you'd be able to simply choose to treat the newlines like any other character straight out of the starting gate. Much easier to grasp conceptually and more flexible overall.
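For anyone unsure what the two switches change, here is a small sketch. It uses JavaScript's own s and m regex flags, which behave analogously to Perl's for this purpose, rather than Perl itself; the sample text and variable names are made up for illustration.

```javascript
// Hypothetical sample text spanning several lines.
const text = "init: function init() {\n  return 1;\n},\nregistry: 1";

// Without "s", "." stops at a newline, so this cannot span lines.
const noDotAll = /function.*\}/.test(text);

// With "s", "." also matches newlines, so the match can cross lines.
const withDotAll = /function.*?\}/s.exec(text);

// With "m", "^" matches at the start of every line, not just of the input.
const lineStarts = text.match(/^\w+:/gm);

console.log(noDotAll);
console.log(withDotAll[0]);
console.log(lineStarts);
```

With the s flag the whole multi-line text is treated like one long line as far as "." is concerned, which is the "newlines like any other character" behaviour described above.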