hi ^^
I started writing a little regex XML parser in Perl some time ago, just to practice regular expressions again. It works so far, but I haven't come up with a way to process nested XML tags (I actually mean nested tags with the same name =^ ), such as:
Code:
<data>
<data>test</data>
</data>
without writing hundreds of lines for things like counting instances or similar solutions...
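To show what goes wrong, here is a small standalone sketch that runs the same pattern as in the code below on the nested example. The helper name first_match is made up for this illustration only. As far as I can tell, the non-greedy content group stops at the first </data> it finds, so the outer opening tag gets paired with the inner closing tag:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: returns (tag name, content)
# of the first element found by the parser's pattern.
sub first_match {
    my $xml = shift;
    if ($xml =~ m/<([^\/>][^> ]*)([^>]*)>([\s\S]*?)<\/\1>/i) {
        return ($1, $3);
    }
    return;
}

my ($name, $content) = first_match("<data>\n<data>test</data>\n</data>");
print "tag: $name\n";
print "content: [$content]\n";
```

The captured content is a newline followed by <data>test, not the complete inner element, and the outer </data> is left over.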
The current code is:
Code:
# Simple Regex XML Parser
# 2003 by poison
open (my $xml_fh, '<', '/home/poison/files/asz00m5ynC/sync.xml') or die "error, can't open file: $!";
# read the whole file into a single string
my $file_data = do { local $/; <$xml_fh> };
close ($xml_fh) or die "error closing: $!";
parse_xml ($file_data, 0);
sub parse_xml {
    my $match_data=shift;
    my $match_level=shift || 0;   # nesting depth; 0 for the top-level call
    # NOTE: the non-greedy ([\s\S]*?) stops at the first closing tag with the
    # same name, so nested tags of the same name are not paired correctly
    while ( $match_data =~ m/<([^\/>][^> ]*)([^>]*)>([\s\S]*?)<\/\1>/gi ) {
        # extract parameter list and content from xml tag, separate
        my $tag_matched=$1;
        my $tag_parameters=$2;
        my $tag_content=$3;
        my %tag_parameter=parse_parameters ($tag_parameters);
        # indent by nesting depth
        print (' ' x $match_level);
        create_cfg ($tag_matched, $tag_parameters, $tag_content);
        # print tag name and parameters
        print $tag_matched . ':';
        while (my ($key, $value) = each %tag_parameter) {
            print $key . '=>"' . $value . '" ';
        }
        if ($tag_content =~ m/<([^\/>][^> ]*)([^>]*)>([\s\S]*?)<\/\1>/i) {
            # tag has child tags: they are printed by the recursive call below
            print "\n";
        } else {
            # leaf tag: print its text content
            print '("' . $tag_content . "\")\n";
        }
        parse_xml ($tag_content, $match_level+1);
    }
}
#print $file_data;
sub parse_parameters {
    # split up parameter list into name => value pairs
    my $parameters=shift;
    my %parameter;
    # [^\s=] keeps tabs and newlines between parameters out of the names
    while ( $parameters =~ /([^\s=]+)\s*=\s*"([^"]*)"/g ) {
        $parameter{$1}=$2;
    }
    return %parameter;
}
sub create_cfg {
    my $tag_matched=shift;
    my $tag_parameters=shift;
    my $tag_content=shift;
    my %tag_parameter=parse_parameters ($tag_parameters);
    # remember source and destination of a <database> tag in the global @source
    if ($tag_matched eq 'database') {
        @source=($tag_parameter{'source'}, $tag_parameter{'destination'});
    }
}
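One direction that might avoid the problem without much counting code (only a rough sketch, not tried on my real sync.xml): instead of matching an opening and a closing tag with one regex, scan the tags one by one and keep a stack of the currently open tags, so that a closing tag always belongs to the most recently opened one. The names parse_nested and dump_tree and the node layout (name, attr, text, children) are made up for this sketch. It does not handle comments, CDATA, processing instructions such as <?xml ...?>, or a '>' inside a parameter value:

```perl
use strict;
use warnings;

# Hypothetical sketch: build a tree from XML-like text using a stack of open tags.
# Each node is a hash: { name, attr, text, children }.
sub parse_nested {
    my $xml   = shift;
    my $root  = { name => '', attr => '', text => '', children => [] };
    my @stack = ($root);               # innermost open tag is $stack[-1]

    # Each iteration consumes the text before a tag and then the tag itself.
    while ($xml =~ m{\G([^<]*)<(/?)([^\s/>]+)([^>]*?)(/?)>}gc) {
        my ($text, $closing, $name, $attr, $empty) = ($1, $2, $3, $4, $5);
        $stack[-1]{text} .= $text;
        if ($closing) {
            # a closing tag must match the most recently opened tag
            die "unexpected </$name>\n"
                if @stack == 1 or $stack[-1]{name} ne $name;
            pop @stack;
        } else {
            my $node = { name => $name, attr => $attr, text => '', children => [] };
            push @{ $stack[-1]{children} }, $node;
            push @stack, $node unless $empty;   # <tag/> has no content
        }
    }
    die "unclosed <$stack[-1]{name}>\n" if @stack > 1;
    return $root;
}

# Print the tree, indented by nesting depth.
sub dump_tree {
    my ($node, $level) = @_;
    for my $child (@{ $node->{children} }) {
        (my $text = $child->{text}) =~ s/^\s+|\s+$//g;   # trim whitespace
        print ' ' x $level, $child->{name}, ': ("', $text, "\")\n";
        dump_tree($child, $level + 1);
    }
}

dump_tree(parse_nested("<data>\n<data>test</data>\n</data>"), 0);
```

With the nested example this keeps the inner <data> as a child of the outer one. The attr string of each node could still be passed to parse_parameters as before.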
Any suggestions? ^^