LinuxQuestions.org

XML - beautifying

Posted 04-02-2014 at 08:19 AM by sashi_hc
Updated 04-02-2014 at 11:05 AM by sashi_hc

You want to make your XML more readable
You have just extracted an XML document from some source and it is completely unindented, which makes it hard to read. This happens at my workplace, where a single XML file can be many megabytes in size.

Using xmllint
xmllint --format your_existing_xml_file_name > new_xml_file_name
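
To try this out on something small first, here is a minimal sketch. The scratch directory, the file names and the sample document are all made up for illustration. It shows the redirect form from above, plus two other ways of calling xmllint that I believe are standard: --output to name the result file, and a single dash to read from standard input.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

# A small unindented sample document (hypothetical content, for illustration only).
printf '%s\n' '<?xml version="1.0"?><config><item id="1">alpha</item><item id="2"/></config>' > "$workdir/sample.xml"

if command -v xmllint >/dev/null 2>&1; then
    # Same invocation as above: formatted XML goes to stdout and is redirected.
    xmllint --format "$workdir/sample.xml" > "$workdir/pretty.xml"

    # --output writes the result to a file without shell redirection.
    xmllint --format --output "$workdir/pretty_out.xml" "$workdir/sample.xml"

    # A single dash makes xmllint read the document from standard input,
    # which is handy at the end of a pipeline.
    xmllint --format - < "$workdir/sample.xml" > "$workdir/pretty_stdin.xml"
else
    echo "xmllint is not installed" >&2
fi
```

Each of the three output files should then contain the same document with one element per line, nested elements indented under their parent.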

If you do not have xmllint installed, you can use the Perl script below instead. It is slower, but produces similarly indented output. This is not my code; see the author information in the header comments. It works well.

#!/usr/bin/perl
#
# Purpose: Read an XML file and indent it for ease of reading
# Author: RedGrittyBrick 2011.
# Licence: Creative Commons Attribution-ShareAlike 3.0 Unported License
#
use strict;
use warnings;

my $filename = $ARGV[0];
die "Usage: $0 filename\n" unless $filename;

open my $fh, '<', $filename
    or die "Can't read '$filename' because $!\n";
my $xml = '';
while (<$fh>) { $xml .= $_; }
close $fh;

$xml =~ s|>[\n\s]+<|><|gs; # remove superfluous whitespace
$xml =~ s|><|>\n<|gs; # split line at consecutive tags

my $indent = 0;
for my $line (split /\n/, $xml) {

    if ($line =~ m|^</|) { $indent--; }

    print ' ' x $indent, $line, "\n";

    if ($line =~ m|^<[^/\?]|) { $indent++; }               # indent after <foo
    if ($line =~ m|^<[^/][^>]*>[^<]*</|) { $indent--; }    # but not <foo>..</foo>
    if ($line =~ m|^<[^/][^>]*/>|) { $indent--; }          # and not <foo/>

}
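
To tie the two methods together, here is a rough sketch of a shell function that uses xmllint when it is available and falls back to the Perl script otherwise. The function name beautify_xml, the PERL_INDENTER variable and the file name xml_indent.pl are my own assumptions; save the script above under whatever name you like and adjust the path.

```shell
# Hypothetical location of the Perl script above, saved to a file.
PERL_INDENTER="${PERL_INDENTER:-./xml_indent.pl}"

# beautify_xml INPUT OUTPUT
# Returns 0 on success, 1 if INPUT is unreadable, 2 if no formatter is available.
beautify_xml() {
    if [ ! -r "$1" ]; then
        echo "cannot read input file: $1" >&2
        return 1
    fi

    if command -v xmllint >/dev/null 2>&1; then
        # Preferred method: xmllint from libxml2.
        xmllint --format "$1" > "$2"
    elif command -v perl >/dev/null 2>&1 && [ -r "$PERL_INDENTER" ]; then
        # Fallback: the Perl script takes the file name as its only argument
        # and prints the indented XML on stdout.
        perl "$PERL_INDENTER" "$1" > "$2"
    else
        echo "neither xmllint nor $PERL_INDENTER is available" >&2
        return 2
    fi
}

# Example call (file names are placeholders):
#   beautify_xml your_existing_xml_file_name new_xml_file_name
```

Writing to a separate output file, as in the xmllint command earlier, keeps the original untouched in case the result is not what you expected.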