XML - beautifying

Posted 04-02-2014 at 08:19 AM by sashi_hc
Updated 04-02-2014 at 11:05 AM by sashi_hc

You want to make your XML more readable
You have just extracted an XML document from some source, and it is completely unindented and therefore hard to read. This happens at my workplace, where a single XML file can be many MB in size.

Using xmllint
xmllint --format <your_existing_xml_file_name> > new_xml_file_name
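
If you do this often, the command above can be wrapped in a small shell function. The following is only a sketch: format_xml is a name I made up, and the temporary-file step is my own addition so that a parse error does not leave a half-written output file behind.

```shell
# format_xml is a hypothetical helper name, not part of xmllint.
# Usage: format_xml <input_xml_file> <output_xml_file>
format_xml() {
    in_file=$1
    out_file=$2

    # Bail out early if xmllint is not installed.
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found" >&2
        return 127
    fi

    # Write to a temporary file first so that a failed run does not
    # leave a truncated output file.
    tmp_file=$(mktemp) || return 1
    if xmllint --format "$in_file" > "$tmp_file"; then
        mv "$tmp_file" "$out_file"
    else
        rc=$?
        rm -f "$tmp_file"
        return "$rc"
    fi
}
```

For example, format_xml big.xml big_pretty.xml writes the indented copy to big_pretty.xml (both file names are placeholders). According to the xmllint man page, the XMLLINT_INDENT environment variable can be set to change the indent string used by --format.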

If you do not have xmllint installed, you can use the Perl script below instead. It is slower, but it does the same job. This is not my code; the author is credited in the header comments. It works well.

# Purpose: Read an XML file and indent it for ease of reading
# Author: RedGrittyBrick 2011.
# Licence: Creative Commons Attribution-ShareAlike 3.0 Unported License
use strict;
use warnings;

my $filename = $ARGV[0];
die "Usage: $0 filename\n" unless $filename;

# Read the whole file into memory
open my $fh, '<', $filename
    or die "Can't read '$filename' because $!\n";
my $xml = '';
while (<$fh>) { $xml .= $_; }
close $fh;

$xml =~ s|>[\n\s]+<|><|gs;  # remove superfluous whitespace
$xml =~ s|><|>\n<|gs;       # split line at consecutive tags

my $indent = 0;
for my $line (split /\n/, $xml) {

    # A closing tag belongs one level back out
    if ($line =~ m|^</|) { $indent--; }

    print ' ' x $indent, $line, "\n";

    if ($line =~ m|^<[^/\?]|) { $indent++; }             # indent after <foo
    if ($line =~ m|^<[^/][^>]*>[^<]*</|) { $indent--; }  # but not <foo>..</foo>
    if ($line =~ m|^<[^/][^>]*/>|) { $indent--; }        # and not <foo/>
}
