LinuxQuestions.org

How do I work with large XML files in PHP? (https://www.linuxquestions.org/questions/programming-9/how-do-i-work-with-large-xml-files-in-php-705047/)

RavenLX 02-18-2009 11:14 AM

I should mention that my Perl script to dump the data looks something like this. Note that I'm just using example data, but the hash-of-arrays structure is the same.

Code:

#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

my %arr_hash = ();
$arr_hash{'produce'}{'veggies'}[0] = "Broccoli";
$arr_hash{'produce'}{'veggies'}[1] = "Cauliflower";
$arr_hash{'produce'}{'veggies'}[2] = "Carrots";

my $xsimple = XML::Simple->new();

# Write XML File:
# Three-argument open with a lexical filehandle; $! reports why open failed.
open(my $xml_fh, '>', 'sample.xml') or die "Cannot create xml file: $!\n";
print {$xml_fh} $xsimple->XMLout(\%arr_hash,
                      noattr => 1,
                      xmldecl => '<?xml version="1.0"?>');
close($xml_fh);

This is what I'll change and fix so that maybe I can get a better XML file in the end. I've gotten pretty decent at Perl, so I think even if I had to create the file manually and not use XML::Simple, I could still do it.

RavenLX 02-19-2009 01:33 PM

Resolved: How do I work with large XML files in PHP?
 
OK, I've resolved the problem. SAX wasn't working because there were ampersands (&) in the XML file, so I adjusted my Perl script to write %26 instead. In fact, I've reworked the code entirely, so the PHP code now reads like this:

Code:

<?php

$xml_parser = xml_parser_create();
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($xml_parser, "start_element", "end_element");
xml_set_character_data_handler($xml_parser, "characters");

$file = "sample.xml";
if ($file_stream = fopen($file, "r")) {

  while ($data = fread($file_stream, 4096)) {

      $this_chunk_parsed = xml_parse($xml_parser, $data, feof($file_stream));
      if (!$this_chunk_parsed) {
          $error_code = xml_get_error_code($xml_parser);
          $error_text = xml_error_string($error_code);
          $error_line = xml_get_current_line_number($xml_parser);

          $output_text = "Parsing problem at line $error_line: $error_text";
          die($output_text);
      }

  }
  fclose($file_stream);  // release the file handle once parsing is done

} else {

    die("Can't open XML file.");

}
xml_parser_free($xml_parser);


// Functions

function start_element($parser, $name, $attrs) {
    print "<b>Start Element:</b> $name<br />";
    print "<b>---Attributes:</b> <br />";
    foreach ($attrs as $key => $value) {
        print "$key = $value<br />";
    }
    print "<br />";
}

function end_element($parser, $name) {
    print "<b>End Element:</b> $name<br /><br />";
}

function characters($parser, $chars) {
    /* Convert the % codes written by the Perl script back for HTML output:
         %26  &  (printed as &amp;)
         %3F  ?
         %3D  =
         %2B  +
         %2F  /
    */
    $codes   = array("%26", "%3F", "%3D", "%2B", "%2F");
    $symbols = array("&amp;", "?", "=", "+", "/");
    $chars = str_replace($codes, $symbols, $chars);
    print "<p><i>$chars</i></p>";
}

?>

Now it works (note the changes in the characters function).
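
One thing to watch with the characters function: the parser is allowed to hand a single piece of text to the character data handler in several calls, so in principle a code like %26 could arrive split across two calls and be missed by the replace. Here is a rough sketch of one way around that, not what my script above does. It collects the text per element and decodes it when the element ends. The function name parse_xml_text and the variable names are made up for illustration, it parses from a string instead of a file so it can be tried on its own, and it uses rawurldecode(), which decodes every %XX code rather than just the five listed above. That assumes the Perl side percent-encodes consistently, including any literal % in the data, and that the text lives in leaf elements.

```php
<?php
// Sketch only: parse_xml_text() is a hypothetical helper, not part of the script above.
// It feeds the XML to the parser in small chunks and buffers character data per
// element, so a %XX code split across handler calls is still decoded correctly.

function parse_xml_text(string $xml, int $chunk_size = 4096): array
{
    $buffer = '';   // text collected for the element currently open
    $texts  = [];   // decoded, HTML-escaped text of each non-empty element

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$buffer) {
            $buffer = '';   // start collecting afresh for each element
        },
        function ($p, $name) use (&$buffer, &$texts) {
            $text = trim($buffer);
            if ($text !== '') {
                // Decode %XX codes first, then escape for HTML output.
                $texts[] = htmlspecialchars(rawurldecode($text));
            }
            $buffer = '';
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $chars) use (&$buffer) {
            $buffer .= $chars;   // may be called several times per text node
        }
    );

    $length = strlen($xml);
    for ($offset = 0; $offset < $length; $offset += $chunk_size) {
        $chunk    = substr($xml, $offset, $chunk_size);
        $is_final = ($offset + $chunk_size >= $length);
        if (!xml_parse($parser, $chunk, $is_final)) {
            $error_text = xml_error_string(xml_get_error_code($parser));
            $error_line = xml_get_current_line_number($parser);
            throw new RuntimeException("Parsing problem at line $error_line: $error_text");
        }
    }

    return $texts;
}
```

Calling it with a tiny chunk size, e.g. parse_xml_text($xml, 5), is an easy way to force the text to be split up and check that the buffering holds. For a real file, the same handlers could be attached to the fread() loop from the script above.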

This will work with a very large XML file. I did manage to cut mine down to 11.6MB by adjusting the format, though. It won't work if I read it in all at once, but using SAX it does work.

