Using XMLReader with Google App Engine Cloud Storage

I have a 30 MB XML file that I'd like to process using Google App Engine (PHP). Because the file is so big, the suggested storage is Google Cloud Storage, so I've placed it there. Memory constraints mean I can't parse the whole file at once, but it consists of about 5000 nodes that are each reasonably sized, so I'm trying to use XMLReader to pull in one node at a time.

The process works perfectly locally, but on App Engine XMLReader keeps failing to read from Cloud Storage with the message "Unable to open source data".

Here's an example of my code:

require_once 'google/appengine/api/cloud_storage/CloudStorageTools.php';
use google\appengine\api\cloud_storage\CloudStorageTools;

$path = "gs://my_bucket/my_file.xml";
$public_url = CloudStorageTools::getPublicUrl($path, true); // true = HTTPS URL

$reader = new XMLReader();
$reader->open($path);       // fails
$reader->open($public_url); // fails

Both the "internal" and the public URL fail with the same error:

XMLReader::open(): Unable to open source data in /[gaepath]/myapp.php on line X

Answers elsewhere suggest this is a permissions problem, but the file is not restricted, and the following does work:

$xml = file_get_contents($path); // $xml contains the file contents as a string

Two solutions would help me:

  1. Some way to have XMLReader open a Google Cloud Storage URL
  2. Some way to pass a string to XMLReader, which does not appear to be possible (and writing a temporary local file also appears to be forbidden on GAE)

I had the same problem. It looks like we need to manually enable a "disabled function" by creating a php.ini file in the root of the app.

php.ini:

google_app_engine.enable_functions = "libxml_disable_entity_loader"

Then, in the code, we need to enable the entity loader before loading the file:

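// $file is the Cloud Storage path, e.g. "gs://my_bucket/my_file.xml"
libxml_disable_entity_loader(false); // allow libxml to load external sources
$xml = new XMLReader();
$res = $xml->open($file); // true on success, false on failure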
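
Once open() succeeds, the one-node-at-a-time processing described in the question could look roughly like the sketch below. It is only an illustration, not code from the original posts: the helper name iterateNodes and the element name passed to it are hypothetical, and it assumes the repeated nodes are siblings that share one element name.

```php
<?php
/**
 * Yield the outer XML of each element named $nodeName, one at a time.
 * Hypothetical helper: the name and signature are assumptions made for
 * illustration and are not part of XMLReader or the App Engine SDK.
 */
function iterateNodes(XMLReader $reader, $nodeName)
{
    // Advance to the first element with the requested name.
    $found = false;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->name === $nodeName) {
            $found = true;
            break;
        }
    }

    // Emit the current node, then jump to the next element with the same
    // name, skipping over the subtree that was just emitted.
    while ($found) {
        yield $reader->readOuterXml();
        $found = $reader->next($nodeName);
    }
}
```

After a successful $xml->open($file), each node could then be handled with something like foreach (iterateNodes($xml, 'item') as $nodeXml) { ... }, where 'item' stands in for whatever the repeated element is actually called. Only the current node is serialized to a string at any time. Separately, because file_get_contents($path) already works, XMLReader can also read from a string via $reader->XML($xml), though that keeps the whole 30 MB string in memory, and I have not checked whether it is also affected by the entity loader setting on App Engine.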
