I have a 30 MB XML file that I'd like to process on Google App Engine (PHP). Because the file is so large, the suggested storage is Google Cloud Storage, so I've placed it there. Because of memory constraints I can't parse the whole file at once, but it contains 5000 nodes that are each reasonably sized, so I'm trying to use XMLReader to pull in one node at a time.
The process works perfectly locally, but on App Engine XMLReader keeps failing to read from Cloud Storage with the message "unable to open source data".
Here's an example of my code:
$path = "gs://my_bucket/my_file.xml";
require_once 'google/appengine/api/cloud_storage/CloudStorageTools.php';
use google\appengine\api\cloud_storage\CloudStorageTools;
$public_url = CloudStorageTools::getPublicUrl($path, true);
$reader = new XMLReader;
$reader->open( $path ); // fails
$reader->open( $public_url ); // fails
Both the "internal" and the public URL fail with the same error:
XMLReader::open(): Unable to open source data in /[gaepath]/myapp.php on line X
Some posts I've read suggest a permissions problem, but the file is not restricted, and the following does work:
$xml = file_get_contents($path); // $xml contains the file contents as a string
Two solutions would help me:
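I had this same problem. It looks like App Engine blocks libxml from loading external sources by default, and the function that controls this is one of its "Disabled Functions", so we need to enable it manually by creating a php.ini file in the root of our app.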
php.ini:
google_app_engine.enable_functions = "libxml_disable_entity_loader"
Then, in the code, we need to enable the entity loader before loading the file:
libxml_disable_entity_loader(false);
$xml = new XMLReader();
$res = $xml->open($file); // $file is the gs:// path, e.g. "gs://my_bucket/my_file.xml"; $res is false on failure
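Once open() succeeds, the node-at-a-time reading described in the question could look roughly like the sketch below. This is only an illustration, not tested on App Engine itself: the helper name forEachNode, the element name "item" and the handler callback are assumptions of mine, not anything from the original code. It moves to the first matching element, hands each one to the callback as a SimpleXMLElement, and then jumps to the next matching element, so only one node is expanded in memory at a time.

```php
<?php
/**
 * Stream an XML source and pass each element named $nodeName to $handler.
 * Hypothetical helper for illustration; returns the number of nodes handled.
 */
function forEachNode(string $uri, string $nodeName, callable $handler): int
{
    $reader = new XMLReader();
    if (!$reader->open($uri)) {
        throw new RuntimeException("Unable to open XML source: $uri");
    }

    // Advance to the first element with the requested name.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $nodeName) {
            break;
        }
    }

    $count = 0;
    // Handle one node, then skip its subtree and move to the next match.
    while ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $nodeName) {
        // Only the current node is expanded into a full object.
        $handler(new SimpleXMLElement($reader->readOuterXml()));
        $count++;
        $reader->next($nodeName);
    }

    $reader->close();
    return $count;
}
```

On App Engine you would still call libxml_disable_entity_loader(false) first, as above, and then something like forEachNode("gs://my_bucket/my_file.xml", "item", function ($node) { /* process $node */ }); with "item" replaced by whatever your repeated element is actually called.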