How to speed up the XML DTD validation with PHP?

I am validating my XML against a DTD file that I have locally.

For that, I am doing:

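$xml = $dmsMerrin.'/xml/'.$id.'/conversion.xml';
// Note: $dtd is not referenced below; validate() uses the DTD named in the document's DOCTYPE.
$dtd = $dmsMerrin.'/style_files/journalpublishing.dtd';

// Keep any HTML collected earlier, or start with an empty string.
$htmlDTDError = $htmlDTDError ?? '';

// Collect libxml errors internally before loading, so the @ operator is not needed.
$previousSetting = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->load($xml);

if ($dom->validate()) {
    $htmlDTDError .= "<h2>No Errors Found - The tested file is Valid !</h2>";
}
else {
    $errors = libxml_get_errors();
    $htmlDTDError .= '<h2>Errors Found ('.count($errors).')</h2><ol>';

    foreach ($errors as $error) {
        // Escape the message, since it can contain markup taken from the XML.
        $htmlDTDError .= '<li>'.htmlspecialchars(trim($error->message)).' on line '.$error->line.'</li>';
    }

    $htmlDTDError .= '</ol>';
}

// Empty the error buffer and restore the previous error handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previousSetting);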

And this takes about 30 seconds for an XML file with 1600 lines.

Is this normal? I would expect it to be much faster.

As you can see, the DTD I am using is stored locally on the server.

Any idea? Thank you.

EDIT: By debugging and checking the execution time, I noticed that it takes the same time whether my XML has 1600 lines or 150 lines, so the problem is not the XML size.

> And this takes about 30 seconds for an XML file with 1600 lines.

That's an unusually long time, and it's likely due to misconfiguration.

> By debugging and checking the execution time, I noticed that it takes the same time whether my XML has 1600 lines or 150 lines, so the problem is not the XML size.

For a tool that may provide more diagnostics here, try xmllint --valid. It will show, for example, errors for any DTDs that could not be retrieved.

It's very likely that the extra time is due to fetching resources, such as the DTD, needed to perform validation.
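One way to see which resource validation will ask for is to read the DOCTYPE identifiers from the parsed document. DOMDocument::validate() validates against the DTD the document itself declares, and the $dtd variable in the question is never passed to the DOM, so the local file may not be the one actually used. The following is a minimal sketch; the sample XML and its identifiers are made-up placeholders, not taken from the question.

```php
<?php
// Hypothetical sample document; the DOCTYPE identifiers are placeholders.
$xmlString = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE article PUBLIC "-//EXAMPLE//DTD Sample//EN" "http://example.invalid/sample.dtd">
<article/>
XML;

libxml_use_internal_errors(true);

$dom = new DOMDocument();
// LIBXML_NONET asks libxml not to touch the network while parsing.
$dom->loadXML($xmlString, LIBXML_NONET);

// The doctype node exposes the public and system identifiers declared in the XML.
$doctype  = $dom->doctype;
$publicId = $doctype !== null ? $doctype->publicId : '';
$systemId = $doctype !== null ? $doctype->systemId : '';

echo "Public ID: ", $publicId, PHP_EOL;
echo "System ID: ", $systemId, PHP_EOL;

libxml_clear_errors();
```

If the system identifier printed for your own file is an http:// or https:// URL rather than a local path, each validation run may be waiting on that remote fetch.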

For one of your files, confirm that the URL of the DTD can be retrieved quickly by testing with a tool like curl from the same server. Is it a complex DTD? Does it bring in other files? In particular, make sure that it never refers to resources that would have to be fetched from the web, or to hostnames where DNS resolves slowly.
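If the DOCTYPE does point at a remote URL, one possible workaround is to register a custom loader with libxml_set_external_entity_loader() so that requests are answered from a local copy, and to time validate() to check the effect. This is a sketch under assumptions: the temporary DTD and the sample XML below are stand-ins, and a real setup would map the requested identifiers to the actual local DTD and to any files it includes.

```php
<?php
// Stand-in for a local DTD copy; a real setup would point at the existing file.
$localDtd = tempnam(sys_get_temp_dir(), 'dtd');
file_put_contents($localDtd, "<!ELEMENT article (#PCDATA)>\n");

// Hypothetical document whose DOCTYPE names a remote (unreachable) URL.
$xmlString = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE article SYSTEM "http://example.invalid/sample.dtd">
<article>text</article>
XML;

libxml_use_internal_errors(true);

// Answer every external entity request with the local file instead of the declared URL.
// A fuller version would inspect $publicId / $systemId and choose the file accordingly.
libxml_set_external_entity_loader(
    function ($publicId, $systemId, $context) use ($localDtd) {
        return $localDtd;
    }
);

$dom = new DOMDocument();
$dom->loadXML($xmlString);

// Time only the validation step, which is where the DTD gets loaded.
$start   = microtime(true);
$isValid = $dom->validate();
$elapsed = microtime(true) - $start;

// Restore the default loader and clean up.
libxml_set_external_entity_loader(null);
libxml_clear_errors();
unlink($localDtd);

echo $isValid ? "valid" : "invalid", PHP_EOL;
printf("validate() took %.3f seconds%s", $elapsed, PHP_EOL);
```

Comparing the measured time with and without the custom loader should show whether remote fetching accounts for the delay.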
