In DOMDocument, is reusing a DOMXpath object stable?
I am using the function below, but I am not sure whether it is always stable/secure. Is it?
When, and for whom, is it stable/secure to "reuse parts of the DOMXpath preparation procedure"?
To simplify the use of the XPath query() method, we can adopt a function that memoizes the last call with static variables:
function DOMXpath_reuser($file) {
    static $doc = NULL;
    static $docName = '';
    static $xp = NULL;
    if (!$doc)
        $doc = new DOMDocument();
    if ($file != $docName) {
        $doc->loadHTMLFile($file);
        $docName = $file; // remember the loaded file, otherwise every call reloads it
        $xp = NULL;
    }
    if (!$xp)
        $xp = new DOMXpath($doc);
    return $xp; // ??RETURNED VALUES ARE ALWAYS STABLE??
}
The present question is similar to another one about XSLTProcessor reuse. In both questions the problem can be generalized to any language or framework that uses LibXML2 as its DOMDocument implementation.
There is another related question: How to "refresh" DOMDocument instances of LibXML2?
The reuse is very common (examples):
$f = "my_XML_file.xml";
$elements = DOMXpath_reuser($f)->query("//*[@id]");
// use elements to get information
$elements = DOMXpath_reuser($f)->query("/html/body/div[1]");
// use elements to get information
But if you do something like removeChild, replaceChild, etc. (example),
$div = DOMXpath_reuser($f)->query("/html/body/div[1]")->item(0); //STABLE
$div->parentNode->removeChild($div); // CHANGES DOM
$elements = DOMXpath_reuser($f)->query("//div[@id]"); // UNSTABLE!
strange things can occur, and the queries do not work as expected!
DOMXpath is affected by the load*() methods on DOMDocument. After loading new XML or HTML, you need to recreate the DOMXpath instance:
$xml = '<xml/>';
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
var_dump($xpath->document === $dom); // bool(true)
$dom->loadXml($xml);
var_dump($xpath->document === $dom); // bool(false)
In DOMXpath_reuser() you store a static variable and recreate the XPath object depending on the file name. If you want to reuse an XPath object, I suggest extending DOMDocument. This way you only need to pass the $dom variable around. It works with a stored XML file as well as with an XML string or a document you are creating.
The following class extends DOMDocument with a method xpath() that always returns a valid DOMXpath instance for it. It stores and registers the namespaces, too:
class MyDOMDocument
extends DOMDocument {
private $_xpath = NULL;
private $_namespaces = array();
public function xpath() {
// if the xpath instance is missing or not attached to the document
if (is_null($this->_xpath) || $this->_xpath->document !== $this) {
// create a new one
$this->_xpath = new DOMXpath($this);
// and register the namespaces for it
foreach ($this->_namespaces as $prefix => $namespace) {
$this->_xpath->registerNamespace($prefix, $namespace);
}
}
return $this->_xpath;
}
public function registerNamespaces(array $namespaces) {
$this->_namespaces = array_merge($this->_namespaces, $namespaces);
if (isset($this->_xpath)) {
foreach ($namespaces as $prefix => $namespace) {
$this->_xpath->registerNamespace($prefix, $namespace);
}
}
}
}
$xml = <<<'ATOM'
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Test</title>
</feed>
ATOM;
$dom = new MyDOMDocument();
$dom->registerNamespaces(
array(
'atom' => 'http://www.w3.org/2005/Atom'
)
);
$dom->loadXml($xml);
// created, first access
var_dump($dom->xpath()->evaluate('string(/atom:feed/atom:title)', NULL, FALSE));
$dom->loadXml($xml);
// recreated, connection was lost
var_dump($dom->xpath()->evaluate('string(/atom:feed/atom:title)', NULL, FALSE));
The DOMXpath class (unlike XSLTProcessor in your other question) uses a reference to the DOMDocument object given in the constructor. DOMXpath creates a libxml context object based on the given DOMDocument and saves it in its internal class data. Besides the libxml context, it saves a reference to the original DOMDocument given in the constructor arguments.
What that means:
Part of the sample from ThomasWeinert's answer:
var_dump($xpath->document === $dom); // bool(true)
$dom->loadXml($xml);
var_dump($xpath->document === $dom); // bool(false)
gives false after the load, because $dom already holds a pointer to the new libxml data, but DOMXpath holds the libxml context for $dom as it was before the load, and a pointer to the real document after the load.
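The check from that sample can be turned into a guard. The following is only a minimal sketch, assuming two inline XML strings that are not part of the original post: it detects that the DOMXpath no longer belongs to $dom and recreates it before querying.

```php
<?php
// Hypothetical inline documents, used only for illustration.
$dom = new DOMDocument();
$dom->loadXML('<root><item>old</item></root>');
$xpath = new DOMXPath($dom);

// Loading again swaps the libxml document behind $dom.
$dom->loadXML('<root><item>new</item></root>');

// Same identity check as in the sample above: if the DOMXPath no
// longer points at $dom, throw it away and build a new one.
if ($xpath->document !== $dom) {
    $xpath = new DOMXPath($dom);
}

// The query now runs against the document that $dom currently holds.
$value = $xpath->evaluate('string(/root/item)');
echo $value, PHP_EOL;
```

Recreating the object is cheap, so the guard costs little compared with querying a stale context.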
Now about how query works.
If it should return XPATH_NODESET (as in your case), it makes a node copy, iterating node by node through the detected node set (\ext\dom\xpath.c, from line 468). It is a copy, but with the original document node as parent. This means that you can modify the result, but doing so breaks the connection between your XPath and the DOMDocument.
XPath results provide a parentNode member that knows their origin.
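As a small illustration of that last point (a sketch with a made-up inline document, not code from the original answer), a node returned by query() can be traced back to its parent element and to the document it came from:

```php
<?php
// Hypothetical inline document, used only for illustration.
$dom = new DOMDocument();
$dom->loadXML('<root><item id="a"/></root>');
$xpath = new DOMXPath($dom);

// Take the first node of the result set.
$node = $xpath->query('//item')->item(0);

// The result node still knows where it lives in the tree.
$parentName = $node->parentNode->nodeName;
$sameDocument = ($node->ownerDocument === $dom);

var_dump($parentName);
var_dump($sameDocument);
```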
So,
There is no reason to cache XPath. It does nothing besides xmlXPathNewContext (it just allocates a lightweight internal struct).
Each time you modify the DOMDocument (removeChild, replaceChild, etc.) you should recreate the XPath.
We cannot use something like normalizeDocument to "refresh the DOM", because it changes the internal document structure and invalidates the xmlXPathNewContext created in the Xpath constructor.
Yes, if you did not change $doc between Xpath usages. Need to reload $doc as well: no, because that invalidates the previously created xmlXPathNewContext.
(This is not a real answer, but a consolidation of the comments and answers posted here and in related questions.)
This new version of the question's DOMXpath_reuser function incorporates the @ThomasWeinert suggestion (to guard against DOM changes caused by an external re-load) and an option $enforceRefresh to work around the instability problem (as the related question shows, the programmer must detect when).
function DOMXpath_reuser_v2($file, $enforceRefresh=0) { //changed here
    static $doc = NULL;
    static $docName = '';
    static $xp = NULL;
    if (!$doc)
        $doc = new DOMDocument();
    if ( $file != $docName || ($xp && $doc !== $xp->document) ) { // changed here
        $doc->load($file);
        $docName = $file; // remember the loaded file, otherwise every call reloads it
        $xp = NULL;
    } elseif ($enforceRefresh == 2) { // add this new refresh mode
        $doc->loadXML($doc->saveXML());
        $xp = NULL;
    }
    if (!$xp || $enforceRefresh == 1) //changed here
        $xp = new DOMXpath($doc);
    return $xp;
}
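For comparison, the advice collected above (recreate the XPath object after every DOM modification instead of caching it) can be sketched without any static state. The inline document and variable names here are assumptions made for the example, not part of the original posts.

```php
<?php
// Hypothetical inline document, used only for illustration.
$dom = new DOMDocument();
$dom->loadXML('<root><div id="a"/><div id="b"/></root>');

// First query: locate the node to remove.
$xpath = new DOMXPath($dom);
$first = $xpath->query('/root/div[1]')->item(0);

// Modify the DOM.
$first->parentNode->removeChild($first);

// Build a fresh DOMXPath after the modification instead of reusing the old one.
$xpath = new DOMXPath($dom);
$remaining = $xpath->query('//div[@id]');

$count = $remaining->length;
$id = $remaining->item(0)->getAttribute('id');
echo $count, ' ', $id, PHP_EOL;
```

This trades a small allocation per modification for not having to reason about when a cached context has gone stale.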
... perhaps an open problem, only small tips and clues...