I need to index some XML documents which make use of namespaces, such as:
<pm:Kroot>
  <pm:root>
    <pm:meta>
      <dc:id xmlns:dc="http://purl.org/dc/elements/1.1/">1</dc:id>
      <dc:source>
        <dc:source>
          <pm:link pm:description="Tele" pm:source="8326"/>
        </dc:source>
      </dc:source>
    </pm:meta>
  </pm:root>
</pm:Kroot>
Now, when I use the DataImport configuration below, Solr manages to get the ID but fails to index the attribute values:
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="article"
            url="/sample.xml"
            processor="XPathEntityProcessor"
            stream="true"
            forEach="/Kroot/root">
      <field column="id" xpath="/Kroot/root/meta/id" />
      <field column="news_id" xpath="/Kroot/root/meta/source/source/link/@source" />
      <field column="news_name" xpath="/Kroot/root/meta/source/source/link/@description" />
    </entity>
  </document>
</dataConfig>
If I remove the namespace prefixes from the attributes in the XML file, Solr indexes all the data. I am searching for a solution and can't find an explanation for this behavior. The Solr wiki says that when namespaces are involved we should use only the attribute name without the namespace prefix, which is what I do. I am using Solr 4.1, by the way.
You can try, for example, /Kroot/root/meta/source/source/link[@description]
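Another option, building on the observation that the import works once the attribute prefixes are gone, is to strip the prefixes in a preprocessing step instead of by hand. The sketch below is a plain XSLT 1.0 stylesheet that rebuilds every element and attribute under its local name. The file name strip-ns.xsl and its path are hypothetical. It also assumes the real input declares its pm and dc prefixes (the excerpt in the question does not show all the declarations), because an XSLT processor rejects undeclared prefixes. As far as I know, XPathEntityProcessor accepts an xsl attribute that applies a stylesheet before parsing, but whether that combines well with stream="true" on Solr 4.1 is something to verify on your own setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- strip-ns.xsl (hypothetical file name): drops namespace prefixes from
     elements and attributes so that pm:description becomes description.
     Assumed usage on the entity: xsl="/path/to/strip-ns.xsl" -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Recreate each element under its local name, in no namespace. -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Recreate each attribute under its local name, keeping its value. -->
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>

  <!-- Text nodes are copied through by the built-in XSLT template rules. -->
</xsl:stylesheet>
```

One caveat: if an element carried two attributes with the same local name under different prefixes, they would collide after this transform. The sample document in the question has no such case.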