Java: navigating an XML file whose child elements have the same tag name as the parent
The problem I have is that I have to work with an XML file that one of my company's suppliers sent me.
This would not be a problem if the XML were well constructed, but it is not at all.
<catalog>
<product>
<ref>4780</ref>
.
.
.
<arrivals>
<product>
<image title="AMARILLO">AMA</image>
<size>S/T </size>
</product>
<product>
<image title="AZUL">AZUL</image>
<size>S/T </size>
</product>
</arrivals>
</product>
</catalog>
As you can see, the outer <product> tag holds all the information about the product, but it also contains further tags named <product>, which distinguish the different colors.
This is the code I use to navigate the XML:
doc = db.parse("filename.xml");
Element esproducte = (Element)doc.getElementsByTagName("product").item(0);
NodeList nArrv = esproducte.getElementsByTagName("arrivals");
Element eArrv = (Element) nArrv.item(0);
NodeList eProds = eArrv.getElementsByTagName("product");//THIS THING
for(int l=0; l<eProds.getLength(); l++)
{
Node ln = eProds.item(l);
if (ln.getNodeType() == Node.ELEMENT_NODE)
{
Element le = (Element) ln;
//COLORS / IMAGES / CONFIGS
NodeList nimgcol = le.getElementsByTagName("image");
Element eimgcol = (Element) nimgcol.item(0);
System.out.println("Name of the color " + eimgcol.getTextContent());
}
}
What happens is that the print statement is repeated more times than it should be, and I think it's because of the parent <product>. I thought this shouldn't happen, because at the line marked //THIS THING I take into account the fact that the <product> elements sit inside <arrivals>. But it is not working.
What should I change in the code so that the for loop iterates only 2 times and not 3, which is what happens in this case?
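Solution:
Change
NodeList eProds = eArrv.getElementsByTagName("product");//THIS THING
to
NodeList eProds = eArrv.getChildNodes();//THIS THING
and leave the rest exactly the same. The existing getNodeType() == Node.ELEMENT_NODE check skips the whitespace text nodes that getChildNodes() also returns. Works perfectly.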
It is perfectly valid to have tags inside different parent elements that are named the same but have different content/meaning, as is the case in your example.
An element whose path is /catalog/product is entirely different from an element whose path is /catalog/product/arrivals/product. As an example, both XPath and XML Schema will consider them distinct.
It is only lazily written code that cannot tell the difference, e.g. by using getElementsByTagName, which locates elements anywhere ("all descendants") regardless of their location (path).
When processing the DOM tree, do it in a structured fashion:
- Get the root element (catalog).
- Iterate all child elements (not all descendants) of the root element.
  - Fail if not named product.
  - Process the product element:
    - Iterate all child elements of the product element.
    - Expect elements named ref, arrivals.
    - Process arrivals:
      - Iterate all child elements of the arrivals element.
      - Fail if not named product.
      - Process the product element:
        - Expect elements named image, size.
As you can see, the place in your code that handles an element named product inside an element named catalog is different from the code that handles an element named product inside an element named arrivals.
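The structured traversal described above could be sketched roughly as follows. This is only an illustration, not code from the original answer: the class name StructuredWalk and the helpers childElements, colors and parse are hypothetical names, and the sketch assumes that only the image text is of interest and that ref and any other fields can be skipped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class StructuredWalk {

    // Hypothetical helper: direct child elements only, skipping text and comment nodes.
    public static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Collects the text of every /catalog/product/arrivals/product/image element.
    public static List<String> colors(Document doc) {
        List<String> colors = new ArrayList<>();
        Element catalog = doc.getDocumentElement();
        for (Element product : childElements(catalog)) {
            if (!"product".equals(product.getTagName())) {
                throw new IllegalStateException("Unexpected element in catalog: " + product.getTagName());
            }
            for (Element field : childElements(product)) {
                if (!"arrivals".equals(field.getTagName())) {
                    continue; // ref and any other fields are ignored in this sketch
                }
                for (Element variant : childElements(field)) {
                    if (!"product".equals(variant.getTagName())) {
                        throw new IllegalStateException("Unexpected element in arrivals: " + variant.getTagName());
                    }
                    for (Element detail : childElements(variant)) {
                        if ("image".equals(detail.getTagName())) {
                            colors.add(detail.getTextContent());
                        }
                    }
                }
            }
        }
        return colors;
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><product><ref>4780</ref><arrivals>"
                + "<product><image title=\"AMARILLO\">AMA</image><size>S/T </size></product>"
                + "<product><image title=\"AZUL\">AZUL</image><size>S/T </size></product>"
                + "</arrivals></product></catalog>";
        System.out.println(colors(parse(xml))); // prints [AMA, AZUL]
    }
}
```

Because each loop looks only at direct children, the outer product and the color product elements are handled by different pieces of code and can never be confused with each other.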
getElementsByTagName gives you all tags named "product" that are inside that tag, including the "product" tags for colors. Try using getChildNodes and checking the names of the nodes instead.
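To make the difference concrete, here is a small sketch that counts product elements both ways on a condensed version of the catalog from the question. The class name ChildVsDescendant and the helpers countDirectChildren and parse are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildVsDescendant {

    // Hypothetical helper: counts direct child elements of parent with the given tag name.
    public static int countDirectChildren(Element parent, String tagName) {
        int count = 0;
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            // getChildNodes() also returns text nodes, so check the type before the name
            if (n.getNodeType() == Node.ELEMENT_NODE && tagName.equals(n.getNodeName())) {
                count++;
            }
        }
        return count;
    }

    // Hypothetical helper: parses an XML string into a DOM document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><product><ref>4780</ref><arrivals>"
                + "<product><image title=\"AMARILLO\">AMA</image></product>"
                + "<product><image title=\"AZUL\">AZUL</image></product>"
                + "</arrivals></product></catalog>";
        Element catalog = parse(xml).getDocumentElement();

        // All descendants named product: the outer one plus the two color entries.
        System.out.println(catalog.getElementsByTagName("product").getLength()); // prints 3

        // Direct children of catalog named product: only the outer one.
        System.out.println(countDirectChildren(catalog, "product")); // prints 1
    }
}
```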
As Andreas mentioned, there is nothing invalid about the document. The problem is getElementsByTagName, which returns every descendant of the node it is called on that has that tag name (the entire document when called on the Document), regardless of structure.
You can use XPath to simplify the traversal of specific elements.
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.*;
import java.io.IOException;
import java.io.StringReader;
public class XMLParsing {
public static void main(String[] args) throws ParserConfigurationException, IOException, SAXException, XPathExpressionException {
String xml = "<catalog>\n" +
" <product>\n" +
" <ref>4780</ref>\n" +
" .\n" +
" .\n" +
" .\n" +
" <arrivals>\n" +
" <product>\n" +
" <image title=\"AMARILLO\">AMA</image>\n" +
" <size>S/T </size>\n" +
" </product>\n" +
" <product>\n" +
" <image title=\"AZUL\">AZUL</image>\n" +
" <size>S/T </size>\n" +
" </product>\n" +
" </arrivals>\n" +
" </product>\n" +
"</catalog>\n";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
// get the products that are direct children of "arrivals" (a single slash, so deeper descendants are not matched)
XPathExpression expression = xPath.compile("/catalog/product/arrivals/product");
NodeList nodes = (NodeList) expression.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
Node product = nodes.item(i);
NodeList productChildren = product.getChildNodes();
for (int j = 0; j < productChildren.getLength(); j++) {
Node item = productChildren.item(j);
if (item instanceof Element) {
Element element = (Element) item;
switch (element.getTagName()) {
case "image":
System.out.println("product image title : " + element.getAttribute("title"));
break;
case "size":
System.out.println("product size : " + element.getTextContent());
break;
default:
break;
}
}
}
}
}
}