
Parsing inner tags of an XML file

I need to parse an XML file to get the values of the tags it contains. I have done part of it and got stuck midway. My XML file is as follows (sample XML file):

<?xml version="1.0" encoding="UTF-8"?>
<database>
<student name="abc">
<phone>6879987</phone>
<dept>eee</dept>
<college>act</college>
<semester>
<year>2</year>
<no.of.sub>7</no.of.sub>
</semester>
<hostel>
<year>3</year>
<block>d4</block>
</hostel>
</student>
<student name="ram">
<phone>65464</phone>
<dept>cse</dept>
<college>Mit</college>
<semester>
<year>4</year>
<no.of.sub>5</no.of.sub>
</semester>
<hostel>
<year>5</year>
<block>y4</block>
</hostel>
</student> 
</database>

My implementation is as follows:

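   import java.io.IOException;
   import java.util.ArrayList;
   import java.util.Iterator;
   import javax.xml.parsers.ParserConfigurationException;
   import javax.xml.parsers.SAXParser;
   import javax.xml.parsers.SAXParserFactory;
   import org.xml.sax.Attributes;
   import org.xml.sax.SAXException;
   import org.xml.sax.helpers.DefaultHandler;

   public class MySaxParser extends DefaultHandler  {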
   private String temp;
   private ArrayList<Account> accList = new ArrayList<Account>();
   private Account acct;


   public static void main(String[] args) throws IOException, SAXException,
                 ParserConfigurationException {


          SAXParserFactory spfac = SAXParserFactory.newInstance();


          SAXParser sp = spfac.newSAXParser();


          MySaxParser handler = new MySaxParser();


          sp.parse("test.xml", handler);

          handler.readList();

   }



   @Override
   public void characters(char[] buffer, int start, int length) {
          temp = new String(buffer, start, length);
   }



   @Override
   public void startElement(String uri, String localName,
                 String qName, Attributes attributes) throws SAXException {
          temp = "";
          if (qName.equalsIgnoreCase("Student")) {
                 acct = new Account();
                 //acct.setType(attributes.getValue("type"));

          }
   }

   @Override
   public void endElement(String uri, String localName, String qName)
                 throws SAXException {

          if (qName.equalsIgnoreCase("Student")) {
                 // add it to the list
                 accList.add(acct);

          } else if (qName.equalsIgnoreCase("phone")) {
                 acct.setphone(temp);
          } else if (qName.equalsIgnoreCase("dept")) {
                 acct.setdept((temp));
          } else if (qName.equalsIgnoreCase("College")) {
                 acct.setcollege(temp);

          }
   }



   private void readList() {

          Iterator<Account> it = accList.iterator();
          while (it.hasNext()) {
                 System.out.println(it.next().toString());
          }
   }

 }

I am able to parse the values of phone, dept and college. But the year tag is a child of both semester and hostel, and I need both year values. When I simply use:

    else if (qName.equalsIgnoreCase("year")) {
    acct.setyear(temp); 

only the year value of hostel is printed, and the one for semester is skipped. How can I parse these sub-tags? Thanks in advance.

Yes, that's because the hostel tag comes after the semester tag, so the later value overwrites the earlier one and you always end up with the most recent year. If you want both, use boolean flags such as isSemester and isHostel and set them as you enter the corresponding tags. Then, when you encounter year, check the flags to see which element the current year belongs to.

You need a way to "remember" whether you have entered a hostel tag or not. The simplest way is to do the following inside startElement:

 if (qName.equalsIgnoreCase("hostel")) {
     this.insideHostel = true;
 }

And in endElement:

 if (qName.equalsIgnoreCase("hostel")) {
     this.insideHostel = false;
 }

Now you can do:

 } else if (qName.equalsIgnoreCase("year") && insideHostel) {
     acct.setyear(temp); 
 }
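Putting the two answers together, the flag approach might look like the sketch below. It is not the asker's original class: YearHandler, YearDemo, the results list and the fields semesterYear and hostelYear are hypothetical names introduced for illustration, and the XML is a trimmed copy of the sample above. The sketch also appends in characters() instead of assigning, because a SAX parser may deliver the text of one element in several calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not from the original post): records both year values per student.
class YearHandler extends DefaultHandler {
    final List<String> results = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean insideSemester;
    private boolean insideHostel;
    private String name;
    private String semesterYear;
    private String hostelYear;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for the newly opened element
        if (qName.equalsIgnoreCase("student")) {
            name = attributes.getValue("name");
            semesterYear = null;
            hostelYear = null;
        } else if (qName.equalsIgnoreCase("semester")) {
            insideSemester = true;
        } else if (qName.equalsIgnoreCase("hostel")) {
            insideHostel = true;
        }
    }

    @Override
    public void characters(char[] buffer, int start, int length) {
        text.append(buffer, start, length); // append: text may arrive in several chunks
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equalsIgnoreCase("year")) {
            String value = text.toString().trim();
            // The flags tell us which parent this <year> belongs to.
            if (insideSemester) {
                semesterYear = value;
            } else if (insideHostel) {
                hostelYear = value;
            }
        } else if (qName.equalsIgnoreCase("semester")) {
            insideSemester = false;
        } else if (qName.equalsIgnoreCase("hostel")) {
            insideHostel = false;
        } else if (qName.equalsIgnoreCase("student")) {
            results.add(name + ": semester=" + semesterYear + ", hostel=" + hostelYear);
        }
    }
}

class YearDemo {
    static List<String> parse(String xml) throws Exception {
        YearHandler handler = new YearHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.results;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<database>"
                + "<student name=\"abc\"><semester><year>2</year></semester>"
                + "<hostel><year>3</year><block>d4</block></hostel></student>"
                + "<student name=\"ram\"><semester><year>4</year></semester>"
                + "<hostel><year>5</year><block>y4</block></hostel></student>"
                + "</database>";
        for (String line : parse(xml)) {
            System.out.println(line);
        }
    }
}
```

For the trimmed sample this should print "abc: semester=2, hostel=3" followed by "ram: semester=4, hostel=5". In the asker's class the two values would presumably go into separate setters on Account instead of a string.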

Sample XML file:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Issue>
    <Primary>
    <Section>      
             String a=null;
             if(a==null) hi;);
             else bye;;
              }
    </Section>
    </Primary>
    <Source>
    <Section>
              String a="sbi";
              if(a==null) hi;);
               else bye;;
    </Section>
    </Source>
    </Issue>

I need to get all the characters within the Section tag. My implementation is as follows:

     public void endElement(String uri, String localName, String qName)
             throws SAXException {

         if (qName.equalsIgnoreCase("Issue")) {
             accList.add(acct);
         }
         if (qName.equalsIgnoreCase("Primary")) {
             this.isPrimary = false;
         }
         if (qName.equalsIgnoreCase("Source")) {
             this.isSource = false;
         } else if (qName.equalsIgnoreCase("Section") && isPrimary) {
             acct.setPrimarySnippet(temp);
         } else if (qName.equalsIgnoreCase("Section") && isSource) {
             acct.setSourceSnippet(temp);
         }
     }

The output is: else bye;; }
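A likely cause, though the thread does not confirm it, is that characters() assigns to temp on every call. The SAX API allows a parser to deliver the text of a single element in several characters() calls, so with assignment only the last chunk survives. A minimal sketch of accumulating the chunks instead is shown here. SectionHandler, SectionDemo and the two snippet fields are hypothetical names, and setting isPrimary and isSource in startElement is an assumption, since the post does not show that part.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not from the original post): keeps the full text of each Section.
class SectionHandler extends DefaultHandler {
    String primarySnippet;
    String sourceSnippet;
    private final StringBuilder text = new StringBuilder();
    private boolean isPrimary;
    private boolean isSource;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equalsIgnoreCase("Primary")) {
            isPrimary = true;   // assumption: the flags are raised here
        } else if (qName.equalsIgnoreCase("Source")) {
            isSource = true;
        } else if (qName.equalsIgnoreCase("Section")) {
            text.setLength(0);  // begin a fresh buffer for this Section
        }
    }

    @Override
    public void characters(char[] buffer, int start, int length) {
        text.append(buffer, start, length); // keep every chunk, not just the last one
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equalsIgnoreCase("Section")) {
            String snippet = text.toString().trim();
            if (isPrimary) {
                primarySnippet = snippet;
            } else if (isSource) {
                sourceSnippet = snippet;
            }
        } else if (qName.equalsIgnoreCase("Primary")) {
            isPrimary = false;
        } else if (qName.equalsIgnoreCase("Source")) {
            isSource = false;
        }
    }
}

class SectionDemo {
    static SectionHandler parse(String xml) throws Exception {
        SectionHandler handler = new SectionHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Issue><Primary><Section>\n"
                + "  String a=null;\n"
                + "  else bye;;\n"
                + "</Section></Primary>"
                + "<Source><Section>String a=\"sbi\";</Section></Source></Issue>";
        SectionHandler handler = parse(xml);
        System.out.println(handler.primarySnippet);
        System.out.println(handler.sourceSnippet);
    }
}
```

The buffer is reset when a Section opens and read when it closes, so each snippet holds everything between the two tags regardless of how the parser splits the text.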


 