How to read XML content as String from a file in Java

I receive a text file from my source with the following content on a single line.

<employees><employee><id>101</id><name>Lokesh Gupta</name><title>Author</title></employee><employee><id>102</id><name>Brian Lara</name><title>Cricketer</title></employee></employees>

In my code, I have to read each employee's data as a String. For example, <employee><id>101</id><name>Lokesh Gupta</name><title>Author</title></employee> should be one string and <employee><id>102</id><name>Brian Lara</name><title>Cricketer</title></employee> another. When I print the first string to the console, it has to print <employee><id>101</id><name>Lokesh Gupta</name><title>Author</title></employee>. Could you please let me know how to do this?

The file I get from my source usually contains details for more than 100 million employees on a single line. I have to read each employee's details as an individual String and store it in another file. Since the file is huge, I tried a SAX parser. With it I can parse the XML content, but I cannot get each element's full markup back as a string.

I tried using a SAX parser with a DefaultHandler to read this content. But in the startElement and endElement methods I have to write my own logic to append < and >, and </ and >, respectively. Is there a better way of reading this than writing logic to append the angle brackets myself?

One way to do this is to use the streaming feature of JAXB, which effectively uses SAX underneath. Here is an example:

import java.io.File;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.XMLReader;

// classes from the sample's "primer" package
import primer.PurchaseOrderType;
import primer.PurchaseOrders;

public class Main {
    public static void main(String[] args) throws Exception {

        // create JAXBContext for the primer.xsd
        JAXBContext context = JAXBContext.newInstance("primer");

        Unmarshaller unmarshaller = context.createUnmarshaller();

        // purchase order notification callback
        final PurchaseOrders.Listener orderListener = new PurchaseOrders.Listener() {
            public void handlePurchaseOrder(PurchaseOrders purchaseOrders, PurchaseOrderType purchaseOrder) {
                System.out.println("this order will be shipped to "
                        + purchaseOrder.getShipTo().getName());
            }
        };

        // install the callback on all PurchaseOrders instances
        unmarshaller.setListener(new Unmarshaller.Listener() {
            public void beforeUnmarshal(Object target, Object parent) {
                if (target instanceof PurchaseOrders) {
                    ((PurchaseOrders) target).setPurchaseOrderListener(orderListener);
                }
            }

            public void afterUnmarshal(Object target, Object parent) {
                if (target instanceof PurchaseOrders) {
                    ((PurchaseOrders) target).setPurchaseOrderListener(null);
                }
            }
        });

        // create a new XML parser and feed its SAX events to the unmarshaller
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(unmarshaller.getUnmarshallerHandler());

        for (String arg : args) {
            // parse all the documents specified via the command line.
            // note that XMLReader expects a URL, not a file name,
            // so we need a conversion.
            reader.parse(new File(arg).toURI().toString());
        }
    }
}

This is taken directly from the JAXB RI samples: https://github.com/javaee/jaxb-v2/blob/master/jaxb-ri/samples/src/main/samples/streaming-unmarshalling/src/Main.java

The PurchaseOrders.Listener interface is:

    public static interface Listener {
        void handlePurchaseOrder(PurchaseOrders purchaseOrders, PurchaseOrderType purchaseOrder);
    }
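
For the single-line employees file from the question, another route that needs neither JAXB classes nor manual bracket handling is StAX combined with the JDK's identity Transformer. This is not the method shown in the sample above, only a minimal sketch of an alternative. The class name EmployeeSplitter and the method forEachEmployee are made up for illustration, and the element name employee is taken from the question's sample data. It assumes the JDK's built-in StAX and XSLT implementations.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.function.Consumer;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper: streams through the input and hands each
// <employee> element to a callback as a serialized String.
class EmployeeSplitter {

    static void forEachEmployee(Reader in, Consumer<String> sink) throws Exception {
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(in);

        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        while (xsr.hasNext()) {
            boolean atEmployee = xsr.getEventType() == XMLStreamConstants.START_ELEMENT
                    && "employee".equals(xsr.getLocalName());
            if (atEmployee) {
                // Serializes the current element and its children; the
                // transform itself advances the reader past the element.
                StringWriter out = new StringWriter();
                identity.transform(new StAXSource(xsr), new StreamResult(out));
                sink.accept(out.toString());
            } else {
                xsr.next();
            }
        }
        xsr.close();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees>"
                + "<employee><id>101</id><name>Lokesh Gupta</name><title>Author</title></employee>"
                + "<employee><id>102</id><name>Brian Lara</name><title>Cricketer</title></employee>"
                + "</employees>";
        // Prints one employee element per line.
        forEachEmployee(new StringReader(xml), System.out::println);
    }
}
```

The loop inspects the reader's current event instead of calling nextTag() after each transform. When there is no whitespace between elements, as in the question's file, the reader may already be positioned on the next start tag once a transform returns, and an extra nextTag() would skip it. For a very large input, passing a FileReader and a callback that writes each string to a BufferedWriter keeps only one employee in memory at a time.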
