Reading a huge text file and appending using StringBuilder in Java
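I have a huge XML file (3-4 GB, about 360,000 records). I have to read every line and append each one to a StringBuilder; once read, the content is processed further. However, the StringBuilder exceeds its buffer size limit, so the content cannot be held in memory. How can I split the records and stop before the buffer size is exceeded? Please advise.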
try {
    File file = new File("test.txt");
    FileReader fileReader = new FileReader(file);
    BufferedReader bufferedReader = new BufferedReader(fileReader);
    StringBuilder stringBuilder = new StringBuilder();
    String line;
    int count = 0;
    while ((line = bufferedReader.readLine()) != null)
    {
        if (line.startsWith("<customer>")) {
            stringBuilder.append(line);
        }
        count++;
    }
    // closing the BufferedReader also closes the wrapped FileReader
    bufferedReader.close();
    System.out.println(stringBuilder.toString());
} catch (IOException e) {
    e.printStackTrace();
}
Edit: the asker's attempt with StAX
while (xmlEventReader.hasNext()) {
    XMLEvent xmlEvent = null;
    try {
        xmlEvent = xmlEventReader.nextEvent();
    } catch (Exception e) {
        e.printStackTrace();
        break; // stop here instead of dereferencing a null event below
    }
    if (xmlEvent.isStartElement()) {
        StartElement elem = (StartElement) xmlEvent;
        // getLocalPart() returns the bare element name, without angle brackets
        if (elem.getName().getLocalPart().equals("Customer")) {
            if (customerRecord) {
                insideChildRecord = true;
            }
            customerRecord = true;
        }
    }
    if (customerRecord) {
        xmlEventWriter.add(xmlEvent);
    }
    if (xmlEvent.isEndElement()) {
        EndElement elem = (EndElement) xmlEvent;
        if (elem.getName().getLocalPart().equals("Customer")) {
            if (insideChildRecord) {
                insideChildRecord = false;
            } else {
                customerRecord = false;
                xmlEventWriter.flush();
                String cmlChunk = stringWriter.toString();
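It looks like you are parsing an XML file (since I see you checking for "<customer>").
For that, it is better to use a parsing library than low-level streams. Because the file is large, I suggest using SAX or StAX: https://docs.oracle.com/javase/tutorial/jaxp/stax/index.html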
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
while (xmlEventReader.hasNext()) {
    XMLEvent xmlEvent = xmlEventReader.nextEvent();
    // parse the XML events one by one
}
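SAX, the other option mentioned above, could look roughly like the following. This is only a minimal sketch and not part of the original answer: the class name `SaxCustomerNames`, the method `forEachName`, and the assumption that each customer has a nested `<name>` element are all made up for illustration. The parser pushes events to a handler, and each name is handed to a callback as soon as its closing tag is seen, so nothing accumulates in memory.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams every <name> value to a callback via SAX.
class SaxCustomerNames {

    /** Passes the text of each <name> element to sink; returns how many were seen. */
    static int forEachName(InputStream in, Consumer<String> sink) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inName;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // The default factory is not namespace-aware, so qName holds the tag name.
                if ("name".equals(qName)) {
                    inName = true;
                    text.setLength(0); // reuse one small buffer per element
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several pieces, so append rather than assign.
                if (inName) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    inName = false;
                    sink.accept(text.toString()); // process immediately
                    count[0]++;
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(in, handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream(args[0])) {
            int n = forEachName(in, System.out::println);
            System.out.println(n + " names read");
        }
    }
}
```

The only state kept between events is one small `StringBuilder`, which is reset for every element, so memory use should not grow with the size of the file.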
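Since you cannot store the data in memory, you have to do all of the "further processing" immediately, as the XML events arrive.
Maybe this will make it clearer how to use StAX: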
import java.io.FileInputStream;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream("huge-file.xml"));
// this variable is re-used to store the current customer
Customer customer = null;
while (xmlEventReader.hasNext()) {
    XMLEvent xmlEvent = xmlEventReader.nextEvent();
    if (xmlEvent.isStartElement()) {
        StartElement startElement = xmlEvent.asStartElement();
        if (startElement.getName().getLocalPart().equalsIgnoreCase("customer")) {
            // start populating a new customer
            customer = new Customer();
            // read an attribute for example <customer number="42">
            Attribute attribute = startElement.getAttributeByName(new QName("number"));
            if (attribute != null) {
                customer.setNumber(attribute.getValue());
            }
        }
        // read a nested element for example:
        // <customer>
        //   <name>John Doe</name>
        if (customer != null && startElement.getName().getLocalPart().equals("name")) {
            xmlEvent = xmlEventReader.nextEvent();
            // an empty <name/> is followed by an end element, not by characters
            if (xmlEvent.isCharacters()) {
                customer.setName(xmlEvent.asCharacters().getData());
            }
        }
    }
    if (xmlEvent.isEndElement()) {
        EndElement endElement = xmlEvent.asEndElement();
        if (endElement.getName().getLocalPart().equalsIgnoreCase("customer")) {
            // all data for the current Customer has been read
            // do something with the customer, like logging it or storing it in a database
            // after this the customer variable will be re-assigned to the next customer
        }
    }
}
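If the further processing needs each record as an XML string, which is what the asker's attempt with an `XMLEventWriter` seems to aim for, one possible approach is to serialize one customer at a time and hand each chunk to a callback. The sketch below is an illustration only: the class `CustomerChunker`, the method `forEachCustomerChunk`, and the lowercase element name `customer` are assumptions, not part of the original posts. A depth counter replaces the two boolean flags, so a `<customer>` nested inside another one stays within the outer chunk.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import java.util.function.Consumer;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: emits each top-level <customer> element as its own XML string.
class CustomerChunker {

    /** Calls sink once per customer element; returns the number of chunks emitted. */
    static int forEachCustomerChunk(InputStream in, Consumer<String> sink) {
        int count = 0;
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
            XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
            StringWriter buffer = null;   // holds only the current customer
            XMLEventWriter writer = null; // non-null while inside a customer
            int depth = 0;                // element nesting depth inside that customer

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (writer == null && event.isStartElement()
                        && "customer".equals(event.asStartElement().getName().getLocalPart())) {
                    // A fresh buffer per record keeps memory bounded by one record.
                    buffer = new StringWriter();
                    writer = outputFactory.createXMLEventWriter(buffer);
                }
                if (writer != null) {
                    writer.add(event);
                    if (event.isStartElement()) {
                        depth++;
                    } else if (event.isEndElement() && --depth == 0) {
                        writer.flush();
                        writer.close();
                        sink.accept(buffer.toString()); // process, then let it be collected
                        writer = null;
                        buffer = null;
                        count++;
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML processing failed", e);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream(args[0])) {
            int n = forEachCustomerChunk(in, chunk -> System.out.println(chunk.length()));
            System.out.println(n + " customers processed");
        }
    }
}
```

The exact text of each chunk (for example, how empty elements are written) depends on the StAX implementation in use, so downstream code should parse the chunk rather than compare it as a string.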