I need to parse an extremely large XML file (nearly 50 GB). How can I do this with Ruby? Splitting it into chunks is not possible; I've already tried.
I parsed a 40 GB file using Nokogiri::XML::Reader.
Structure of my XML file:
<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="4" />
  <row Id="5" />
</posts>
Code:
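require 'nokogiri'

fname = "Posts.xml"

# Stream the file node by node; the document is never fully loaded into memory.
File.open(fname) do |file|
  reader = Nokogiri::XML::Reader(file)
  reader.each do |node|
    # Skip whitespace, the <posts> wrapper, and anything else that is not a <row> element.
    next unless node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
    next unless node.name == "row"
    # do something with node, e.g. node.attribute("Id")
  end
end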
I think the answer depends on how you plan to use the data. In my case, I simply needed to stream the post nodes.