Read XML line by line without loading whole file to memory
This is the structure of my XML:
<?xml version="1.0" encoding="utf-8"?>
<posts>
<row Id="4" PostTypeId="1" AcceptedAnswerId="7" CreationDate="2008-07-31T21:42:52.667" Score="756" ViewCount="63468" Body="<p>I want to use a <code>Track-Bar</code> to change a <code>Form</code>'s opacity.</p>
<p>This is my code:</p>
<pre class="lang-cs prettyprint-override"><code>decimal trans = trackBar1.Value / 5000;
this.Opacity = trans;
</code></pre>
<p>When I build the application, it gives the following error:</p>
<blockquote>
<pre class="lang-none prettyprint-override"><code>Cannot implicitly convert type decimal to double
</code></pre>
</blockquote>
<p>I have tried using <code>trans</code> and <code>double</code>, but then the <code>Control</code> doesn't work. This code worked fine in a past VB.NET project.</p>
" OwnerUserId="8" LastEditorUserId="3072350" LastEditorDisplayName="Rich B" LastEditDate="2021-02-26T03:31:15.027" LastActivityDate="2021-11-15T21:15:29.713" Title="How to convert a Decimal to a Double in C#?" Tags="<c#><floating-point><type-conversion><double><decimal>" AnswerCount="12" CommentCount="4" FavoriteCount="59" CommunityOwnedDate="2012-10-31T16:42:47.213" ContentLicense="CC BY-SA 4.0" />
<row Id="6" PostTypeId="1" AcceptedAnswerId="31" CreationDate="2008-07-31T22:08:08.620" Score="313" ViewCount="22477" Body="<p>I have an absolutely positioned <code>div</code> containing several children, one of which is a relatively positioned <code>div</code>. When I use a <code>percentage-based width</code> on the child <code>div</code>, it collapses to <code>0 width</code> on IE7, but not on Firefox or Safari.</p>
<p>If I use <code>pixel width</code>, it works. If the parent is relatively positioned, the percentage width on the child works.</p>
<ol>
<li>Is there something I'm missing here?</li>
<li>Is there an easy fix for this besides the <code>pixel-based width</code> on the child?</li>
<li>Is there an area of the CSS specification that covers this?</li>
</ol>
" OwnerUserId="9" LastEditorUserId="9134576" LastEditorDisplayName="user14723686" LastEditDate="2021-01-29T18:46:45.963" LastActivityDate="2021-01-29T18:46:45.963" Title="Why did the width collapse in the percentage width child element in an absolutely positioned parent on Internet Explorer 7?" Tags="<html><css><internet-explorer-7>" AnswerCount="7" CommentCount="0" FavoriteCount="13" ContentLicense="CC BY-SA 4.0" />
</posts>
Can I load the rows one by one without loading the whole XML file into memory? For example, to print all of the titles.
Provided the XML file is structured exactly as shown in the example, with each row on a single line of the file, BeautifulSoup could be used to parse the relevant lines. Something like this:
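    from bs4 import BeautifulSoup as BS

    with open('my.xml') as xml:
        # Read one line at a time; only the current line is held in memory.
        for line in map(str.strip, xml):
            if line.startswith('<row'):
                # 'lxml' here is the HTML parser, which lower-cases
                # attribute names, hence 'title' rather than 'Title'.
                soup = BS(line, 'lxml')
                if row := soup.find('row'):
                    if title := row.get('title'):
                        print(title)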
"Lines" in XML are pretty irrelevant; the relevant units are things like elements, attributes, start tags, and end tags.
A streaming parser (often called a SAX parser, though strictly speaking SAX is a Java API) will deliver the document to the application incrementally: not one line at a time, but one syntactic unit at a time.
See for example Python SAX Parser
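As a minimal sketch of that streaming approach, the standard library's `xml.sax` module can report each `<row>` start tag as it is read. The class name `TitleHandler`, the `on_title` callback, and the shortened `SAMPLE` document below are made up for illustration; only the `row` element and its `Title` attribute come from the question.

```python
import xml.sax


class TitleHandler(xml.sax.ContentHandler):
    """Passes the Title attribute of every <row> element to a callback."""

    def __init__(self, on_title):
        super().__init__()
        self.on_title = on_title  # hypothetical callback, e.g. print

    def startElement(self, name, attrs):
        # Called once per start tag, however many physical lines it spans.
        if name == "row":
            title = attrs.get("Title")
            if title is not None:
                self.on_title(title)


# Shortened stand-in for the real file; the second row is deliberately
# split across two lines to show that line breaks do not matter.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
<row Id="4" Body="&lt;p&gt;first line&#xA;second line&lt;/p&gt;" Title="How to convert a Decimal to a Double in C#?" />
<row Id="6"
     Title="Why did the width collapse?" />
<row Id="7" />
</posts>"""

found = []
xml.sax.parseString(SAMPLE, TitleHandler(found.append))
for title in found:
    print(title)
```

For a file on disk, `xml.sax.parse("my.xml", TitleHandler(print))` should do the same without building a list, so memory use stays roughly constant regardless of file size.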
You can try something like this:
    while line := file.readline():
        ...  # handle one line here; file is an already-open file object
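One way to flesh that loop out, purely as a sketch: it assumes every `<row ... />` sits on a single physical line and that attribute values are always double-quoted, which may not hold for every XML file. The helper name `iter_titles`, the regular expression, and the in-memory sample are illustrative and not from the original answer; `html.unescape` is used as an approximation for decoding XML entities.

```python
import html
import io
import re

# Assumption: the attribute is written exactly as Title="...".
TITLE_RE = re.compile(r'\bTitle="([^"]*)"')


def iter_titles(file):
    """Yield the Title of each <row> line, reading one line at a time."""
    while line := file.readline():
        if line.lstrip().startswith("<row"):
            if match := TITLE_RE.search(line):
                # Decode entities such as &lt; back to plain characters.
                yield html.unescape(match.group(1))


# In-memory stand-in for open('my.xml', encoding='utf-8').
sample = io.StringIO(
    '<posts>\n'
    '<row Id="4" Title="How to convert a Decimal to a Double in C#?" />\n'
    '<row Id="6" Title="a &lt;div&gt; question" />\n'
    '<row Id="7" />\n'
    '</posts>\n'
)
titles = list(iter_titles(sample))
for title in titles:
    print(title)
```

Because this treats the file as text rather than XML, it will miss any row whose tag is split across lines; a streaming XML parser avoids that limitation.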