How can I solve this particular “TypeError: Type 'NoneType' cannot be serialized.” error?
First, a brief description of the problem: within an unordered list, we have many list items, each of which corresponds to a "flashcard":
<ul>
<li>
<p><span>can you slice columns in a 2d list? </span></p>
<pre><code class='language-python' lang='python'>queryMatrixTranspose[a-1:b][i] = queryMatrix[i][a-1:b] </code></pre>
<ul>
<li>
<span>No: can't do this because python doesn't support multi-axis slicing, only multi-list slicing; see the article </span><a href='http://ilan.schnell-web.net/prog/slicing/' target='_blank' class='url'>http://ilan.schnell-web.net/prog/slicing/</a><span> for more info.</span>
</li>
</ul>
</li>
</ul>
The answer on the flashcard will always be a list item located under the XPath /html/body/ul/li/ul. I'd like to retrieve the answer in the format shown here:
<li>
<span>No: can't do this because python doesn't support multi-axis slicing, only multi-list slicing; see the article </span><a href='http://ilan.schnell-web.net/prog/slicing/' target='_blank' class='url'>http://ilan.schnell-web.net/prog/slicing/</a><span> for more info.</span>
</li>
The flashcard's question is everything that remains under the XPath /html/body/ul/li after the answer has been extracted:
<li>
<p><span>can you slice columns in a 2d list? </span></p>
<pre><code class='language-python' lang='python'>queryMatrixTranspose[a-1:b][i] = queryMatrix[i][a-1:b] </code></pre>
</li>
For each flashcard in an unordered list of flashcards, I'd like to extract the UTF-8 encoded HTML content of the question and answer list items. That is, I'd like to have both the text and the HTML tags.
I tried to solve this problem by iterating through each flashcard and its corresponding answer and removing the child-node answer from the parent-node flashcard:
from lxml import html

flashcard_list = []
htmlTree = html.fromstring(htmlString)
for flashcardTree, answerTree in zip(htmlTree.xpath("/html/body/ul/li"),
                                     htmlTree.xpath('/html/body/ul/li/ul')):
    flashcard = html.tostring(flashcardTree,
                              pretty_print=True).decode("utf-8")
    answer = html.tostring(answerTree,
                           pretty_print=True).decode("utf-8")
    # This is the line that raises the TypeError described below.
    question = html.tostring(flashcardTree.remove(answerTree),
                             pretty_print=True).decode("utf-8")
    flashcard_list.append((question, answer))
However, when I try to remove the answer child node with flashcardTree.remove(answerTree), I encounter the error TypeError: Type 'NoneType' cannot be serialized.
I don't understand why this function would return None; I'm trying to remove the node at /html/body/ul/li/ul, which is a valid child node of /html/body/ul/li.
Whatever suggestions you have would be greatly appreciated. I'm not in any way attached to the code I wrote in my first attempt; I'll accept any answer where the output is a list of (question, answer) tuples, one for each flashcard.
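On the error itself: flashcardTree.remove(answerTree) modifies the parent element in place and returns None, so the serializer receives None instead of an element. The behaviour can be reproduced in isolation; the sketch below uses the standard library's xml.etree.ElementTree rather than lxml (an assumption made only to keep the example dependency-free; lxml follows the same ElementTree-style API here), and the markup is a shortened stand-in for one flashcard.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for one flashcard: a question <li> with a nested answer <ul>.
card = ET.fromstring("<li><p>question</p><ul><li>answer</li></ul></li>")
answer = card.find("ul")

# remove() mutates `card` in place and returns None, so passing its
# return value straight to a serializer hands the serializer None.
result = card.remove(answer)
print(result)

# Serialize the parent itself, after the removal has happened.
question_markup = ET.tostring(card, encoding="unicode")
print(question_markup)
```

In other words, the removal and the serialization need to be two separate statements.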
If I understand correctly what you are looking for, this should work:
for flashcardTree, answerTree in zip(htmlTree.xpath("/html/body/ul/li/p/span"),
                                     htmlTree.xpath('/html/body/ul/li/ul/li/descendant-or-self::*')):
    question = flashcardTree.text
    answer = answerTree.text_content().strip()
    flashcard_list.append((question, answer))
for i in flashcard_list:
    print(i[0], '\n', i[1])
Output:
can you slice columns in a 2d list?
No: can't do this because python doesn't support multi-axis slicing, only multi-list slicing; see the article http://ilan.schnell-web.net/prog/slicing/ for more info.
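Note that the code above returns plain text only. If the HTML tags are needed as well, as the question asks, one possible variant is sketched below. It is not the answerer's method: the function name extract_flashcards and the shortened sample markup are hypothetical, and the sketch assumes each flashcard <li> holds its answer in a direct child <ul>. The idea is to serialize the answer items first, remove the answer list in a separate statement, and only then serialize what is left of the flashcard.

```python
from lxml import html

# Shortened, hypothetical sample with the same structure as the flashcards above.
html_string = """<html><body><ul>
<li>
<p><span>can you slice columns in a 2d list? </span></p>
<ul>
<li><span>No: python does not support multi-axis slicing.</span></li>
</ul>
</li>
</ul></body></html>"""


def extract_flashcards(html_string):
    """Return a list of (question_html, answer_html) tuples, one per flashcard."""
    tree = html.fromstring(html_string)
    flashcards = []
    for card in tree.xpath("/html/body/ul/li"):
        answer_list = card.find("ul")  # direct child <ul> holding the answer
        if answer_list is None:
            continue  # no answer list, so this <li> is not treated as a flashcard
        # Serialize the answer <li> items before detaching them from the card.
        answer = "".join(
            html.tostring(item, encoding="unicode", with_tail=False)
            for item in answer_list.findall("li")
        )
        # remove() works in place and returns None, so it gets its own statement.
        card.remove(answer_list)
        question = html.tostring(card, encoding="unicode", with_tail=False)
        flashcards.append((question, answer))
    return flashcards


flashcards = extract_flashcards(html_string)
for question, answer in flashcards:
    print(question)
    print(answer)
```

Iterating over each flashcard and looking up its own child <ul> also avoids pairing questions and answers by position with zip, which can drift out of step when the two XPath queries return different numbers of nodes.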