Pythonic way to conditionally iterate over items in a list
New to programming in general, so I'm probably going about this the wrong way. I'm writing an lxml parser where I want to omit HTML table rows that have no content from the parser output. This is what I've got:
for row in doc.cssselect('tr'):
    for cell in row.cssselect('td'):
        sys.stdout.write(cell.text_content() + '\t')
    sys.stdout.write('\n')
The write() stuff is temporary. What I want is for the loop to only return rows where tr.text_content() != ''. So I guess I'm asking how to write what my brain thinks should be 'for a in b if a != x', but that doesn't work.
Thanks!
for row in doc.cssselect('tr'):
    cells = [cell.text_content() for cell in row.cssselect('td')]
    if any(cells):
        sys.stdout.write('\t'.join(cells) + '\n')
This prints the line only if there is at least one cell with text content.
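To try the idea without lxml, here is a minimal sketch that uses plain lists of strings in place of the parsed rows. The `rows` data and the `non_empty_rows` helper are made up for illustration, and the `strip()` call is an added assumption so that whitespace-only cells count as empty (plain `any(cells)` would treat `' '` as content).

```python
# Hypothetical stand-in for the text of each <td>, grouped by <tr>.
rows = [
    ['a', 'b'],
    ['', ''],
    ['', 'c'],
    [' ', ''],
]

def non_empty_rows(rows):
    """Yield only rows with at least one cell containing non-whitespace text."""
    for cells in rows:
        if any(cell.strip() for cell in cells):
            yield cells

# Build the tab-separated output lines for the rows that survive the filter.
lines = ['\t'.join(cells) for cells in non_empty_rows(rows)]
print('\n'.join(lines))
```

Writing the filter as a generator keeps the loop body free of the skip logic, which is close to the 'for a in b if a != x' shape the question asks for.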
ReEdit:
You know, I really don't like my answer at all. I voted up the other answer, but I liked his original answer because it was not only clean but also self-explanatory, without getting "fancy", which is what I fell victim to:
for row in doc.cssselect('tr'):
    for cell in row.cssselect('td'):
        if cell.text_content() != '':
            pass  # do stuff here
There isn't a much more elegant solution than that.
Original-ish:
You can transform the second for loop as follows:
[cell for cell in row.cssselect('td') if cell.text_content() != '']
and turn it into a list comprehension. That way you've got a prescreened list. You can take that even further by looking at the following example:
a = [[1, 2], [2, 3], [3, 4]]
newList = [y for x in a for y in x]
which transforms it into [1, 2, 2, 3, 3, 4]. Then you can add an if clause at the end to screen out values. Hence, you'd reduce that into a single line.
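As a rough sketch of that last step, the flattening comprehension with a trailing if clause could look like this. The condition `y != 2` is an arbitrary choice, just to show where the filter goes.

```python
a = [[1, 2], [2, 3], [3, 4]]

# Flatten the nested list: the outer loop comes first, the inner loop second.
flattened = [y for x in a for y in x]

# Same comprehension with an if clause at the end to screen out values.
filtered = [y for x in a for y in x if y != 2]

print(flattened)
print(filtered)
```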
Then again, if you were to look at itertools:
from itertools import ifilter  # Python 2 only
ifilter(lambda x: x.text_content() != '', row.cssselect('td'))
This produces an iterator you can loop over, skipping all the items you don't want.
Edit:
And before I get more downvotes: if you're using Python 3, the built-in filter works the same way. No need to import ifilter.
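A small sketch of the Python 3 version, using plain strings in place of the lxml cells (the `cells` list is made up for illustration). There, filter returns a lazy iterator, so wrap it in list() if you actually need a list.

```python
# Hypothetical cell texts standing in for cell.text_content() values.
cells = ['a', '', 'b', '']

# In Python 3 the built-in filter is lazy, much like itertools.ifilter in Python 2.
non_empty = filter(lambda text: text != '', cells)

# Consume the iterator to get a concrete list.
result = list(non_empty)
print(result)
```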