Multiple conditions in list comprehension
I have a list of nested dictionaries that looks as follows:
messages_all = [{'type': 'message',
'subtype': 'bot_message',
'text': "This content can't be displayed.",
'ts': '1573358255.000100',
'username': 'Userform',
'icons': {'image_30': 'www.example.com'},
'bot_id': 'JOD4K22SJW',
'blocks': [{'type': 'section',
'block_id': 'yCKUB',
'text': {'type': 'mrkdwn',
'text': 'Your *survey* has a new response.',
'verbatim': False}},
{'type': 'section',
'block_id': '37Mt4',
'text': {'type': 'mrkdwn',
'text': '*Thanks for your response. Where did you first hear about us?*\nFriend',
'verbatim': False}},
{'type': 'section',
'block_id': 'hqps2',
'text': {'type': 'mrkdwn',
'text': '*How would you rate your experience?*\n9',
'verbatim': False}},
{'type': 'section',
'block_id': 'rvi',
'text': {'type': 'mrkdwn', 'text': '*city*\nNew York', 'verbatim': False}},
{'type': 'section',
'block_id': 'q=L+',
'text': {'type': 'mrkdwn',
'text': '*order_id*\n123456',
'verbatim': False}}]},
{'type': 'message',
'subtype': 'channel_join',
'ts': '1650897290.290259',
'user': 'T01CTZE4MB6',
'text': '<@U03CTDZ4MA6> has joined the channel',
'inviter': 'A033AHJCK'},
{'type': 'message',
'subtype': 'channel_leave',
'ts': '1650899175.290259',
'user': 'T01CTZE4MB6',
'text': '<@U03CTDZ4MA6> has left the channel',
'inviter': 'A033AHJCK'},
{'client_msg_id': '123456jk-a19c-97fe-35c9-3c9f643cae19',
'type': 'message',
'text': '<@ABC973RJD>',
'user': 'UM1922AJG',
'ts': '1573323860.000300',
'team': 'B09AJR39A',
'reactions': [{'name': '+1', 'users': ['UM1927AJG'], 'count': 1}]},
{'client_msg_id': '1234CAC1-FEC8-4F25-8CE5-C135B7FJB2E',
'type': 'message',
'text': '<@UM1922AJG> ',
'user': 'UM1922AJG',
'ts': '1573791416.000200',
'team': 'AJCR23H',
'thread_ts': '1573791416.000200',
'reply_count': 3,
'reply_users_count': 2,
'latest_reply': '1573829538.002000',
'reply_users': ['UM3HRC74J', 'UM1922AJG'],
'is_locked': False,
'subscribed': False}
]
I'd like to be able to filter out dictionaries with the following:
client_msg_id
channel_join
channel_leave
reply_users_count
My code to do so is:
filtered_messages = [elem for elem in messages_all if not elem.get('client_msg_id')
or (elem.get('type') == 'message' and elem.get('subtype') == 'channel_join')
or (elem.get('type') == 'message' and elem.get('subtype') == 'channel_leave')
or (elem.get('type') == 'message' and elem.get('reply_users_count') == 2)
]
From testing, it seems as though only the client_msg_id entries are being filtered out. The others are not.
Would someone please assist me with the syntax of this list comprehension?
IIUC, you're simply missing parentheses to negate the union of all the conditions:
filtered_messages = [elem for elem in messages_all if not (elem.get('client_msg_id')
or (elem.get('type') == 'message' and elem.get('subtype') == 'channel_join')
or (elem.get('type') == 'message' and elem.get('subtype') == 'channel_leave')
or (elem.get('type') == 'message' and elem.get('reply_users_count') == 2))
]
This would keep only the first element of your input in the example.
Output:
[{'type': 'message', 'subtype': 'bot_message', 'text': "This content can't be displayed.", 'ts': '1573358255.000100', 'username': 'Userform', 'icons': {'image_30': 'www.example.com'}, 'bot_id': 'JOD4K22SJW', 'blocks': [{'type': 'section', 'block_id': 'yCKUB', 'text': {'type': 'mrkdwn', 'text': 'Your *survey* has a new response.', 'verbatim': False}}, {'type': 'section', 'block_id': '37Mt4', 'text': {'type': 'mrkdwn', 'text': '*Thanks for your response. Where did you first hear about us?*\nFriend', 'verbatim': False}}, {'type': 'section', 'block_id': 'hqps2', 'text': {'type': 'mrkdwn', 'text': '*How would you rate your experience?*\n9', 'verbatim': False}}, {'type': 'section', 'block_id': 'rvi', 'text': {'type': 'mrkdwn', 'text': '*city*\nNew York', 'verbatim': False}}, {'type': 'section', 'block_id': 'q=L+', 'text': {'type': 'mrkdwn', 'text': '*order_id*\n123456', 'verbatim': False}}]}
]
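To see why the parentheses matter: not binds more tightly than or, so without them only the first condition is negated. Here is a minimal sketch on a made-up three-message list (the sample data and variable names are illustrative assumptions, not taken from the question):

```python
# Hypothetical miniature of the question's data.
sample = [
    {"type": "message", "subtype": "bot_message"},
    {"type": "message", "subtype": "channel_join"},
    {"type": "message", "client_msg_id": "abc"},
]

# Without parentheses this parses as (not A) or B, so a channel_join
# message that has no client_msg_id is kept via the (not A) branch.
without_parens = [m for m in sample
                  if not m.get("client_msg_id")
                  or m.get("subtype") == "channel_join"]

# With parentheses this parses as not (A or B), which rejects both kinds.
with_parens = [m for m in sample
               if not (m.get("client_msg_id")
                       or m.get("subtype") == "channel_join")]

print([m.get("subtype") for m in without_parens])
print([m.get("subtype") for m in with_parens])
```

The first list still contains the channel_join message, while the second keeps only the bot_message entry.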
Like @mozway said, it is simply some missing parentheses.
For such a large if condition, I would personally go further and create a function:
def my_filter(elem):
    # Return True for messages to keep, False for messages to drop.
    if not (elem.get('client_msg_id')
            or (elem.get('type') == 'message' and elem.get('subtype') == 'channel_join')
            or (elem.get('type') == 'message' and elem.get('subtype') == 'channel_leave')
            or (elem.get('type') == 'message' and elem.get('reply_users_count') == 2)):
        return True
    return False

filtered_messages = [elem for elem in messages_all if my_filter(elem)]
Edit: delete extra boolean variable
Given the length of the resulting listcomp I would write something like this instead:
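Since the condition is already a boolean expression, a function like this could also return the negated expression directly instead of branching on it. Here is a rough sketch of that variant; the name keep_message, the helper variables, and the sample list are hypothetical and only for illustration:

```python
def keep_message(elem):
    """Return True for messages that should be kept."""
    is_join_or_leave = elem.get("subtype") in ("channel_join", "channel_leave")
    has_two_reply_users = elem.get("reply_users_count") == 2
    unwanted = (elem.get("client_msg_id")
                or (elem.get("type") == "message"
                    and (is_join_or_leave or has_two_reply_users)))
    # `not` turns any truthy/falsy value into a proper bool.
    return not unwanted


# Made-up data covering each rejection rule once.
sample = [
    {"type": "message", "subtype": "bot_message"},
    {"type": "message", "subtype": "channel_leave"},
    {"type": "message", "reply_users_count": 2},
    {"type": "message", "client_msg_id": "x"},
]

kept = [m for m in sample if keep_message(m)]
print(kept)
```

Naming the sub-conditions is a matter of taste, but it keeps each line short and makes the grouping explicit.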
def filterdict(d):
    # Returns True for messages that should be filtered OUT.
    subtypes = {"channel_join", "channel_leave"}
    return any(
        test(d)
        for test in (
            lambda d: d["type"] == "message" and d.get("subtype") in subtypes,
            # The key is "reply_users_count", as in the question's data.
            lambda d: d["type"] == "message" and d.get("reply_users_count") == 2,
            lambda d: d.get("client_msg_id"),
        )
    )

msgs = [x for x in messages_all if not filterdict(x)]
In this form:
We have a filter function that returns False for an interesting message, so we can use it directly with itertools.filterfalse.
The use of lambdas and any encapsulates each test, so a mislaid parenthesis is not going to cause the kind of problem that motivated the question.
Whether one likes this kind of thing is going to be a matter of taste in the end.
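As a sketch of the itertools.filterfalse usage mentioned above: filterfalse yields the items for which the predicate returns a falsy value, which matches a "reject" predicate of this shape. The predicate name is_unwanted and the sample list below are assumptions made for the example:

```python
from itertools import filterfalse


def is_unwanted(d):
    """Return a truthy value for messages that should be dropped."""
    subtypes = {"channel_join", "channel_leave"}
    tests = (
        lambda d: d.get("type") == "message" and d.get("subtype") in subtypes,
        lambda d: d.get("type") == "message" and d.get("reply_users_count") == 2,
        lambda d: d.get("client_msg_id"),
    )
    return any(test(d) for test in tests)


# Made-up data: only the first entry should survive.
sample = [
    {"type": "message", "subtype": "bot_message"},
    {"type": "message", "subtype": "channel_join"},
    {"type": "message", "client_msg_id": "abc"},
    {"type": "message", "reply_users_count": 2},
]

# filterfalse keeps the elements for which is_unwanted() is falsy.
kept = list(filterfalse(is_unwanted, sample))
print(kept)
```

This avoids writing the explicit not in the comprehension, at the cost of an extra import.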
I found that the get method is slower than checking whether a key is in the dictionary, so if you have a lot of data it may be faster to check for the key first:
# Keep a message only if it has no client_msg_id and matches none of the other rules.
filtered_messages = [elem for elem in messages_all
                     if "client_msg_id" not in elem
                     and not ("type" in elem and elem["type"] == "message"
                              and (("subtype" in elem
                                    and elem["subtype"] in ("channel_join", "channel_leave"))
                                   or ("reply_users_count" in elem
                                       and elem["reply_users_count"] == 2)))]
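The speed difference is easy to measure on your own machine with the standard timeit module. The sketch below times both kinds of lookup on a single made-up dictionary; the record contents and the iteration count are arbitrary assumptions, and the numbers will vary by Python version and hardware:

```python
import timeit

# Hypothetical record without the key being looked up.
record = {"type": "message", "subtype": "channel_join"}

# Time a method-call lookup versus a membership test.
get_time = timeit.timeit(lambda: record.get("client_msg_id"), number=200_000)
in_time = timeit.timeit(lambda: "client_msg_id" in record, number=200_000)

print(f"dict.get    : {get_time:.4f}s")
print(f"key in dict : {in_time:.4f}s")
```

Both timings include the overhead of calling the lambda, so treat the result as a rough comparison rather than an exact cost per lookup.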