Iterate through multiple Pandas list-type series and find matches
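I have a Pandas DataFrame containing three list-like Series. I need to iterate over them, compare each against an external list, and then create a True/NaN Series marking the rows where all of those external lists are matched exactly.
Toy code: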
import pandas as pd

data = {
    "num_elements": [1, 3, 3, 4],
    "elements_bool_identifiers": [["Y"], ["N", "Y"], ["N"], ["N"]],
    "elements_identifiers": [["FOO"], ["FOO", "BAR"], ["FOOBAR"], ["FOO", "BAZ"]],
    "identifiers_selections": [["A"], ["A", "B", "B"], ["A", "B", "B"], ["A", "B", "A"]],
}
df = pd.DataFrame(data)
valid_elements_bool_identifiers = "N"
valid_elements_identifiers = ["FOOBAR"] # (might be expanded in the future)
valid_identifiers_selections = ["A", "B", "B"]
The list Series have also been passed through a set conversion (.apply(set).apply(list)).
import pandas as pd
data = {
    "num_elements": [1, 3, 3, 4],
    "elements_bool_identifiers": [["Y"], ["N", "Y"], ["N"], ["N"]],
    "elements_identifiers": [["FOO"], ["FOO", "BAR"], ["FOOBAR"], ["FOO", "BAZ"]],
    "identifiers_selections": [["A"], ["A", "B", "B"], ["A", "B", "B"], ["A", "B", "A"]],
}
df = pd.DataFrame(data)
valid_elements_bool_identifiers = ("N",) # use tuple
valid_elements_identifiers = ("FOOBAR",) # use tuple
valid_identifiers_selections = ("A", "B", "B") # use tuple
df["match"] = (
    (df["elements_bool_identifiers"].map(tuple) == valid_elements_bool_identifiers)
    & (df["elements_identifiers"].map(tuple) == valid_elements_identifiers)
    & (df["identifiers_selections"].map(tuple) == valid_identifiers_selections)
)
df
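The question asks for a True/NaN Series, while a boolean comparison yields True/False. A minimal sketch of one way to get there, assuming the same sample data: build the mask with an explicit per-cell tuple comparison, then blank out the False entries with Series.where. The names expected, mask and match_or_nan are illustrative and not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({
    "elements_bool_identifiers": [["Y"], ["N", "Y"], ["N"], ["N"]],
    "elements_identifiers": [["FOO"], ["FOO", "BAR"], ["FOOBAR"], ["FOO", "BAZ"]],
    "identifiers_selections": [["A"], ["A", "B", "B"], ["A", "B", "B"], ["A", "B", "A"]],
})

# Column name -> the tuple each cell must equal (values taken from the post).
expected = {
    "elements_bool_identifiers": ("N",),
    "elements_identifiers": ("FOOBAR",),
    "identifiers_selections": ("A", "B", "B"),
}

# Start with all True and narrow the mask down one column at a time.
mask = pd.Series(True, index=df.index)
for column, valid in expected.items():
    mask &= df[column].map(lambda cell, valid=valid: tuple(cell) == valid)

# Keep True where the mask holds; every other row becomes NaN.
match_or_nan = mask.where(mask)
```

Only the third row satisfies all three conditions here, so it is the only one that stays True.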
Comparing against lists instead of tuples raises an error:
('Lengths must match to compare', (4,), (3,))
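The question notes that valid_elements_identifiers might be expanded in the future. A hedged sketch of how that case could be handled: store the accepted values as a set of tuples and test each cell for membership. The second entry in the set, ("FOO", "BAZ"), is a hypothetical addition for illustration only, and is_valid is an illustrative name.

```python
import pandas as pd

elements_identifiers = pd.Series(
    [["FOO"], ["FOO", "BAR"], ["FOOBAR"], ["FOO", "BAZ"]]
)

# Only ("FOOBAR",) comes from the post; ("FOO", "BAZ") is a made-up extra value.
valid_elements_identifiers = {("FOOBAR",), ("FOO", "BAZ")}

# Tuples are hashable, so each cell can be checked against the set directly.
is_valid = elements_identifiers.map(
    lambda cell: tuple(cell) in valid_elements_identifiers
)
```

The resulting boolean Series can be combined with the other conditions using & in the same way as the equality checks.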