How to check that list elements exist in the dataframe?

My dataframe has a column containing lists of strings of varying length, as shown below:

           names                                            venue
 
[Instagrammable, Restaurants, Vegan]                          14 Hills
[Date Night, Vibes, Drinks]                                   Upper 14
[Date Night, Drinks, After Work Drinks, Cocktail]             Hills
            .                                                   .                  
            .                                                   .
            .

Now, how can I check whether a given list is present in my dataframe?

Example 1:

Input :
        find_list=[Date Night, Vibes, Drinks]
        venue = 'Upper 14'
Output:
        Record is present in my dataframe

Example 2:

Input :
        find_list=[Date Night, Drinks]
        venue='Hills 123'
Output:
        Record is not present in my dataframe

Example 3:

Input :
        find_list=[   Date Night, Vibes, Drinks]
        venue = 'Upper 14'
Output:
        Record is not present in my dataframe

You can use .apply() followed by .any():

find_list = ["Date Night", "Vibes", "Drinks"]

if df["names"].apply(lambda x: x == find_list).any():
    print("List is present in my dataframe")
else:
    print("List is not present in my dataframe")

Prints:

List is present in my dataframe
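The snippet above assumes a DataFrame named df already exists. A minimal sketch for reproducing it, assuming the three rows shown in the question are the whole frame (the real data presumably has more rows):

```python
import pandas as pd

# Sample rows copied from the question; the real frame is assumed to be larger.
df = pd.DataFrame(
    {
        "names": [
            ["Instagrammable", "Restaurants", "Vegan"],
            ["Date Night", "Vibes", "Drinks"],
            ["Date Night", "Drinks", "After Work Drinks", "Cocktail"],
        ],
        "venue": ["14 Hills", "Upper 14", "Hills"],
    }
)

find_list = ["Date Night", "Vibes", "Drinks"]

# Each cell holds a Python list, so == compares element by element, in order.
is_present = bool(df["names"].apply(lambda x: x == find_list).any())
print(is_present)  # True
```

Because list equality is order-sensitive, the same items in a different order would not count as a match here.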

Edit: To match the whole record:

find_list = ["Date Night", "Vibes", "Drinks"]
venue = "Upper 14"

if df.apply(
    lambda x: x["names"] == find_list and x["venue"] == venue, axis=1
).any():
    print("Record is present in my dataframe")
else:
    print("Record is not present in my dataframe")

Prints:

Record is present in my dataframe
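As a possible variation on the row-wise apply, the two conditions can be built as separate boolean columns and combined with &. This is only a sketch: the helper name record_present is hypothetical, and the sample frame is rebuilt from the rows shown in the question.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame(
    {
        "names": [
            ["Instagrammable", "Restaurants", "Vegan"],
            ["Date Night", "Vibes", "Drinks"],
            ["Date Night", "Drinks", "After Work Drinks", "Cocktail"],
        ],
        "venue": ["14 Hills", "Upper 14", "Hills"],
    }
)


def record_present(frame, find_list, venue):
    """Return True if any row has exactly this list and this venue."""
    # List cells still need apply; the venue column can be compared directly.
    names_match = frame["names"].apply(lambda x: x == find_list)
    venue_match = frame["venue"] == venue
    return bool((names_match & venue_match).any())


print(record_present(df, ["Date Night", "Vibes", "Drinks"], "Upper 14"))  # True
print(record_present(df, ["Date Night", "Drinks"], "Hills 123"))  # False
```

The two calls correspond to Example 1 and Example 2 from the question.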

Edit 2: To strip whitespace from the input list:

find_list = ["      Date Night", "Vibes", "Drinks"]
venue = "Upper 14"

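# zip() stops at the shorter list, so the lengths are compared first.
if df.apply(
    lambda x: len(x["names"]) == len(find_list)
    and all(a.strip() == b.strip() for a, b in zip(x["names"], find_list))
    and x["venue"] == venue,
    axis=1,
).any():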
    print("Record is present in my dataframe")
else:
    print("Record is not present in my dataframe")

Prints:

Record is present in my dataframe

Edit 3: To remove extra spaces between words:

import re

find_list = ["      Date     Night", "Vibes", "Drinks"]
venue = "Upper 14"

r = re.compile(r"\s{2,}")

if df.apply(
    lambda x: len(x["names"]) == len(find_list)
    and all(
        # Pattern.sub takes (replacement, string), in that order.
        r.sub(" ", a.strip()) == r.sub(" ", b.strip())
        for a, b in zip(x["names"], find_list)
    )
    and x["venue"] == venue,
    axis=1,
).any():
    print("Record is present in my dataframe")
else:
    print("Record is not present in my dataframe")
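The whitespace handling can also be written without a regular expression. The following is a rough sketch in plain Python; normalize and same_items are hypothetical helper names, not part of pandas or of the original answer. It also compares lengths explicitly, since zip() on its own stops at the shorter list and would treat a prefix as a match.

```python
def normalize(text):
    # str.split() with no arguments splits on any run of whitespace and
    # ignores leading/trailing whitespace, so joining with a single space
    # both trims the ends and collapses the gaps between words.
    return " ".join(text.split())


def same_items(names, find_list):
    # Compare lengths first: zip() alone would stop at the shorter list.
    return len(names) == len(find_list) and all(
        normalize(a) == normalize(b) for a, b in zip(names, find_list)
    )


stored = ["Date Night", "Vibes", "Drinks"]
print(same_items(stored, ["      Date     Night", "Vibes", "Drinks"]))  # True
print(same_items(stored, ["Date Night", "Vibes"]))  # False
```

With a frame like the one in the question, this could be passed to apply as `lambda x: same_items(x["names"], find_list) and x["venue"] == venue` with `axis=1`. Unlike the `\s{2,}` pattern, this version also turns a single tab or newline into a space, which may or may not be what you want.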
