
Passing conditions through a dictionary: lowercasing not recognized

In this sample data:

import numpy as np
import pandas as pd

data = [{'source': ' Off-grid energy'},
 {'source': 'off-grid generation'},
 {'source': 'Off grid energy '},
 {'source': 'OFFGRID energy'},
 {'source': 'apple sauce'},
 {'source': 'green energy'},
 {'source': 'Green electricity '},
 {'source': 'tomato  sauce'},
 {'source': 'BIOMASS as an energy source'},
 {'source': 'produced heat (biogas).'}]

df = pd.DataFrame(data)

I want to create a new column based on conditions:

my_conditions = {
    "green": df["source"].str.contains("green"),
    "bio-gen": df["source"].str.contains("bio"),
    "off-grid": df["source"].str.contains("off-grid")
}

I preprocess by lowercasing df["source"]:

df['source'] = df["source"].str.lower()

Then I use NumPy's select:

df['category-lower'] = np.select(my_conditions.values(),
                           my_conditions.keys(),
                           default="other")

I can't figure out why the lowercasing is not recognized (see rows 0, 6, and 8).


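You've probably applied .str.lower() after my_conditions was constructed. Each .str.contains(...) call returns a boolean Series computed immediately from the column as it was at that moment, so lowercasing the column afterwards does not change the masks already stored in the dictionary. Try this instead: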

import re

# Either lowercase the column here, before the conditions are built,
# or pass flags=re.I (ignore case) to .str.contains, as done below.
# df['source'] = df["source"].str.lower()

my_conditions = {
    "green": df["source"].str.contains("green", flags=re.I),
    "bio-gen": df["source"].str.contains("bio", flags=re.I),
    "off-grid": df["source"].str.contains("off-grid", flags=re.I),
}

df["category-lower"] = np.select(
    my_conditions.values(), my_conditions.keys(), default="other"
)

print(df)

Prints:

                        source category-lower
0              Off-grid energy       off-grid
1          off-grid generation       off-grid
2             Off grid energy           other
3               OFFGRID energy          other
4                  apple sauce          other
5                 green energy          green
6           Green electricity           green
7                tomato  sauce          other
8  BIOMASS as an energy source        bio-gen
9      produced heat (biogas).        bio-gen
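
To make the ordering issue concrete, here is a minimal sketch on a four-row subset of the sample data. The names stale_mask, fresh_mask, original, and conditions, and the column name "category", are illustrative and not from the question. It also uses case=False, the keyword form of case-insensitive matching in .str.contains, as an alternative to flags=re.I, and wraps the dictionary views in list() before passing them to np.select, which is a defensive choice rather than something the original code required.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"source": [" Off-grid energy",
                              "Green electricity ",
                              "BIOMASS as an energy source",
                              "apple sauce"]})
original = df["source"].copy()  # keep the mixed-case values for later

# Built BEFORE lowercasing: a snapshot of the column as it is right now.
stale_mask = df["source"].str.contains("green")

df["source"] = df["source"].str.lower()

# Built AFTER lowercasing: sees the lowercased values.
fresh_mask = df["source"].str.contains("green")

print(stale_mask.tolist())
print(fresh_mask.tolist())

# Alternative: leave the column untouched and match case-insensitively.
df["source"] = original
conditions = {
    "green": df["source"].str.contains("green", case=False),
    "bio-gen": df["source"].str.contains("bio", case=False),
    "off-grid": df["source"].str.contains("off-grid", case=False),
}
df["category"] = np.select(list(conditions.values()),
                           list(conditions.keys()),
                           default="other")
print(df)
```

The stale mask stays all False even after the column is lowercased, which is the behaviour seen in the question. Rows such as 'Off grid energy ' and 'OFFGRID energy' still fall through to "other" in the output above because the literal pattern "off-grid" requires the hyphen.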
