Using a list of columns with boolean indexing

From my analysis I have discovered that Disloyal customers aged 30-40 are Not Satisfied with Company X. "Not Satisfied" means they rated services and products 0-2 out of a possible 5. I want to know which inputs were rated <= 2.

I stored the columns in a list to use in a for loop, so that I could index the relevant column values, which are ratings from 0 to 5.

What is the syntax for using the column variable in the boolean expression?

Example data:

Customer Type    Age    Satisfaction    Design   Food    Wi-Fi    Service    Distance
     Disloyal     28   Not Satisfied         0      1        2          2        13.5
        Loyal     30       Satisfied         5      3        5          4        34.2
     Disloyal     36   Not Satisfied         2      0        2          4        55.8

Code:

ranked_cols = ['Design', 'Food', 'Wi-Fi', 'Service', 'Distance']

for column in df[ranked_cols]:
    columnSeriesObj = df[column]

sub = df[
(df["Customer Type"] == "Disloyal")
& (df["Satisfaction"] == "Not Satisfied")
& df["Age"].between(30, 40)
]

sub[(sub[ranked_cols] <= 2)].shape[0]

(sub.melt(value_vars=[c for c in sub.columns if c.startswith(column)])
.groupby("variable")
.value_counts()
.to_frame()
.reset_index()
.rename(columns={0: "count"}))

Try this:

# Choose the cols you want to see the ratings for
ranked_cols = [
    "Design",
    "Food",
    "Wi-Fi",
    "Service",
]

# Select the relevant customers
sub = df[
    (df["Customer Type"] == "Disloyal")
    & (df["Satisfaction"] == "Not Satisfied")
    & df["Age"].between(30, 40)
]

(
    sub.melt(value_vars=ranked_cols)
    .groupby("variable")
    .value_counts()
    .to_frame()
    .reset_index()
    .rename(columns={"value": "rating", 0: "count"})
)

This will output a DataFrame containing each of the ranked_cols categories, the rating it received, and how many times that rating was given (count):

    variable  rating  count
0   Design    2       1
1   Food      0       1 
2   Service   4       1
3   Wi-Fi     2       1
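
If the goal is only to see which inputs were rated <= 2, a boolean mask over the list of columns may be enough. The following is a minimal sketch, not part of the original answer: it rebuilds the three example rows from the question as a DataFrame, and the names low_mask, low_counts and low_cols are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question's example data.
df = pd.DataFrame(
    {
        "Customer Type": ["Disloyal", "Loyal", "Disloyal"],
        "Age": [28, 30, 36],
        "Satisfaction": ["Not Satisfied", "Satisfied", "Not Satisfied"],
        "Design": [0, 5, 2],
        "Food": [1, 3, 0],
        "Wi-Fi": [2, 5, 2],
        "Service": [2, 4, 4],
        "Distance": [13.5, 34.2, 55.8],
    }
)

ranked_cols = ["Design", "Food", "Wi-Fi", "Service"]

# Select the relevant customers (same filter as above).
sub = df[
    (df["Customer Type"] == "Disloyal")
    & (df["Satisfaction"] == "Not Satisfied")
    & df["Age"].between(30, 40)
]

# Comparing a list of columns at once gives a boolean DataFrame
# with one True/False per customer and per rated column.
low_mask = sub[ranked_cols] <= 2

# Summing booleans counts the True values in each column.
low_counts = low_mask.sum()

# Keep only the columns that received at least one rating <= 2.
low_cols = low_counts[low_counts > 0].index.tolist()

print(low_counts)
print(low_cols)
```

No for loop over the columns is needed here, because the comparison is applied to every column in ranked_cols at once. Note that the first example row has Age 28, so between(30, 40) excludes it and only one customer remains in sub.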
