Slicing with a list comprehension on a subset of a dataframe

Question

I want to run a list comprehension that splits the names in one column on '-', but only for a subset of rows defined by the values in other columns.

So in this case:

    category   product_type   name 
0   pc         unit           hero-dominator
1   print      unit           md-ffx605
2   pc         option         keyboard1.x-963

I'm interested in the 'pc' category and the 'unit' product type, so I want the list comprehension to change only the first row of the 'name' column, giving this:

    category   product_type   name 
0   pc         unit           dominator
1   print      unit           md-ffx605
2   pc         option         keyboard1.x-963

I tried this:

df['name'].loc[df['product_type']=='unit'] = [x.split('-')[1] for x in df['name'].loc[df['product_type']=='unit']]

But I'm getting an IndexError: 'list index out of range'.

Any help is much appreciated.

Answer 1

You can solve the problem as follows. The comments explain each step; feel free to ask questions:

Edit: this version also handles elements in the "name" column that are not strings:

import pandas as pd
import numpy as np


def change(row):
    name = row["name"]
    if row["category"] == "pc" and row["product_type"] == "unit":
        if isinstance(name, str):  # only strings can be split; skips NaN, None and numbers
            name_split = name.split("-")  # split the name on "-"
            if len(name_split) == 2:  # exactly one "-" in the name
                return name_split[1]  # return the part after the "-"

    return name  # in every other case return the name unchanged


# create data frame:
df = pd.DataFrame(
    {
        "category": ["pc", "print", "pc", "pc", "pc", "pc"],
        "product_type": ["unit", "unit", "option", "unit", "unit", "unit"],
        "name": ["hero-dominator", "md-ffx605", "keyboard1.x-963", np.nan, 10.24, None]
    }
)


df["name"] = df.apply(change, axis=1)  # apply change() to every row of the data frame
print(df)
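
If you would rather keep the list comprehension from the question, one possible sketch is to build a single boolean mask first and assign through df.loc in one step. The mask names (in_subset, splittable, mask) are made up for this illustration, the sample data is copied from the code above, and the "exactly one '-'" rule mirrors the len(name_split) == 2 check in change(). The 'list index out of range' error in the question most likely comes from names that contain no '-', which this mask filters out before splitting.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame(
    {
        "category": ["pc", "print", "pc", "pc", "pc", "pc"],
        "product_type": ["unit", "unit", "option", "unit", "unit", "unit"],
        "name": ["hero-dominator", "md-ffx605", "keyboard1.x-963", np.nan, 10.24, None],
    }
)

# Rows that belong to the subset of interest.
in_subset = (df["category"] == "pc") & (df["product_type"] == "unit")

# Rows whose name is a string with exactly one "-", so split("-")[1] is safe.
splittable = df["name"].map(lambda value: isinstance(value, str) and value.count("-") == 1)

mask = in_subset & splittable

# Assign through df.loc in a single step to avoid chained assignment.
df.loc[mask, "name"] = [name.split("-")[1] for name in df.loc[mask, "name"]]

print(df)
```

Assigning with df.loc[mask, "name"] instead of df['name'].loc[...] also avoids chained indexing, which pandas may apply to a temporary copy rather than to the original data frame.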

Answer 1
1 accepted 2019-11-04 18:37:52
