Slicing using list comprehension on a subset of a dataframe
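I want to run a list comprehension that slices the names in one column on "-", but only within a subset of rows defined by the values in other columns.
So in this case: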
  category product_type             name
0       pc         unit   hero-dominator
1    print         unit        md-ffx605
2       pc       option  keyboard1.x-963
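I am interested in the "pc" category and the "unit" product type, so I want the list comprehension to change only the first row of the "name" column, giving this: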
  category product_type             name
0       pc         unit        dominator
1    print         unit        md-ffx605
2       pc       option  keyboard1.x-963
I tried this:
df['name'].loc[df['product_type']=='unit'] = [x.split('-')[1] for x in df['name'].loc[df['product_type']=='unit']]
But I get an IndexError: "list index out of range".
Any help is much appreciated.
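You can solve the problem as follows; please read the comments and feel free to ask questions.
Edit: the code now also handles the case where the "name" column contains non-string elements: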
import pandas as pd
import numpy as np


def change(row):
    # Only rows in the "pc" category with product type "unit" are candidates.
    if row["category"] == "pc" and row["product_type"] == "unit":
        if isinstance(row["name"], str):  # check that the element is a string before split()
            name_split = row["name"].split("-")  # split the element
            if len(name_split) == 2:  # the name may not contain "-", so check here
                return name_split[1]  # "-" was in the name: return the second part
            return row["name"]  # otherwise return the name unchanged
    return row["name"]  # other rows and non-string names are returned unchanged


# create the data frame:
df = pd.DataFrame(
    {
        "category": ["pc", "print", "pc", "pc", "pc", "pc"],
        "product_type": ["unit", "unit", "option", "unit", "unit", "unit"],
        "name": ["hero-dominator", "md-ffx605", "keyboard1.x-963", np.nan, 10.24, None],
    }
)

df["name"] = df.apply(change, axis=1)  # change the data frame here
print(df)
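For comparison, here is a minimal sketch of a variant that keeps the list comprehension from the question but restricts it with a boolean mask. It is not part of the answer above. It assumes the same sample data and the same rule as `change` (only string names that contain exactly one "-" are modified), and the names `subset`, `has_one_dash` and `mask` are made up for illustration. The original attempt most likely fails because `x.split('-')[1]` raises IndexError whenever a selected name contains no "-". Assigning through `df.loc[mask, "name"]` also avoids the chained `df['name'].loc[...] = ...` assignment.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "category": ["pc", "print", "pc", "pc", "pc", "pc"],
        "product_type": ["unit", "unit", "option", "unit", "unit", "unit"],
        "name": ["hero-dominator", "md-ffx605", "keyboard1.x-963", np.nan, 10.24, None],
    }
)

# Rows in the subset of interest (hypothetical variable names).
subset = (df["category"] == "pc") & (df["product_type"] == "unit")

# True only for string names that contain exactly one "-";
# NaN, None and numbers are excluded so split() is never called on them.
has_one_dash = df["name"].map(
    lambda value: isinstance(value, str) and value.count("-") == 1
).astype(bool)

mask = subset & has_one_dash

# The list comprehension now only sees names that are safe to split.
df.loc[mask, "name"] = [name.split("-")[1] for name in df.loc[mask, "name"]]

print(df)
```

With this data only row 0 satisfies both conditions, so the other rows, including the non-string names, are left as they were.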