How to replace a column value of None with a value from another column with Python (pandas)?
I have a DataFrame where the first column represents a Color and the second represents the Description of an item. Unfortunately, some of the Color information was merged into the Description column, as you can see below:
import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}
df = pd.DataFrame(data)
| Color | Description |
|-------|-------------|
|None |Red T-Shirt |
|Red |Skirt |
|Blue |Pants |
|Green |Underwear |
|None |Blue Cap |
First I split the Description column on spaces with:
df["Description"] = df["Description"].apply(lambda x: x.split(" "))
What I wanted to do is replace the None values in Color with the first element of Description wherever Color is None. The code I used was:
colors = ["Red", "Blue", "Green"]
df["Color"] = df["Color"].where(df["Color"] != None, df["Description"][0])
df["Color"] = df["Color"].apply(lambda x: x if x in colors else "Color N/A")
My code returns the following:
| Color | Description |
|-------|------------------|
|None |["Red", "T-Shirt"]|
|Red |["Skirt"] |
|Blue |["Pants"] |
|Green |["Underwear"] |
|None |["Blue", "Cap"] |
But it should return:
| Color | Description |
|-------|------------------|
|Red |["Red", "T-Shirt"]|
|Red |["Skirt"] |
|Blue |["Pants"] |
|Green |["Underwear"] |
|Blue |["Blue", "Cap"] |
Any idea what mistake I made?
Try this:
Split the second column on the space character, then use np.where to fill the null values in the 'Color' column.
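import numpy as np
df['Description'] = df['Description'].str.split(' ')
# Where Color is missing, take the first word of Description; otherwise keep Color
df['Color'] = np.where(df['Color'].isna(), df['Description'].str[0], df['Color'])
print(df)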
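As a side note on the original attempt, here is a minimal sketch of why the where call may leave the None values in place. It assumes the Color column has object dtype (forced explicitly below). Under that assumption, the element-wise comparison df["Color"] != None evaluates to True on every row, so where never substitutes anything. Separately, df["Description"][0] selects the value stored in row 0 rather than the first element of each row. The names naive_mask, missing, first_word and fixed are illustrative only and do not appear in the question.

```python
import pandas as pd

# Same sample data as in the question; dtype=object is an assumption made
# here so that the None comparison behaves as described above.
df = pd.DataFrame(
    {
        "Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"],
    },
    dtype=object,
)

# Element-wise comparison with None does not detect missing values:
# every row compares as "not equal", including the rows holding None.
naive_mask = df["Color"] != None  # noqa: E711

# isna() is the usual way to flag missing entries.
missing = df["Color"].isna()

# .str[0] after .str.split picks the first word of *each* row,
# whereas df["Description"][0] would pick the whole value of row 0.
first_word = df["Description"].str.split(" ").str[0]

# Keep Color where it is present, otherwise use the first word.
fixed = df["Color"].where(~missing, first_word)
print(fixed.tolist())
```

With isna() as the mask and .str[0] as the per-row replacement, the rows that held None pick up the first word of their own description.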
You can use apply() on each row by setting axis=1. The lambda returns the first element of the Description column if the Color value is None, and the existing Color otherwise.
df["Description"] = df["Description"].apply(lambda x: x.split(" "))
df['Color'] = df.apply(lambda row: row['Description'][0] if row['Color'] == None else row['Color'], axis=1)
print(df)
Color Description
0 Red [Red, T-Shirt]
1 Red [Skirt]
2 Blue [Pants]
3 Green [Underwear]
4 Blue [Blue, Cap]
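A vectorized alternative to the row-wise apply, sketched here under the same sample data, is to pass a Series to fillna so that each missing Color is filled from the matching row. This is not taken from the answers above. The final where/isin step is one possible way to reproduce the "Color N/A" fallback from the question and is an assumption about the intended behavior.

```python
import pandas as pd

# Sample data and color whitelist copied from the question.
df = pd.DataFrame(
    {
        "Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"],
    }
)
colors = ["Red", "Blue", "Green"]

# Split each description into a list of words.
df["Description"] = df["Description"].str.split(" ")

# Fill missing colors with the first word of the same row's description;
# fillna aligns the replacement Series on the index.
df["Color"] = df["Color"].fillna(df["Description"].str[0])

# Keep only known colors; anything else becomes "Color N/A".
df["Color"] = df["Color"].where(df["Color"].isin(colors), "Color N/A")
print(df)
```

Because fillna and where operate on whole columns, this avoids calling a Python lambda once per row, which may matter on larger frames.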
You can use loc, which lets you update a column based on any given condition:
# Assumes Description has already been split into lists of words, as in the question
df.loc[df['Color'].isnull(), 'Color'] = df['Description'].str[0]