
How to replace a column value of None with a value from another column with Python (pandas)?

I have a DataFrame where the first column represents the Color of an item and the second its Description. Unfortunately, some of the Color information was merged into the Description column, as you can see below:

import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}

df = pd.DataFrame(data)


| Color | Description |
|-------|-------------|
|None   |Red T-Shirt  |
|Red    |Skirt        |
|Blue   |Pants        |
|Green  |Underwear    |
|None   |Blue Cap     |

First, I split the Description column on spaces with:

df["Description"] = df["Description"].apply(lambda x: x.split(" "))

What I want to do is replace the None values in Color with the first element of Description for the rows where Color is None. The code I used was:

colors = ["Red", "Blue", "Green"]
df["Color"] = df["Color"].where(df["Color"] != None, df["Description"][0])
df["Color"] = df["Color"].apply(lambda x: x if x in colors else "Color N/A")

My code returns the following:

| Color | Description      |
|-------|------------------|
|None   |["Red", "T-Shirt"]|
|Red    |["Skirt"]         |
|Blue   |["Pants"]         |
|Green  |["Underwear"]     |
|None   |["Blue", "Cap"]   |

But it should return:

| Color | Description      |
|-------|------------------|
|Red    |["Red", "T-Shirt"]|
|Red    |["Skirt"]         |
|Blue   |["Pants"]         |
|Green  |["Underwear"]     |
|Blue   |["Blue", "Cap"]   |

Any idea what mistake I made?

Try this:

Split the second column on the space character, then use np.where to fill the null values in the 'Color' column.

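import numpy as np

df['Description'] = df['Description'].str.split(' ')
# Where Color is missing, take the first token of Description; otherwise keep Color
df['Color'] = np.where(df['Color'].isna(), df['Description'].str[0], df['Color'])
print(df)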
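
As for why the original attempt left Color unchanged, there seem to be two reasons. First, `df["Color"] != None` is evaluated element-wise, and in the pandas versions I have used a comparison against None comes out as "not equal" for every row, so `where` never falls back to its second argument. Second, `df["Description"][0]` selects row 0 of the column, not the first element of each row. A minimal sketch of both points with the question's data (Description is left unsplit here, and the names `cond`, `first_word` and `fixed` are mine):

```python
import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}
df = pd.DataFrame(data)

# Element-wise comparison against None does not detect missing values:
# every row compares as "not equal", so the condition is all True.
cond = df["Color"] != None  # noqa: E711 (kept to mirror the question)
print(cond.tolist())

# notna() (or isna()) is the reliable way to test for missing values.
print(df["Color"].notna().tolist())

# First space-separated token of each row, computed per row
# instead of taking row 0 of the whole column.
first_word = df["Description"].str.split(" ").str[0]

# Keep Color where it is present, otherwise use that row's first token.
fixed = df["Color"].where(df["Color"].notna(), first_word)
print(fixed.tolist())
```

Printing `cond` next to `notna()` makes the difference visible: only the second mask is False on the rows that actually need filling.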

You can do this with apply() on each row by setting axis=1. The lambda returns the first element of the Description column if the Color value is None, and the existing Color otherwise.

df["Description"] = df["Description"].apply(lambda x: x.split(" "))

df['Color'] = df.apply(lambda row: row['Description'][0] if row['Color'] is None else row['Color'], axis=1)
print(df)

   Color     Description
0    Red  [Red, T-Shirt]
1    Red         [Skirt]
2   Blue         [Pants]
3  Green     [Underwear]
4   Blue     [Blue, Cap]
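
If the row-wise apply turns out to be slow on a larger frame, a vectorized alternative is `Series.fillna`, which accepts another Series and aligns it on the index. This is only a sketch, assuming Description has not been split yet; `first_word` is an illustrative name, not something from the question:

```python
import pandas as pd

df = pd.DataFrame({
    "Color": [None, "Red", "Blue", "Green", None],
    "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"],
})

# First space-separated token of each description.
first_word = df["Description"].str.split(" ").str[0]

# Fill only the missing colors; rows that already have a color are untouched.
df["Color"] = df["Color"].fillna(first_word)
print(df)
```

This version leaves the Description column as plain strings, which may or may not be what you want downstream.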

You can use the loc indexer, which lets you update a column only on the rows that match a given condition:

# Assumes Description was already split into lists, as in the question
df.loc[df['Color'].isnull(), 'Color'] = df['Description'].str[0]
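
Combining the loc approach with the question's check against a list of known colors might look like the sketch below. It assumes Description is still a plain string, the names `missing` and `first_word` are mine, and the last row ("Striped Socks") is a made-up addition so that the "Color N/A" fallback has something to act on:

```python
import pandas as pd

colors = ["Red", "Blue", "Green"]
df = pd.DataFrame({
    "Color": [None, "Red", "Blue", "Green", None, None],
    # The final entry is a hypothetical row, not part of the question's data.
    "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear",
                    "Blue Cap", "Striped Socks"],
})

# Boolean mask of the rows whose Color is missing.
missing = df["Color"].isna()

# First space-separated token of each description.
first_word = df["Description"].str.split(" ").str[0]

# Assign only on the masked rows; pandas aligns the values on the index.
df.loc[missing, "Color"] = first_word[missing]

# Flag anything that is not a known color, as the question's last line does.
df.loc[~df["Color"].isin(colors), "Color"] = "Color N/A"
print(df)
```

Using `isin` for the final check keeps everything vectorized instead of applying a lambda to each value.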
