How to replace a column value of None with a value from another column with python (pandas)?

I have a DataFrame where the first column holds the Color and the second holds the Description of an item. Unfortunately, some of the color information ended up in the Description column instead, as you can see below:

import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}

df = pd.DataFrame(data)


| Color | Description |
|-------|-------------|
|None   |Red T-Shirt  |
|Red    |Skirt        |
|Blue   |Pants        |
|Green  |Underwear    |
|None   |Blue Cap     |

First I split the Description column on spaces with:

df["Description"] = df["Description"].apply(lambda x: x.split(" "))

What I want to do is replace each None in Color with the first element of Description from the same row. The code I used was:

colors = ["Red", "Blue", "Green"]
df["Color"] = df["Color"].where(df["Color"] != None, df["Description"][0])
df["Color"] = df["Color"].apply(lambda x: x if x in colors else "Color N/A")

My code returns the following:

| Color | Description      |
|-------|------------------|
|None   |["Red", "T-Shirt"]|
|Blue   |["Pants"]         |
|Green  |["Underwear"]     |
|None   |["Blue", "Cap"]   |

But should return:

| Color | Description      |
|-------|------------------|
|Red    |["Red", "T-Shirt"]|
|Blue   |["Pants"]         |
|Green  |["Underwear"]     |
|Blue   |["Blue", "Cap"]   |

Any idea what mistake I made?

Try this:

Split the Description column on the space character, then use np.where to fill the missing values in the Color column with the first element of each split description.

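import numpy as np

df['Description'] = df['Description'].str.split(' ')
# Where Color is missing, take the first word of Description; otherwise keep Color
df['Color'] = np.where(df['Color'].isna(), df['Description'].str[0], df['Color'])
print(df)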
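
For reference, here is a self-contained version of the same idea run on the sample data from the question. It is a sketch rather than a definitive fix, and the names `naive_mask` and `mask` are mine. It also shows one likely reason the original attempt left the None values in place: as far as I know, comparing a pandas Series to None with `!=` evaluates to True for every row, so `.where(df["Color"] != None, ...)` never replaces anything, whereas `.isna()` flags exactly the missing entries.

```python
import numpy as np
import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}
df = pd.DataFrame(data)

# Elementwise comparison against None is True for every row,
# so it cannot be used to locate the missing colors.
naive_mask = df["Color"] != None  # noqa: E711
print(naive_mask.tolist())

# isna() flags only the rows where Color is missing.
mask = df["Color"].isna()
print(mask.tolist())

# Split each description into words, then take the first word per row.
df["Description"] = df["Description"].str.split(" ")
df["Color"] = np.where(mask, df["Description"].str[0], df["Color"])
print(df)
```

The other problem in the original line is `df["Description"][0]`, which selects the value in row 0 rather than the first element of every row; `.str[0]` is the per-row accessor.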

You can do this with apply() on each row by setting axis=1. The lambda returns the first element of the Description column if the Color value is missing, and the existing Color otherwise.

df["Description"] = df["Description"].apply(lambda x: x.split(" "))

# pd.isna() catches both None and NaN
df['Color'] = df.apply(lambda row: row['Description'][0] if pd.isna(row['Color']) else row['Color'], axis=1)
print(df)

   Color     Description
0    Red  [Red, T-Shirt]
1    Red         [Skirt]
2   Blue         [Pants]
3  Green     [Underwear]
4   Blue     [Blue, Cap]
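
Row-wise apply() is fine at this size but can get slow on large frames. As an alternative sketch (not part of the answer above), `Series.fillna` accepts another Series and fills only the missing rows, matched on the index. The snippet also brings back the validation step from the question, mapping anything outside the known `colors` list to "Color N/A". The name `first_word` is an illustrative choice.

```python
import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}
df = pd.DataFrame(data)
colors = ["Red", "Blue", "Green"]

# First word of each description, computed without modifying the column.
first_word = df["Description"].str.split(" ").str[0]

# Fill only the rows where Color is missing; existing colors are kept.
df["Color"] = df["Color"].fillna(first_word)

# Keep known colors and replace everything else with a placeholder.
df["Color"] = df["Color"].where(df["Color"].isin(colors), "Color N/A")
print(df["Color"].tolist())  # ['Red', 'Red', 'Blue', 'Green', 'Blue']
```

Because the first word is taken from a temporary Series, Description stays a plain string column here, which may or may not be what you want.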

You can use loc, which lets you update a column only on the rows that satisfy a condition. This assumes Description has already been split into lists, as in the question:

df.loc[df['Color'].isnull(), 'Color'] = df['Description'].str[0]
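
A self-contained sketch of the loc approach on the question's sample data follows. The right-hand side is a Series, which pandas aligns on the index during assignment, so each selected row receives the first word of its own description rather than the value from row 0. The `mask` name is illustrative.

```python
import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
        "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}
df = pd.DataFrame(data)

# Split descriptions into lists of words, as in the question.
df["Description"] = df["Description"].str.split(" ")

# Boolean mask of the rows whose Color is missing.
mask = df["Color"].isnull()

# Assign only on the masked rows; the Series on the right is index-aligned.
df.loc[mask, "Color"] = df["Description"].str[0]
print(df)
```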

